* add kunlunxin backend
* add kunlunxin device
* update copy_ for kunlunxin
* lcy/clang-tidy (#483)
* fix namespace declaration format
* update diopi_functions.yaml
* update clang-tidy
* update clang-tidy
* change tabs into spaces
* allow const_cast
* fix bug
* fix comment
* fix comments
* fix comments
* [FIX] fix virtual memory error of using SUPA (#468)
* [FIX] fix virtual memory of SUPA
* [FIX] fix incorrect copy
* [FIX] remove useless copy and add missing 'supa' in CMakeLists.txt
* make conv2d out in the right memory format (#502)
* [dicp][ascend] add fusion switch file for ascend (#512)
* [dipu] Speedup profiler ctor when not enabled (#526)
* speedup profiler ctor
* clean & format include
* [DIPU]clang-tidy_shanhang (#516)
* Create main readme
* Update readme.md
* Update readme.md
* Update readme.md
* add clone kineto for dicp (#457)
* [dicp][ascend] infer op result_info (#448)
* finish res_op_infer for softmax+log_softmax+add+amax (keepdim=True); pass static test
* revert modification to diopi
* modify operator logic in /DIPU/dicp/dicp/dynamo_bridge/operator.py to support tests of 'infer_result'
* fix a bug in get_cast_dtype: type(int+bool) should be int (see the sketch after this list)
* clean code format
* fix gettupleelem in topsgraph

---------

Co-authored-by: jinminxi104 <[email protected]>

* Fdy/enhance copy (#430)
* mv copy file path
* add new copy
* fix static param err
* fix copy err
* fix direct copy bug
* rm unused bcast template name
* change clang format
* change name hpp
* rm unused header file
* remove unused header 2
* change override behavior
* change comment
* change cudacopy
* fix d2d copy err
* change register to use autogen
* revert incorrect format
* config fallback
* fix link err
* fix comment from wanglei
* add newline
* fix cpu copy err
* add camb vendor copy
* fix copy err
* fix copy err 2
* fix compile err
* fix lingjie comment 1
* fix caikun comment
* fix camb ci
* fix camb ci
* fix device switch err
* fix lingjie & caikun comment 2
* fix comment: incorrect local ref
* change init copy
* update DIOPI submodule (#458)
* update DIOPI submodule
* diopi update to main
* update mmcv version
* update submodule
* update mmcv commit id
* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)
* [dipu] Fix copy_ fallback of topsrider. (#477)
* [dicp][tops] Add dicp ci of tops. (#469)
* Add dicp ci of tops.
* Fix dicp ci of tops.
* fix recycle dep (#474)
* Fdy/fix copy tidy (#471)
* fix tidy 0
* fix clang tidy copy
* fix lingjie comment
* add tidy msg
* fix lint comment
* fix format
* add copyright
* fuj/add ceil.out (#480)
* add ceil.out
* add floor_ and cases for floor_, ceil and ceil_
* [dipu] tidy some source files and update nv build script (#453)
* fix: tidy some source files, and also update the nv build script
* fix: make clang-format v16 happy
* fix: make clang-format v16 happy
* fix: remove usings and simplify some code
* fix: remove index
* fix: remove initialized_
* fix: add keyword VERSION
* fix: remove VERSION 3.25 as CI is using CMake 3.22
* add 910B CI && remove 910 CI && update DIOPI (#481)
* add 910b
* add 910b
* add 910b
* add 910b
* add resnet50
* fix bugs
* fix bugs
* fix bugs
* fix bugs
* fix bugs
* rm unused code
* update DIOPI submodule (#458)
* update DIOPI submodule
* diopi update to main
* update mmcv version
* update submodule
* update mmcv commit id
* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)
* [dipu] Fix copy_ fallback of topsrider. (#477)
* [dicp][tops] Add dicp ci of tops. (#469)
* Add dicp ci of tops.
* Fix dicp ci of tops.
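The `get_cast_dtype` item above concerns dtype inference for mixed scalar types in the dynamo bridge. A minimal Python sketch of the rule being fixed (the helper below uses invented names and is not DICP's actual code): combining `int` and `bool` must yield `int`, never `bool`.

```python
# Hypothetical helper illustrating the promotion rule, not DICP's real code.
def get_cast_dtype(lhs: type, rhs: type) -> type:
    # bool is a subclass of int in Python, so naive isinstance-based checks
    # can wrongly settle on bool; rank the types explicitly instead.
    rank = {bool: 0, int: 1, float: 2}
    return lhs if rank[lhs] >= rank[rhs] else rhs

assert get_cast_dtype(int, bool) is int    # the fixed case: int + bool -> int
assert get_cast_dtype(bool, bool) is bool  # bool only when both sides are bool
assert get_cast_dtype(int, float) is float
```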
* fix recycle dep (#474)
* rm 910 ci
* update diopi
* rm 910

---------

Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: wugeshui <[email protected]>

* [dipu]add ascend profiler (#476)
* add ascend profiler
* support with_stack
* code format
* fix clang tidy
* optimize naming
* optimize naming
* add dipu ci on dicp (#488)
* [dicp][ascend] fix ascend mm/bmm on 910B (#482)
* mock torch.cuda.XXXTensor (#462)
* mock torch.cuda.XXXTensor
* add newline at end of file
* fix conflict
* fix format
* fix format
* fix comment
* Fix `multiprocessing.Process` tests not collected by coverage and gcov (#486)
* Fix `multiprocessing.Process` tests not collected by coverage and gcov
* fix --concurrency=multiprocessing (illustrated after this list)
* [dipu] update tidy configuration and remove if-constexpr in C++14 (#470)
* fix: update tidy config and remove if-constexpr
* fix: it should be a list instead of a bool value
* feat: update clangd config
* fix: move the comment out of the yaml scalar
* docs: add comments
* fix: add DeviceIndex
* fix: add some checks for headers
* feat: update .clang-tidy
* add profiler readme (#489)
* add profiler readme
* Update readme.md
* update
* Update readme.md
* Update readme.md
* Update readme.md

---------

Co-authored-by: caikun-pjlab <[email protected]>

* [dicp][tops] support outputs with inplace copy (#440)
* add dipu stream synchronize.
* adjust some ops.
* fix some param errors and rename device name.
* unset keep_inference_input_mutations.
* fix param errors in conversion.
* fix param dtype conversion.
* fix empty output and inplace copy of input params in the optimizer case.
* remove inplace output gen_empty_tensor.
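The coverage fix in #486 above targets a well-known pitfall: code executed inside `multiprocessing.Process` children is invisible to a plain `coverage run`. A short illustration of the failure mode (the script below is an example, not the repository's test code):

```python
import multiprocessing

def worker():
    # Under a plain `coverage run`, these lines are reported as never
    # executed, because they run in a child process. Starting coverage with
    # `--concurrency=multiprocessing` (plus `parallel = True` in the config,
    # followed by `coverage combine`) records the children as well.
    print("work done in child process")

if __name__ == "__main__":
    p = multiprocessing.Process(target=worker)
    p.start()
    p.join()
```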
* Ywt/fix autocompare compile error (#492)
* pass string to python
* disable _amp_foreach_non_finite_check_and_unscale_ autocompare
* [dipu] Wx/support the test for llm inference (#454)
* add one iter for llm
* add bert ci using the correct transformers repository
* add test for the inference of llama 7b using the transformers repository
* one iter test for traditional models by default
* fix bug
* add test for the inference of internlm 7b using the transformers repository
* test for torch_dipu
* set device check args other for maximum.out
* fix the partition arg parsing bug on cuda
* test the setting of CUDA_PARTITION
* fix the bug of setting CUDA_PARTATION
* add llm
* add llm
* optimize the selection of the model list
* set pythonpath for torch_dipu
* test
* fix bug in the command setting pythonpath

---------

Co-authored-by: wugeshui <[email protected]>

* [DIPU]Wx/check the status of build dipu (#490)
* check the status of build dipu on camb and nv
* add check for ascend
* fix the pipe bug
* [DIPU] Wx/add schema for logical or and logical not ops (#484)
* add schema for logical or and logical not ops
* fix bug and add test cases for these ops
* add the test case: out is an empty tensor (see the example after this list)
* [dicp][ascend] infer op resinfo (part 2) (#491)
* fix a bug in get_cast_dtype: type(int+bool) should be int
* clean code format
* finish res_op_infer for more simple operators
* Update operator.py: delete some unnecessary print()
* Update operator.py: clean code
* finish operators' info inference, except for those that are hard to test standalone without inference; operators involving Reshape still have problems
* clean code format
* Update warning message output in operator.py
* extract a common function for general binary and unary operators; add op bmm's inference
* Update ascend_op.py: delete unused param
* update DIOPI submodule (#485)
* update DIOPI submodule
* update submodule
* temporarily forbid resnet50
* move the testing code to a dir under torch_dipu (#465)
* move the testing code to a dir under torch_dipu
* fix a little bug
* create two soft links to avoid importing torch_dipu too early.
* add one more soft link file to solve bugs.
* support dev fork ci (#496)
* support dev fork ci
* [dipu] add markdownlint and update most markdown files (#493)
* doc: update docs and add markdownlint
* doc: rename readme.md to README.md
* fix: remove MD013
* doc: format
* [dicp][tops] Support some ops for stable-diffusion. (#467)
* Add sin, cos, erf, split.
  1. Generalize MakeTuple in tops_op.
  2. Generalize make_const in enflame codegen.
  3. Add sin, cos, erf, split for tops.
  4. Format Python code in dicp tops.
* refine code
* fix abs test path
* clean up code of split.
* adjust const op generation.
* fix nullptr case in const generation.
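The logical-ops schema work in #484 above adds a test case where `out` is an empty tensor. A small example of that pattern with stock PyTorch semantics (illustrative, not the repository's test code): when `out` has zero elements, it is resized to the broadcast result shape.

```python
import torch

a = torch.tensor([True, False, True])
b = torch.tensor([False, False, True])

out = torch.empty(0, dtype=torch.bool)  # the "out is an empty tensor" case
torch.logical_or(a, b, out=out)         # out is resized to match the result
print(out)                              # tensor([ True, False,  True])

torch.logical_not(a, out=out)
print(out)                              # tensor([False,  True, False])
```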
---------

Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>

* [DIPU] Wx/modify maximum schema due to the case in the inference of internlm (#494)
* improve maximum schema due to the case in the inference of internlm
* fix bug according to comments
* fix bug
* [both] fix, format and remove spaces in README.md (#497)
* doc(readme): fix, format and remove spaces
* fix: typo and try auto-correct
* feat(ci): add autocorrect into ci
* fix: remove autocorrect from ci as it's not ready
* update env python 3.10 (#503)
* fix clang tidy
* [dicp][ascend] get soc_version from aclrt (#505)
* fix clang tidy
* fix format
* fix format

---------

Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>

* Speedup dumpOnArgLevel by using lazy initialization (#524)
* [dicp][ascend] fuse transpose/mm in ascendgraph (#523)
* [dicp][ascend] remove unnecessary broadcast (#527)
* update kineto (#530)
* [dicp][ascend] opt inplace copy (#533)
* opt copy inplace
* optimize load_and_run
* remove check return value if (#534)
* [dipu] Optimize `getAllocator` by adopting lookup table (#532) (see the sketch after this list)
* [dipu] Optimize `getAllocator` by adopting lookup table
* fix typos & clean includes
* resolve comments
* shrink lookup table & speedup devproxy::getDeviceCount
* Op preference mem format (#525)
* add memory preference in op for camb. This change will add a TAG in diopi_functions.yaml, and the autogen will replace it with the preferred memory format depending on the convert_config.yaml of the device
* fix bug found in ci running
* improve the code according to the comment.
* improve code format.
* improve CMakeLists.txt code.
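For context on the `getAllocator` change in #532 above: the stated optimization is a lookup table. A hedged Python sketch of the idea only (the real implementation is C++ inside DIPU, and every name below is invented): allocators are indexed directly by device kind and device index, so each lookup is a flat array access instead of a repeated map or string lookup.

```python
# Invented names; illustrates the lookup-table pattern, not DIPU's code.
MAX_DEVICE_NUM = 16
HOST, DEVICE = 0, 1

# Flat table giving O(1) indexed access on every get_allocator call.
_allocator_table = [[None] * MAX_DEVICE_NUM for _ in range(2)]

def register_allocator(device_kind: int, index: int, allocator) -> None:
    _allocator_table[device_kind][index] = allocator

def get_allocator(device_kind: int, index: int):
    return _allocator_table[device_kind][index]
```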
* lyp_clang_tidy: warning uint64_t->int (#518)
* clang_tidy: torch_dipu/csrc_dipu/profiler/CorrelationIDManager.cpp CorrelationIDManager.h
* clang_tidy: dipu/torch_dipu/csrc_dipu/profiler/DIPUDeviceActivity.cpp .h
* clang_tidy: torch_dipu/csrc_dipu/profiler/profiler.cpp
* clang_tidy: torch_dipu/csrc_dipu/profiler/patch.cpp
* clang_tidy: torch_dipu/csrc_dipu/profiler/patch.cpp --v2
* clang_tidy: dipu/torch_dipu/csrc_dipu/runtime/core/allocator/DIPUBFCachingAllocator.cpp
* clang_tidy: dipu/torch_dipu/csrc_dipu/runtime/core/allocator/DIPUBFCachingAllocator.cpp -v2
* clang_tidy: dipu/torch_dipu/csrc_dipu/runtime/core/DIPUEvent.h
* clang_tidy: torch_dipu/csrc_dipu/profiler/profiler.h --v2
* clang_tidy: torch_dipu/csrc_dipu/profiler/DIPUDeviceActivity.cpp --v2
* clang_tidy: torch_dipu/csrc_dipu/profiler/CorrelationIDManager.cpp .h --v2
* clang_tidy: magic number; const_cast
* clang_tidy: fix some review issues
* clang_tidy: modify format by using run_format.sh
* [dipu] fix: `torch.prod` int type promotion (#541): `prod` (and other reduction ops) should promote integer types (including `bool`) to `int64` when `dtype` is not explicitly provided. Only `prod` (without `dim`) needs handling here, because the other cases are already handled correctly in PyTorch. (See the example after this list.)
* [dipu] fix typo PREFERED -> PREFERRED (#545)
* [dicp][ascend] add dicp ci for ascend (#540)
* disable autocompare for _amp_foreach_non_finite_check_and_unscale_ (#543)
* Update QuickStart.md
* revert unnecessary changes
* fix linter errors and implement getRuntimeVersion & getDriverVersion for kunlunxin
* change device from XPU to KLX
* fix build
* remove unused code
* use DIPU_LOG instead of printf
* change kunlunxin device key from xpu to klx

---------

Co-authored-by: Chengyuan Li <[email protected]>
Co-authored-by: Aaron <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: ustclight-sls <[email protected]>
Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
Co-authored-by: lyp-liuyipeng <[email protected]>
Co-authored-by: zhaochaoxing <[email protected]>
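To make the #541 promotion rule above concrete, this is the stock PyTorch behavior the fix aligns with (the snippet is illustrative and runs on plain PyTorch):

```python
import torch

x = torch.tensor([2, 3], dtype=torch.int32)
print(x.prod().dtype)                   # torch.int64: ints promote by default
print(x.prod(dtype=torch.int32).dtype)  # torch.int32: explicit dtype wins

b = torch.tensor([True, True, False])
print(b.prod().dtype)                   # torch.int64: bool promotes too
print(b.prod())                         # tensor(0)
```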