[dipu] Add kunlunxin backend (#556)
* add kunlunxin backend

* add kunlunxin device

* update copy_ for kunlunxin

* lcy/clang-tidy (#483)

* fix namespace declaration format

* update diopi_functions.yaml

* update clang-tidy

* update clang-tidy

* change tab into spaces

* allow const_cast

* fix bug

* fix comment

* fix comments

* fix comments

* [FIX] fix virtual memory error of using SUPA (#468)

* [FIX] fix virtual memory of SUPA

* [FIX] fix incorrect copy

* [FIX] remove useless copy and add missing 'supa' in CMakeLists.txt

* make conv2d output use the right memory format (#502)

* [dicp][ascend] add fusion switch file for ascend (#512)

* [dipu] Speedup profiler ctor when not enabled (#526)

* speedup profiler ctor

* clean & format include

* [DIPU]clang-tidy_shanhang (#516)

* Create main readme

* Update readme.md

* Update readme.md

* Update readme.md

* add clone kineto for dicp (#457)

add clone kineto for dicp

* [dicp][ascend] infer op result_info (#448)

* finish res_op_infer for softmax+log_softmax+add+amax(keepdim=True); passes the static test

* revert modification to diopi

* modify operator logic in /DIPU/dicp/dicp/dynamo_bridge/operator.py to support tests of 'infer_result'

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* fix gettupleelem in topsgraph

---------

Co-authored-by: jinminxi104 <[email protected]>

* Fdy/enhance copy (#430)

* mv copy file path

* add new copy

* fix static param err

* fix copy err

* fix direct copy bug

* rm unused bcast template name

* change clang format

* change name hpp

* rm unused header file

* remove unused header 2

* change override behavior

* change comment

* change cudacopy

* fix d2d copy err

* change register to use autogen

* revert incorrect format

* config fallback

* fix link err

* fix comment wanglei

* add newline

* fix cpu copy err

* add camb vendor copy

* fix copy err

* fix copy err 2

* fix compile err

* fix lingjie comment1

* fix caikun comment

* fix camb ci

* fix camb ci

* fix device switch err

* fix ling jie caikun comment 2

* fix comment about incorrect local ref

* change init copy

* update DIOPI submodule (#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)

* [dipu] Fix copy_ fallback of topsrider. (#477)

* [dicp][tops] Add dicp ci of tops. (#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (#474)

* Fdy/fix copy tidy (#471)

* fix tidy 0

* fix clang tidy copy

* fix lingjie comment

* add tidy msg

* fix lint comment

* fix format

* add copy right

* fuj/ add ceil.out (#480)

* add ceil.out

* add floor_ and cases for floor_, ceil and ceil_

* [dipu] tidy some source files and update nv build script (#453)

* fix: tidy some source files
- and also update build nv script

* fix: make clang-format v16 happy

* fix: make clang-format v16 happy

* fix: remove usings and simplify some code

* fix: remove index

* fix: remove initialized_

* fix: add keyword VERSION

* fix: remove VERSION 3.25 as CI is using CMake 3.22

* add 910B CI && remove 910 CI && update DIOPI (#481)

* add 910b

* add 910b

* add 910b

* add 910b

* add resnet50

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* rm unused code

* update DIOPI submodule (#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)

* [dipu] Fix copy_ fallback of topsrider. (#477)

* [dicp][tops] Add dicp ci of tops. (#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (#474)

* rm 910 ci

* update diopi

* rm 910

---------

Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: wugeshui <[email protected]>

* [dipu]add ascend profiler (#476)

* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming
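
A minimal sketch of how the `with_stack` option added above is typically exercised from Python. This uses only the standard `torch.profiler` API and assumes nothing ascend-specific beyond what the entry describes.

```python
import torch
from torch.profiler import ProfilerActivity, profile

# with_stack=True attaches Python stack traces to the recorded events.
with profile(activities=[ProfilerActivity.CPU], with_stack=True) as prof:
    x = torch.randn(64, 64)
    y = (x @ x).relu().sum()

print(prof.key_averages(group_by_stack_n=5)
          .table(sort_by="cpu_time_total", row_limit=10))
```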

* add dipu ci on dicp (#488)

* [dicp][ascend] fix ascend mm/bmm on 910B (#482)

* mock torch.cuda.XXXTensor (#462)

* mock torch.cuda.XXXTensor

* add newline at end of file

* fix conflict

* fix format

* fix format

* fix comment
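
A hypothetical usage sketch for the mocked legacy constructors; it assumes that importing `torch_dipu` installs the `torch.cuda` mocks so that code calling `torch.cuda.FloatTensor` and friends keeps working on the DIPU-backed device.

```python
import torch
import torch_dipu  # assumption: importing this installs the torch.cuda mocks

# Legacy constructors like torch.cuda.FloatTensor are deprecated upstream but
# still appear in older code; the mock is meant to keep such code running.
x = torch.cuda.FloatTensor(2, 3)
print(x.shape, x.dtype, x.device)
```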

* Fix `multiprocessing.Process` tests not collected by coverage and gcov (#486)

* Fix `multiprocessing.Process` tests not collected by coverage and gcov

* fix --concurrency=multiprocessing

* [dipu] update tidy configuration and remove if-constexpr in C++14 (#470)

* fix: update tidy config and remove if-constexpr

* fix: it should be a list instead of bool value

* feat: update clangd config

* fix: move the comment out of yaml scalar

* docs: add comments

* fix: add DeviceIndex

* fix: add some checks for headers

* feat: update .clang-tidy

* add profiler readme (#489)

* add profiler readme

* Update readme.md

* update

* Update readme.md

* Update readme.md

* Update readme.md

---------

Co-authored-by: caikun-pjlab <[email protected]>

* [dicp][tops] support outputs with inplace copy (#440)

* add dipu stream synchronize.

* adjust some ops.

* fix some parameter errors and rename the device name.

* unset keep_inference_input_mutations.

* fix parameter errors in conversion.

* fix parameter dtype conversion.

* fix empty output and inplace copy of input parameters in the optimizer case.

* remove inplace output gen_empty_tensor.

* Ywt/fix autocompare compile error (#492)

* pass string to python

* disable _amp_foreach_non_finite_check_and_unscale_ autocompare

* [dipu] Wx/support the test for llm inference (#454)

* add one iter for llm

* add bert ci using the correct transformers repository

* add test for the inference of llama 7b using the transformers repository

* one iter test for traditional models by default

* fix bug

* add test for the inference of internlm 7b using the transformers repository

* test for torch_dipu

* set device check arg `other` for maximum.out

* fix the partition arg parsing bug on cuda

* test the setting of CUDA_PARTITION

* fix the bug of setting CUDA_PARTITION

* add llm

* add llm

* optimize the selection of model list

* set pythonpath for torch_dipu

* test

* fix bug in the command of setting pythonpath

---------

Co-authored-by: wugeshui <[email protected]>

* [DIPU]Wx/check the status of build dipu (#490)

* check the status of build dipu on camb and nv

* add check for ascend

* fix the bug of pipe

* [DIPU] Wx/add schema for logical or and logical not ops (#484)

* add schema for logical or and logical not ops

* fix bug and add test cases for these ops

* add the test case: out is empty tensor
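
A small, plain-PyTorch sketch of the "out is empty tensor" case mentioned above; `logical_or`/`logical_not` are standard ops and nothing DIPU-specific is assumed.

```python
import torch

a = torch.tensor([True, False, True])
b = torch.tensor([False, False, True])

# out= starts as an empty tensor; the op resizes it to the result shape.
out = torch.empty(0, dtype=torch.bool)
torch.logical_or(a, b, out=out)
print(out)    # tensor([ True, False,  True])

out = torch.empty(0, dtype=torch.bool)
torch.logical_not(a, out=out)
print(out)    # tensor([False,  True, False])
```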

* [dicp][ascend] infer op resinfo (part 2) (#491)

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* finish res_op_infer for more simple operators

* Update operator.py

delete some unnecessary print()

* Update operator.py

clean code

* finish info inference for operators, except those that are hard to test in isolation without inference; operators involving Reshape still have problems

* clean code format

* Update warning message output in operator.py

* extract a common function for general binary and unary operators, and add bmm's inference

* Update ascend_op.py

delete unused param

* update DIOPI submodule (#485)

* update DIOPI submodule

* update submodule

* temporarily forbid resnet50

* move the testing code to dir under torch_dipu (#465)

* move the testing code to dir under torch_dipu

* fix a little bug

* create two soft links to avoid importing torch_dipu too early.

* add one more soft link file to solve bugs.

* support dev fork ci (#496)

* support dev fork ci

* [dipu] add markdownlint and update most markdown files (#493)

* doc: update docs and add markdownlint

* doc: rename readme.md to README.md

* fix: remove MD013

* doc: format

* [dicp][tops] Support some ops for stable-diffusion. (#467)

* Add sin, cos, erf, split.

1. Generalize MakeTuple in tops_op.
2. Generalize make_const in enflame codegen.
3. Add sin, cos, erf, split for tops.
4. Format Python code in dicp tops.

* refine code

* fix abs test path

* clean up code of split.

* adjust const op generation.

* fix nullptr case in const generation.

---------

Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>

* [DIPU] Wx/modify maximum schema due to the case in the inference of internlm (#494)

* improve maximum schema due to the case in the inference of internlm

* fix bug according to comments

* fix bug

* [both] fix, format and remove spaces in README.md (#497)

* doc(readme): fix, format and remove spaces

* fix: typo and try auto-correct

* feat(ci): add autocorrect into ci

* fix: remove autocorrect from ci as it's not ready

* update env python 3.10 (#503)

* fix clang tidy

* [dicp][ascend] get soc_version from aclrt (#505)

* fix clang tidy

* fix format

* fix format

---------

Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>

* Speedup dumpOnArgLevel by using lazy initialization (#524)

* [dicp][ascend] fuse transpose/mm in ascendgraph (#523)

* [dicp][ascend] remove unnecessary broadcast (#527)

* update kineto (#530)

* [dicp][ascend] opt inplace copy (#533)

* opt copy inplace

* optimize load_and_run

* remove check of return value if (#534)

* [dipu] Optimize `getAllocator` by adopting lookup table (#532)

* [dipu] Optimize `getAllocator` by adopting lookup table

* fix typos & clean includes

* resolve comments

* shrink lookup table & speedup devproxy::getDeviceCount
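
A hypothetical illustration (in Python) of the lookup-table idea named in this entry: cache allocators in a flat table indexed by device index instead of re-resolving them on every call. The names below (`create_allocator`, `MAX_DEVICES`) are illustrative, not DIPU APIs.

```python
# Illustrative only; not the DIPU implementation.
MAX_DEVICES = 16

def create_allocator(device_index: int) -> object:
    return object()  # stand-in for the real allocator construction

_allocator_table = [None] * MAX_DEVICES  # one slot per device index

def get_allocator(device_index: int) -> object:
    alloc = _allocator_table[device_index]
    if alloc is None:            # fill the slot lazily, once per device
        alloc = create_allocator(device_index)
        _allocator_table[device_index] = alloc
    return alloc
```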

* Op preference mem format (#525)

* add memory-format preference in ops for camb, as illustrated in the sketch after this entry.
This change adds a TAG in diopi_functions.yaml, and the autogen replaces it with the preferred memory format according to the device's convert_config.yaml

* fix bug found in ci running

* improve the code according to the comment.

* improve code format.

* improve CMakeLists.txt code.
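
A minimal PyTorch-level illustration of what a preferred memory format means for an op such as conv2d (cf. the conv2d memory-format fix above). This shows plain PyTorch behavior only; it does not model the yaml TAG or the autogen mechanism.

```python
import torch
import torch.nn.functional as F

# NCHW input converted to the channels-last (NHWC) memory format.
x = torch.randn(1, 8, 16, 16).to(memory_format=torch.channels_last)
w = torch.randn(4, 8, 3, 3)

y = F.conv2d(x, w, padding=1)

# A backend that prefers channels_last is expected to keep the output in that
# format rather than silently falling back to contiguous NCHW.
print(x.is_contiguous(memory_format=torch.channels_last))  # True
print(y.is_contiguous(memory_format=torch.channels_last))
```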

* lyp_clang_tidy: warning uint64_t->int (#518)

* clang_tidy:torch_dipu/csrc_dipu/profiler/CorrelationIDManager.cpp
                                         CorrelationIDManager.h

* clang_tidy dipu/torch_dipu/csrc_dipu/profiler/DIPUDeviceActivity.cpp .h

* clang_tidy:torch_dipu/csrc_dipu/profiler/profiler.cpp

* clang_tidy:torch_dipu/csrc_dipu/profiler/patch.cpp

* clang_tidy:torch_dipu/csrc_dipu/profiler/patch.cpp --v2

* clang_tidy:dipu/torch_dipu/csrc_dipu/runtime/core/allocator/DIPUBFCachingAllocator.cpp

* clang_tidy:dipu/torch_dipu/csrc_dipu/runtime/core/allocator/DIPUBFCachingAllocator.cpp -v2

* clang_tidy: dipu/torch_dipu/csrc_dipu/runtime/core/DIPUEvent.h

* clang_tidy: torch_dipu/csrc_dipu/profiler/profiler.h --v2

* clang_tidy: torch_dipu/csrc_dipu/profiler/DIPUDeviceActivity.cpp --v2

* clang_tidy: torch_dipu/csrc_dipu/profiler/CorrelationIDManager.cpp .h --v2

* clang_tidy: magic number; const_cast

* clang_tidy: fix some review issues

* clang_tidy: modify format by using run_format.sh

* [dipu] fix: `torch.prod` int type promotion (#541)

`prod` (and other reduction ops) should promote int type (including `bool`) to `int64` when `dtype` is not explicitly provided.

Only `prod` (without `dim`) should be taken care of, because the other cases are already correctly handled in PyTorch.
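
A short sketch of the promotion rule described above, using plain PyTorch; the expected dtypes follow PyTorch's reduction type promotion, which is the behavior the fix aligns DIPU with.

```python
import torch

b = torch.tensor([True, True, False])
i = torch.tensor([2, 3, 4], dtype=torch.int32)

# With no explicit dtype, prod promotes bool/int inputs to int64.
print(torch.prod(b).dtype)                     # torch.int64
print(torch.prod(i).dtype)                     # torch.int64

# An explicit dtype is respected; no promotion is applied.
print(torch.prod(i, dtype=torch.int32).dtype)  # torch.int32

# The dim variant is already handled correctly upstream (see note above).
print(torch.prod(i, dim=0).dtype)              # torch.int64
```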

* [dipu] fix typo PREFERED -> PREFERRED (#545)

* [dicp][ascend] add dicp ci for ascend (#540)

* disable autocompare for _amp_foreach_non_finite_check_and_unscale_ (#543)

* Update QuickStart.md

* revert unnecessary changes

* fix linter errors and implement getRuntimeVersion & getDriverVersion for kunlunxin

* change device from XPU to KLX

* fix build

* remove unused code

* use DIPU_LOG instead of printf

* change kunlunxin device key from xpu to klx

---------

Co-authored-by: Chengyuan Li <[email protected]>
Co-authored-by: Aaron <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: ustclight-sls <[email protected]>
Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
Co-authored-by: lyp-liuyipeng <[email protected]>
Co-authored-by: zhaochaoxing <[email protected]>
1 parent 155eea3 commit d032a06
Showing 10 changed files with 438 additions and 1 deletion.
5 changes: 5 additions & 0 deletions dipu/CMakeLists.txt
@@ -19,6 +19,7 @@ list(APPEND DEVICE_ASCEND "ASCEND" "ascend")
list(APPEND DEVICE_TOPSRIDER "TOPS" "tops" "TOPSRIDER" "topsrider")
list(APPEND DEVICE_SUPA "SUPA" "supa")
list(APPEND DEVICE_DROPLET "DROPLET" "droplet")
list(APPEND DEVICE_KUNLUNXIN "kunlunxin" "klx")

execute_process(COMMAND git rev-parse --short HEAD
OUTPUT_VARIABLE DIPU_GIT_HASH)
@@ -50,6 +51,10 @@ elseif (${DEVICE} IN_LIST DEVICE_DROPLET)
set(USE_DROPLET ON)
set(UsedVendor droplet)
set(DIOPI_IMPL_OPT "droplet")
elseif (${DEVICE} IN_LIST DEVICE_KUNLUNXIN)
set(USE_KUNLUNXIN ON)
set(UsedVendor kunlunxin)
set(DIOPI_IMPL_OPT "kunlunxin")
else()
message(FATAL_ERROR "No implementation module is compiled, cmake requires option -DDEVICE=CAMB or CUDA or ASCEND or SUPA")
endif()
2 changes: 1 addition & 1 deletion dipu/scripts/autogen_diopi_wrapper/diopi_functions.yaml
@@ -2436,7 +2436,7 @@
- schema: copy_(Tensor(a!) self, Tensor src, bool non_blocking=False) -> Tensor(a!)
dummy_call_diopi: True
custom_fallback: True
device: [cuda, camb, ascend, droplet, supa]
device: [cuda, camb, ascend, droplet, supa, kunlunxin]
custom_code_at_the_beginning: |
dipu::getDipuCopyInstance()->run(self, src, non_blocking);
return self;
1 change: 1 addition & 0 deletions dipu/torch_dipu/csrc_dipu/runtime/device/basedef.h
@@ -38,6 +38,7 @@ enum class VendorDeviceType : enum_t {
GCU, // gcu
SUPA, // Biren
DROPLET, // droplet
KLX, // Kunlunxin
};

enum class EventStatus : enum_t { PENDING, RUNNING, DEFERRED, READY };
2 changes: 2 additions & 0 deletions dipu/torch_dipu/csrc_dipu/utils/helpfunc.hpp
@@ -20,6 +20,8 @@ constexpr const char* VendorTypeToStr(VendorDeviceType t) noexcept {
return "SUPA";
case VendorDeviceType::DROPLET:
return "DROPLET";
case VendorDeviceType::KLX:
return "KLX";
}
return "null";
}
14 changes: 14 additions & 0 deletions dipu/torch_dipu/csrc_dipu/vendor/kunlunxin/CMakeLists.txt
@@ -0,0 +1,14 @@
cmake_minimum_required(VERSION 3.14)

set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_CURRENT_SOURCE_DIR}/cmake)

include(cmake/FindKLXRuntime.cmake)

message(STATUS XPURT_INCLUDE_DIR ${XPURT_INCLUDE_DIR})

set(VENDOR_INCLUDE_DIRS ${VENDOR_INCLUDE_DIRS} ${XPURT_INCLUDE_DIR} ${XDNN_INCLUDE_DIR} PARENT_SCOPE)
set(VENDOR_LIB_DIRS ${VENDOR_LIB_DIRS} ${XPURT_LIBRARIES} ${XDNN_LIBRARIES} PARENT_SCOPE)
#set(DIPU_VENDOR_LIB ${DIPU_VENDOR_LIB} xpurt xpuapi PARENT_SCOPE)

file(GLOB SRC_FILES *.cpp)
set(VENDOR_FILES ${SRC_FILES} PARENT_SCOPE)
29 changes: 29 additions & 0 deletions dipu/torch_dipu/csrc_dipu/vendor/kunlunxin/KLXGeneratorImpl.cpp
@@ -0,0 +1,29 @@
#include <ATen/Functions.h>
#include <ATen/Utils.h>

#include <csrc_dipu/runtime/core/DIPUGeneratorImpl.h>
#include <csrc_dipu/runtime/core/DIPUGuard.h>
#include <csrc_dipu/runtime/device/deviceapis.h>

namespace dipu {

// Discriminate floating device type.
// static bool is_floating_device = true;

// just an example
// not implemented now
class KLXGeneratorImpl : public dipu::DIPUGeneratorImpl {
public:
KLXGeneratorImpl(at::DeviceIndex device_index)
: dipu::DIPUGeneratorImpl(device_index) {}

void set_state(const c10::TensorImpl& state) override {}

void update_state() const override {}
};

const at::Generator vendorMakeGenerator(at::DeviceIndex device_index) {
return at::make_generator<KLXGeneratorImpl>(device_index);
}

} // namespace dipu
49 changes: 49 additions & 0 deletions dipu/torch_dipu/csrc_dipu/vendor/kunlunxin/cmake/FindKLXRuntime.cmake
@@ -0,0 +1,49 @@
set(XPURT_TOOLKIT_ROOT /workspace/baidu/personal-code/diopi/xpu_toolchain/xpurt)
set(XDNN_TOOLKIT_ROOT /workspace/baidu/personal-code/diopi/xpu_toolchain/xdnn)

include(FindPackageHandleStandardArgs)

## xdnn
find_path(XDNN_INCLUDE_DIR
NAMES xpu/xdnn.h
HINTS ${XDNN_TOOLKIT_ROOT}/include
$ENV{XDNN_TOOLKIT_ROOT}/include
)
message("XDNN_INCLUDE_DIR:" ${XDNN_INCLUDE_DIR})
find_library(XDNN_LIBRARIES
NAMES xpuapi
HINTS ${XDNN_TOOLKIT_ROOT}/so
$ENV{XDNN_TOOLKIT_ROOT}/so
)
message("XDNN_TOOLKIT_ROOT: " ${XDNN_TOOLKIT_ROOT})
message("XDNN_LIBRARIES:" ${XDNN_LIBRARIES})
if(NOT XDNN_INCLUDE_DIR OR NOT XDNN_LIBRARIES)
message(FATAL_ERROR "Cannot find Xdnn TOOLKIT for kunlunxin, set ENV 'XDNN_TOOLKIT_ROOT' correctly")
endif()

## runtime
find_path(XPURT_INCLUDE_DIR
NAMES xpu/runtime.h
HINTS ${XPURT_TOOLKIT_ROOT}/include
$ENV{XPURT_TOOLKIT_ROOT}/include
)
message("XPURT_INCLUDE_DIR:" ${XPURT_INCLUDE_DIR})
find_library(XPURT_LIBRARIES
NAMES xpurt
HINTS ${XPURT_TOOLKIT_ROOT}/so
$ENV{XPURT_TOOLKIT_ROOT}/so
)
message("XPURT_LIBRARIES:" ${XPURT_LIBRARIES})
if(NOT XPURT_INCLUDE_DIR OR NOT XPURT_LIBRARIES)
message(FATAL_ERROR "Cannot find XPURT TOOLKIT for kunlunxin, set ENV 'XPURT_TOOLKIT_ROOT' correctly")
endif()

find_package_handle_standard_args(XPURT DEFAULT_MSG
XPURT_INCLUDE_DIR
XPURT_LIBRARIES)

find_package_handle_standard_args(XDNN DEFAULT_MSG
XDNN_INCLUDE_DIR
XDNN_LIBRARIES)

mark_as_advanced(XPURT_INCLUDE_DIR XPURT_LIBRARIES XDNN_INCLUDE_DIR XDNN_LIBRARIES)
81 changes: 81 additions & 0 deletions dipu/torch_dipu/csrc_dipu/vendor/kunlunxin/communicatorimpl.cpp
@@ -0,0 +1,81 @@
#include <stdexcept>
#include <string>

#include <c10/core/ScalarType.h>
#include <torch/csrc/distributed/c10d/Types.hpp>

#include <csrc_dipu/common.h>
#include <csrc_dipu/runtime/device/diclapis.h>

namespace dipu {

namespace devapis {

const int DICL_UNIQUE_ID_BYTES_SIZE = 0;

DIPU_API diclResult_t diclGetCommAsyncError(diclComm_t comm) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclGetUniqueId(pcclUniqueId* uniqueId) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclCommInitRank(diclComm_t* comm, int nranks,
pcclUniqueId uniqueId, int rank,
int localDeviceId) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclCommDestroy(diclComm_t comm) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclAllReduce(const void* sendbuff, void* recvbuff,
size_t count, at::ScalarType datatype,
const ReduceOp& reduceOp, diclComm_t comm,
deviceStream_t stream) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclBroadcast(const void* sendbuff, void* recvbuff,
size_t count, at::ScalarType datatype,
int root, diclComm_t comm,
deviceStream_t stream) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclAllGather(const void* sendBuf, void* recvBuf,
size_t count, at::ScalarType datatype,
diclComm_t comm, deviceStream_t stream) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclReduce(const void* sendbuff, void* recvbuff,
size_t count, at::ScalarType datatype,
const ReduceOp& reduceOp, int root,
diclComm_t comm, deviceStream_t stream) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclReduceScatter(
void* sendBuf, void* recvBuf, size_t recvCount, at::ScalarType datatype,
const ReduceOp& reduceOp, diclComm_t comm, deviceStream_t stream) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclSend(void* sendbuff, size_t count,
at::ScalarType datatype, int peer,
diclComm_t comm, deviceStream_t stream) {
return DICL_ERR_UNDEF;
}

DIPU_API diclResult_t diclRecv(void* recvbuff, size_t count,
at::ScalarType datatype, int peer,
diclComm_t comm, deviceStream_t stream) {
return DICL_ERR_UNDEF;
}

} // end namespace devapis

} // end namespace dipu
