[dicp][tops] Run llama_finetune success. #440

Merged: 8 commits, Dec 4, 2023

Reinerzhou
Member

No description provided.

Reinerzhou force-pushed the zhousl/llama_finetune_2 branch 2 times, most recently from 24594ed to 9d10a4a, on November 22, 2023 02:57
jinminxi104 changed the title from "[dicp] Run llama_finetune success." to "[dicp][tops] Run llama_finetune success." on Nov 22, 2023
jinminxi104 added the DICP label (DICP related) on Nov 22, 2023
Three review threads on dicp/dicp/vendor/TopsGraph/codegen/enflame.py (outdated, resolved)
Reinerzhou force-pushed the zhousl/llama_finetune_2 branch 2 times, most recently from 7e04f52 to 6def324, on November 24, 2023 08:05
Reinerzhou force-pushed the zhousl/llama_finetune_2 branch 2 times, most recently from 6e28374 to 0ed522d, on November 28, 2023 09:49
Reinerzhou force-pushed the zhousl/llama_finetune_2 branch from 0ed522d to b334150 on November 29, 2023 02:51
jinminxi104 merged commit d5900e7 into main on Dec 4, 2023
7 checks passed
jinminxi104 deleted the zhousl/llama_finetune_2 branch on December 4, 2023 07:42
LeungChiNan pushed a commit to DeepLink-org/deeplink.framework.dev that referenced this pull request Dec 8, 2023
ustclight-sls pushed a commit to DeepLink-org/deeplink.framework.dev that referenced this pull request Dec 8, 2023
* add dipu stream synchronize.

* adjust some ops.

* fix some paras error and rename device name.

* unset keep_inference_input_mutations.

* fix paras error in conversion.

* fix para dtype conversion.

* fix empty output and inplace copy of input paras in optimizer case.

* remove inplace output gen_empty_tensor.
mrdanielw pushed a commit that referenced this pull request Dec 13, 2023
* Create main readme

* Update readme.md

* Update readme.md

* Update readme.md

* add clone kineto for dicp (#457)

add clone kineto for dicp

* [dicp][ascend] infer op result_info (#448)

* finish res_op_infer for softmax+log_softmax+add+amax(keepdim=True) pass static test

* repeal modification to diopi

* modify operator logic in /DIPU/dicp/dicp/dynamo_bridge/operator.py to support test of 'infer_result'

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* fix gettupleelem in topsgraph

---------

Co-authored-by: jinminxi104 <[email protected]>

* Fdy/enhance copy (#430)

* mv copy file path

* add new copy

* fix static param err

* fix copy err

* fix direct copy bug

* rm unused bcast template name

* change clang format

* change name hpp

* rm unused header file

* remove unused header 2

* change override behavior

* change comment

* change cudacopy

* fix d2d copy err

* change register to use autogen

* revert incorrect format

* config fallback

* fix link err

* fix comment wanglei

* add newline

* fix cpu copy err

* add camb vendor copy

* fix copy err

* fix copy err 2

* fix compile err

* fix lingjie comment1

* fix caikun comment

* fix camb ci

* fix camb ci

* fix device switch err

* fix ling jie caikun comment 2

* fix comment incorrect local  ref

* change init copy

* update DIOPI submodule (#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)

* [dipu] Fix copy_ fallback of topsrider. (#477)

* [dicp][tops] Add dicp ci of tops. (#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (#474)

* Fdy/fix copy tidy (#471)

* fix tidy 0

* fix clang tidy copy

* fix lingjie comment

* add tidy msg

* fix lint comment

* fix format

* add copy right

* fuj/ add ceil.out (#480)

* add ceil.out

* add floor_ and cases for floor_, ceil and ceil_

* [dipu] tidy some source files and update nv build script (#453)

* fix: tidy some source files
- and also update build nv script

* fix: make clang-format v16 happy

* fix: make clang-format v16 happy

* fix: remove usings and simplify some code

* fix: remove index

* fix: remove initialized_

* fix: add keyword VERSION

* fix: remove VERSION 3.25 as CI is using CMake 3.22

* add 910B CI && remove 910 CI && update DIOPI (#481)

* add 910b

* add 910b

* add 910b

* add 910b

* add resnet50

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* rm nouse code

* update DIOPI submodule (#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)

* [dipu] Fix copy_ fallback of topsrider. (#477)

* [dicp][tops] Add dicp ci of tops. (#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (#474)

* rm 910 ci

* update diopi

* rm 910

---------

Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: wugeshui <[email protected]>

* [dipu]add ascend profiler (#476)

* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming

* add dipu ci on dicp (#488)

* [dicp][ascend] fix ascend mm/bmm on 910B (#482)

* mock torch.cuda.XXXTensor (#462)

* mock torch.cuda.XXXTensor

* add newline at end of file

* fix conflict

* fix format

* fix format

* fix comment

* Fix `multiprocessing.Process` tests not collected by coverage and gcov (#486)

* Fix `multiprocessing.Process` tests not collected by coverage and gcov

* fix --concurrency=multiprocessing

* [dipu] update tidy configuration and remove if-constexpr in C++14 (#470)

* fix: update tidy config and remove if-constexpr

* fix: it should be a list instead of bool value

* feat: update clangd config

* fix: move the comment out of yaml scalar

* docs: add comments

* fix: add DeviceIndex

* fix: add some checks for headers

* feat: update .clang-tidy

* add profiler readme (#489)

* add profiler readme

* Update readme.md

* update

* Update readme.md

* Update readme.md

* Update readme.md

---------

Co-authored-by: caikun-pjlab <[email protected]>

* [dicp][tops] support outputs with inplace copy (#440)

* add dipu stream synchronize.

* adjust some ops.

* fix some paras error and rename device name.

* unset keep_inference_input_mutations.

* fix paras error in conversion.

* fix para dtype conversion.

* fix empty output and inplace copy of input paras in optimizer case.

* remove inplace output gen_empty_tensor.

* Ywt/fix autocompare compile error (#492)

* pass string to python

* disable _amp_foreach_non_finite_check_and_unscale_ autocompare

* [dipu] Wx/support the test for llm inference (#454)

* add one iter for llm

* add bert ci using the correct transformers repository

* add test for the inference of llama 7b using the transformers repository

* one iter test for traditional models by default

* fix bug

* add test for the inference of internlm 7b using the transformers repository

* test for torch_dipu

* set device check args other for maximum.out

* fix the partition arg parsing bug on cuda

* test the setting of CUDA_PARTITION

* fix the bug of setting CUDA_PARTATION

* add llm

* add llm

* optimize the selection of model list

* set pythonpath for torch_dipu

* test

* fix bug in the command of setting pythonpath

---------

Co-authored-by: wugeshui <[email protected]>

* [DIPU]Wx/check the status of build dipu (#490)

* check the status of build dipu on camb and nv

* add check for ascend

* fix the bug of pipe

* [DIPU] Wx/add schema for logical or and logical not ops (#484)

* add schema for logical or and logical not ops

* fix bug and add test cases for these ops

* add the test case: out is empty tensor

* [dicp][ascend] infer op resinfo (part 2) (#491)

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* finish res_op_infer for more simple operators

* Update operator.py

delete some unnecessary print()

* Update operator.py

clean code

* finish operators' info inference except for those having trouble testing solely without inference and operators involving Reshape still have problems

* clean code format

* Update warning message output in operator.py

* extract common function for general binary and unary operator ,add op bmm's inference

* Update ascend_op.py

delete unuse param

* update DIOPI submodule (#485)

* update DIOPI submodule

* update submodule

* temporarily forbid resnet50

* move the testing code to dir under torch_dipu (#465)

* move the testing code to dir under torch_dipu

* fix a little bug

* create two soft link to avoid import torch_dipu  too early.

* add one more soft link file to solve bugs.

* support dev fork ci (#496)

* support dev fork ci

* [dipu] add markdownlint and update most markdown files (#493)

* doc: update docs and add markdownlint

* doc: rename readme.md to README.md

* fix: remove MD013

* doc: format

* [dicp][tops] Support some ops for stable-diffusion. (#467)

* Add sin, cos, erf, split.

1. Generalize MakeTuple in tops_op.
2. Generalize make_const in enflame codegen.
3. Add sin, cos, erf, split for tops.
4. Format Python code in dicp tops.

* refine code

* fix abs test path

* clean up code of split.

* adjust const op generation.

* fix nullptr case in const generation.

---------

Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>

* [DIPU] Wx/modify maximum schema due to the case in the inference of internlm (#494)

* improve maximum schema due to the case in the inference of internlm

* fix bug according to comments

* fix bug

* [both] fix, format and remove spaces in README.md (#497)

* doc(readme): fix, format and remove spaces

* fix: typo and try auto-correct

* feat(ci): add autocorrect into ci

* fix: remove autocorrect from ci as it's not ready

* update env python 3.10 (#503)

* fix clang tidy

* [dicp][ascend] get soc_version from aclrt (#505)

* fix clang tidy

* fix format

* fix format

---------

Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
brianlcy123 pushed a commit to brianlcy123/deeplink.framework that referenced this pull request Dec 21, 2023
* Create main readme

* Update readme.md

* Update readme.md

* Update readme.md

* add clone kineto for dicp (DeepLink-org#457)

add clone kineto for dicp

* [dicp][ascend] infer op result_info (DeepLink-org#448)

* finish res_op_infer for softmax+log_softmax+add+amax(keepdim=True) pass static test

* repeal modification to diopi

* modify operator logic in /DIPU/dicp/dicp/dynamo_bridge/operator.py to support test of 'infer_result'

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* fix gettupleelem in topsgraph

---------

Co-authored-by: jinminxi104 <[email protected]>

* Fdy/enhance copy (DeepLink-org#430)

* mv copy file path

* add new copy

* fix static param err

* fix copy err

* fix direct copy bug

* rm unused bcast template name

* change clang format

* change name hpp

* rm unused header file

* remove unused header 2

* change override behavior

* change comment

* change cudacopy

* fix d2d copy err

* change register to use autogen

* revert incorrect format

* config fallback

* fix link err

* fix comment wanglei

* add newline

* fix cpu copy err

* add camb vendor copy

* fix copy err

* fix copy err 2

* fix compile err

* fix lingjie comment1

* fix caikun comment

* fix camb ci

* fix camb ci

* fix device switch err

* fix ling jie caikun comment 2

* fix comment incorrect local  ref

* change init copy

* update DIOPI submodule (DeepLink-org#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (DeepLink-org#428)

* [dipu] Fix copy_ fallback of topsrider. (DeepLink-org#477)

* [dicp][tops] Add dicp ci of tops. (DeepLink-org#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (DeepLink-org#474)

* Fdy/fix copy tidy (DeepLink-org#471)

* fix tidy 0

* fix clang tidy copy

* fix lingjie comment

* add tidy msg

* fix lint comment

* fix format

* add copy right

* fuj/ add ceil.out (DeepLink-org#480)

* add ceil.out

* add floor_ and cases for floor_, ceil and ceil_

* [dipu] tidy some source files and update nv build script (DeepLink-org#453)

* fix: tidy some source files
- and also update build nv script

* fix: make clang-format v16 happy

* fix: make clang-format v16 happy

* fix: remove usings and simplify some code

* fix: remove index

* fix: remove initialized_

* fix: add keyword VERSION

* fix: remove VERSION 3.25 as CI is using CMake 3.22

* add 910B CI && remove 910 CI && update DIOPI (DeepLink-org#481)

* add 910b

* add 910b

* add 910b

* add 910b

* add resnet50

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* rm nouse code

* update DIOPI submodule (DeepLink-org#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (DeepLink-org#428)

* [dipu] Fix copy_ fallback of topsrider. (DeepLink-org#477)

* [dicp][tops] Add dicp ci of tops. (DeepLink-org#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (DeepLink-org#474)

* rm 910 ci

* update diopi

* rm 910

---------

Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: wugeshui <[email protected]>

* [dipu]add ascend profiler (DeepLink-org#476)

* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming

* add dipu ci on dicp (DeepLink-org#488)

* [dicp][ascend] fix ascend mm/bmm on 910B (DeepLink-org#482)

* mock torch.cuda.XXXTensor (DeepLink-org#462)

* mock torch.cuda.XXXTensor

* add newline at end of file

* fix conflict

* fix format

* fix format

* fix comment

* Fix `multiprocessing.Process` tests not collected by coverage and gcov (DeepLink-org#486)

* Fix `multiprocessing.Process` tests not collected by coverage and gcov

* fix --concurrency=multiprocessing

* [dipu] update tidy configuration and remove if-constexpr in C++14 (DeepLink-org#470)

* fix: update tidy config and remove if-constexpr

* fix: it should be a list instead of bool value

* feat: update clangd config

* fix: move the comment out of yaml scalar

* docs: add comments

* fix: add DeviceIndex

* fix: add some checks for headers

* feat: update .clang-tidy

* add profiler readme (DeepLink-org#489)

* add profiler readme

* Update readme.md

* update

* Update readme.md

* Update readme.md

* Update readme.md

---------

Co-authored-by: caikun-pjlab <[email protected]>

* [dicp][tops] support outputs with inplace copy (DeepLink-org#440)

* add dipu stream synchronize.

* adjust some ops.

* fix some paras error and rename device name.

* unset keep_inference_input_mutations.

* fix paras error in conversion.

* fix para dtype conversion.

* fix empty output and inplace copy of input paras in optimizer case.

* remove inplace output gen_empty_tensor.

* Ywt/fix autocompare compile error (DeepLink-org#492)

* pass string to python

* disable _amp_foreach_non_finite_check_and_unscale_ autocompare

* [dipu] Wx/support the test for llm inference (DeepLink-org#454)

* add one iter for llm

* add bert ci using the correct transformers repository

* add test for the inference of llama 7b using the transformers repository

* one iter test for traditional models by default

* fix bug

* add test for the inference of internlm 7b using the transformers repository

* test for torch_dipu

* set device check args other for maximum.out

* fix the partition arg parsing bug on cuda

* test the setting of CUDA_PARTITION

* fix the bug of setting CUDA_PARTATION

* add llm

* add llm

* optimize the selection of model list

* set pythonpath for torch_dipu

* test

* fix bug in the command of setting pythonpath

---------

Co-authored-by: wugeshui <[email protected]>

* [DIPU]Wx/check the status of build dipu (DeepLink-org#490)

* check the status of build dipu on camb and nv

* add check for ascend

* fix the bug of pipe

* [DIPU] Wx/add schema for logical or and logical not ops (DeepLink-org#484)

* add schema for logical or and logical not ops

* fix bug and add test cases for these ops

* add the test case: out is empty tensor

* [dicp][ascend] infer op resinfo (part 2) (DeepLink-org#491)

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* finish res_op_infer for more simple operators

* Update operator.py

delete some unnecessary print()

* Update operator.py

clean code

* finish operators' info inference except for those having trouble testing solely without inference and operators involving Reshape still have problems

* clean code format

* Update warning message output in operator.py

* extract common function for general binary and unary operator ,add op bmm's inference

* Update ascend_op.py

delete unuse param

* update DIOPI submodule (DeepLink-org#485)

* update DIOPI submodule

* update submodule

* temporarily forbid resnet50

* move the testing code to dir under torch_dipu (DeepLink-org#465)

* move the testing code to dir under torch_dipu

* fix a little bug

* create two soft link to avoid import torch_dipu  too early.

* add one more soft link file to solve bugs.

* support dev fork ci (DeepLink-org#496)

* support dev fork ci

* [dipu] add markdownlint and update most markdown files (DeepLink-org#493)

* doc: update docs and add markdownlint

* doc: rename readme.md to README.md

* fix: remove MD013

* doc: format

* [dicp][tops] Support some ops for stable-diffusion. (DeepLink-org#467)

* Add sin, cos, erf, split.

1. Generalize MakeTuple in tops_op.
2. Generalize make_const in enflame codegen.
3. Add sin, cos, erf, split for tops.
4. Format Python code in dicp tops.

* refine code

* fix abs test path

* clean up code of split.

* adjust const op generation.

* fix nullptr case in const generation.

---------

Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>

* [DIPU] Wx/modify maximum schema due to the case in the inference of internlm (DeepLink-org#494)

* improve maximum schema due to the case in the inference of internlm

* fix bug according to comments

* fix bug

* [both] fix, format and remove spaces in README.md (DeepLink-org#497)

* doc(readme): fix, format and remove spaces

* fix: typo and try auto-correct

* feat(ci): add autocorrect into ci

* fix: remove autocorrect from ci as it's not ready

* update env python 3.10 (DeepLink-org#503)

* fix clang tidy

* [dicp][ascend] get soc_version from aclrt (DeepLink-org#505)

* fix clang tidy

* fix format

* fix format

---------

Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
brianlcy123 pushed a commit to brianlcy123/deeplink.framework that referenced this pull request Dec 21, 2023
* Create main readme

* Update readme.md

* Update readme.md

* Update readme.md

* add clone kineto for dicp (DeepLink-org#457)

add clone kineto for dicp

* [dicp][ascend] infer op result_info (DeepLink-org#448)

* finish res_op_infer for softmax+log_softmax+add+amax(keepdim=True) pass static test

* repeal modification to diopi

* modify operator logic in /DIPU/dicp/dicp/dynamo_bridge/operator.py to support test of 'infer_result'

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* fix gettupleelem in topsgraph

---------

Co-authored-by: jinminxi104 <[email protected]>

* Fdy/enhance copy (DeepLink-org#430)

* mv copy file path

* add new copy

* fix static param err

* fix copy err

* fix direct copy bug

* rm unused bcast template name

* change clang format

* change name hpp

* rm unused header file

* remove unused header 2

* change override behavior

* change comment

* change cudacopy

* fix d2d copy err

* change register to use autogen

* revert incorrect format

* config fallback

* fix link err

* fix comment wanglei

* add newline

* fix cpu copy err

* add camb vendor copy

* fix copy err

* fix copy err 2

* fix compile err

* fix lingjie comment1

* fix caikun comment

* fix camb ci

* fix camb ci

* fix device switch err

* fix ling jie caikun comment 2

* fix comment incorrect local  ref

* change init copy

* update DIOPI submodule (DeepLink-org#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (DeepLink-org#428)

* [dipu] Fix copy_ fallback of topsrider. (DeepLink-org#477)

* [dicp][tops] Add dicp ci of tops. (DeepLink-org#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (DeepLink-org#474)

* Fdy/fix copy tidy (DeepLink-org#471)

* fix tidy 0

* fix clang tidy copy

* fix lingjie comment

* add tidy msg

* fix lint comment

* fix format

* add copy right

* fuj/ add ceil.out (DeepLink-org#480)

* add ceil.out

* add floor_ and cases for floor_, ceil and ceil_

* [dipu] tidy some source files and update nv build script (DeepLink-org#453)

* fix: tidy some source files
- and also update build nv script

* fix: make clang-format v16 happy

* fix: make clang-format v16 happy

* fix: remove usings and simplify some code

* fix: remove index

* fix: remove initialized_

* fix: add keyword VERSION

* fix: remove VERSION 3.25 as CI is using CMake 3.22

* add 910B CI && remove 910 CI && update DIOPI (DeepLink-org#481)

* add 910b

* add 910b

* add 910b

* add 910b

* add resnet50

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* rm nouse code

* update DIOPI submodule (DeepLink-org#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (DeepLink-org#428)

* [dipu] Fix copy_ fallback of topsrider. (DeepLink-org#477)

* [dicp][tops] Add dicp ci of tops. (DeepLink-org#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (DeepLink-org#474)

* rm 910 ci

* update diopi

* rm 910

---------

Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: wugeshui <[email protected]>

* [dipu]add ascend profiler (DeepLink-org#476)

* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming

* add dipu ci on dicp (DeepLink-org#488)

* [dicp][ascend] fix ascend mm/bmm on 910B (DeepLink-org#482)

* mock torch.cuda.XXXTensor (DeepLink-org#462)

* mock torch.cuda.XXXTensor

* add newline at end of file

* fix conflict

* fix format

* fix format

* fix comment

* Fix `multiprocessing.Process` tests not collected by coverage and gcov (DeepLink-org#486)

* Fix `multiprocessing.Process` tests not collected by coverage and gcov

* fix --concurrency=multiprocessing

* [dipu] update tidy configuration and remove if-constexpr in C++14 (DeepLink-org#470)

* fix: update tidy config and remove if-constexpr

* fix: it should be a list instead of bool value

* feat: update clangd config

* fix: move the comment out of yaml scalar

* docs: add comments

* fix: add DeviceIndex

* fix: add some checks for headers

* feat: update .clang-tidy

* add profiler readme (DeepLink-org#489)

* add profiler readme

* Update readme.md

* update

* Update readme.md

* Update readme.md

* Update readme.md

---------

Co-authored-by: caikun-pjlab <[email protected]>

* [dicp][tops] support outputs with inplace copy (DeepLink-org#440)

* add dipu stream synchronize.

* adjust some ops.

* fix some paras error and rename device name.

* unset keep_inference_input_mutations.

* fix paras error in conversion.

* fix para dtype conversion.

* fix empty output and inplace copy of input paras in optimizer case.

* remove inplace output gen_empty_tensor.

* Ywt/fix autocompare compile error (DeepLink-org#492)

* pass string to python

* disable _amp_foreach_non_finite_check_and_unscale_ autocompare

* [dipu] Wx/support the test for llm inference (DeepLink-org#454)

* add one iter for llm

* add bert ci using the correct transformers repository

* add test for the inference of llama 7b using the transformers repository

* one iter test for traditional models by default

* fix bug

* add test for the inference of internlm 7b using the transformers repository

* test for torch_dipu

* set device check args other for maximum.out

* fix the partition arg parsing bug on cuda

* test the setting of CUDA_PARTITION

* fix the bug of setting CUDA_PARTATION

* add llm

* add llm

* optimize the selection of model list

* set pythonpath for torch_dipu

* test

* fix bug in the command of setting pythonpath

---------

Co-authored-by: wugeshui <[email protected]>

* [DIPU]Wx/check the status of build dipu (DeepLink-org#490)

* check the status of build dipu on camb and nv

* add check for ascend

* fix the bug of pipe

* [DIPU] Wx/add schema for logical or and logical not ops (DeepLink-org#484)

* add schema for logical or and logical not ops

* fix bug and add test cases for these ops

* add the test case: out is empty tensor

* [dicp][ascend] infer op resinfo (part 2) (DeepLink-org#491)

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* finish res_op_infer for more simple operators

* Update operator.py

delete some unnecessary print()

* Update operator.py

clean code

* finish operators' info inference except for those having trouble testing solely without inference and operators involving Reshape still have problems

* clean code format

* Update warning message output in operator.py

* extract common function for general binary and unary operators, add op bmm's inference

* Update ascend_op.py

delete unused param

* update DIOPI submodule (DeepLink-org#485)

* update DIOPI submodule

* update submodule

* temporarily forbid resnet50

* move the testing code to dir under torch_dipu (DeepLink-org#465)

* move the testing code to dir under torch_dipu

* fix a little bug

* create two soft links to avoid importing torch_dipu too early.

* add one more soft link file to solve bugs.

* support dev fork ci (DeepLink-org#496)

* support dev fork ci

* [dipu] add markdownlint and update most markdown files (DeepLink-org#493)

* doc: update docs and add markdownlint

* doc: rename readme.md to README.md

* fix: remove MD013

* doc: format

* [dicp][tops] Support some ops for stable-diffusion. (DeepLink-org#467)

* Add sin, cos, erf, split.

1. Generalize MakeTuple in tops_op.
2. Generalize make_const in enflame codegen.
3. Add sin, cos, erf, split for tops.
4. Format Python code in dicp tops.

* refine code

* fix abs test path

* clean up code of split.

* adjust const op generation.

* fix nullptr case in const generation.

---------

Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>

* [DIPU] Wx/modify maximum schema due to the case in the inference of internlm (DeepLink-org#494)

* improve maximum schema due to the case in the inference of internlm

* fix bug according to comments

* fix bug

* [both] fix, format and remove spaces in README.md (DeepLink-org#497)

* doc(readme): fix, format and remove spaces

* fix: typo and try auto-correct

* feat(ci): add autocorrect into ci

* fix: remove autocorrect from ci as it's not ready

* update env python 3.10 (DeepLink-org#503)

* fix clang tidy

* [dicp][ascend] get soc_version from aclrt (DeepLink-org#505)

* fix clang tidy

* fix format

* fix format

---------

Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
caikun-pjlab added a commit that referenced this pull request Dec 22, 2023
* add kunlunxin backend

* add kunlunxin device

* update copy_ for kunlunxin

* lcy/clang-tidy (#483)

* fix namespace declaration format

* update diopi_functions.yaml

* update clang-tidy

* update clang-tidy

* change tab into spaces

* allow const_cast

* fix bug

* fix comment

* fix comments

* fix comments

* [FIX] fix virtual memory error of using SUPA (#468)

* [FIX] fix virtual memory of SUPA

* [FIX] fix incorrect copy

* [FIX] remove useless copy and add missing 'supa'in cmakelists.txt

* make conv2d out at right memory-format (#502)

* [dicp][ascend] add fusion switch file for ascend (#512)

* [dipu] Speedup profiler ctor when not enabled (#526)

* speedup profiler ctor

* clean & format include

* [DIPU]clang-tidy_shanhang (#516)

* Create main readme

* Update readme.md

* Update readme.md

* Update readme.md

* add clone kineto for dicp (#457)

add clone kineto for dicp

* [dicp][ascend] infer op result_info (#448)

* finish res_op_infer for softmax+log_softmax+add+amax(keepdim=True) pass static test

* repeal modification to diopi

* modify operator logic in /DIPU/dicp/dicp/dynamo_bridge/operator.py to support test of 'infer_result'

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* fix gettupleelem in topsgraph

---------

Co-authored-by: jinminxi104 <[email protected]>

* Fdy/enhance copy (#430)

* mv copy file path

* add new copy

* fix static param err

* fix copy err

* fix direct copy bug

* rm unused bcast template name

* change clang format

* change name hpp

* rm unused header file

* remove unused header 2

* change override behavior

* change comment

* change cudacopy

* fix d2d copy err

* change register to use autogen

* revert incorrect format

* config fallback

* fix link err

* fix comment wanglei

* add newline

* fix cpu copy err

* add camb vendor copy

* fix copy err

* fix copy err 2

* fix compile err

* fix lingjie comment1

* fix caikun comment

* fix camb ci

* fix camb ci

* fix device switch err

* fix ling jie caikun comment 2

* fix comment incorrect local  ref

* change init copy

* update DIOPI submodule (#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)

* [dipu] Fix copy_ fallback of topsrider. (#477)

* [dicp][tops] Add dicp ci of tops. (#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (#474)

* Fdy/fix copy tidy (#471)

* fix tidy 0

* fix clang tidy copy

* fix lingjie comment

* add tidy msg

* fix lint comment

* fix format

* add copy right

* fuj/ add ceil.out (#480)

* add ceil.out

* add floor_ and cases for floor_, ceil and ceil_

* [dipu] tidy some source files and update nv build script (#453)

* fix: tidy some source files
- and also update build nv script

* fix: make clang-format v16 happy

* fix: make clang-format v16 happy

* fix: remove usings and simplify some code

* fix: remove index

* fix: remove initialized_

* fix: add keyword VERSION

* fix: remove VERSION 3.25 as CI is using CMake 3.22

* add 910B CI && remove 910 CI && update DIOPI (#481)

* add 910b

* add 910b

* add 910b

* add 910b

* add resnet50

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* rm nouse code

* update DIOPI submodule (#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)

* [dipu] Fix copy_ fallback of topsrider. (#477)

* [dicp][tops] Add dicp ci of tops. (#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (#474)

* rm 910 ci

* update diopi

* rm 910

---------

Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: wugeshui <[email protected]>

* [dipu]add ascend profiler (#476)

* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming

* add dipu ci on dicp (#488)

* [dicp][ascend] fix ascend mm/bmm on 910B (#482)

* mock torch.cuda.XXXTensor (#462)

* mock torch.cuda.XXXTensor

* add newline at end of file

* fix conflict

* fix format

* fix format

* fix comment

* Fix `multiprocessing.Process` tests not collected by coverage and gcov (#486)

* Fix `multiprocessing.Process` tests not collected by coverage and gcov

* fix --concurrency=multiprocessing

* [dipu] update tidy configuration and remove if-constexpr in C++14 (#470)

* fix: update tidy config and remove if-constexpr

* fix: it should be a list instead of bool value

* feat: update clangd config

* fix: move the comment out of yaml scalar

* docs: add comments

* fix: add DeviceIndex

* fix: add some checks for headers

* feat: update .clang-tidy

* add profiler readme (#489)

* add profiler readme

* Update readme.md

* update

* Update readme.md

* Update readme.md

* Update readme.md

---------

Co-authored-by: caikun-pjlab <[email protected]>

* [dicp][tops] support outputs with inplace copy (#440)

* add dipu stream synchronize.

* adjust some ops.

* fix some paras error and rename device name.

* unset keep_inference_input_mutations.

* fix paras error in conversion.

* fix para dtype conversion.

* fix empty output and inplace copy of input paras in optimizer case.

* remove inplace output gen_empty_tensor.

* Ywt/fix autocompare compile error (#492)

* pass string to python

* disable _amp_foreach_non_finite_check_and_unscale_ autocompare

* [dipu] Wx/support the test for llm inference (#454)

* add one iter for llm

* add bert ci using the correct transformers repository

* add test for the inference of llama 7b using the transformers repository

* one iter test for traditional models by default

* fix bug

* add test for the inference of internlm 7b using the transformers repository

* test for torch_dipu

* set device check args other for maximum.out

* fix the partition arg parsing bug on cuda

* test the setting of CUDA_PARTITION

* fix the bug of setting CUDA_PARTATION

* add llm

* add llm

* optimize the selection of model list

* set pythonpath for torch_dipu

* test

* fix bug in the command of setting pythonpath

---------

Co-authored-by: wugeshui <[email protected]>

* [DIPU]Wx/check the status of build dipu (#490)

* check the status of build dipu on camb and nv

* add check for ascend

* fix the bug of pipe

* [DIPU] Wx/add schema for logical or and logical not ops (#484)

* add schema for logical or and logical not ops

* fix bug and add test cases for these ops

* add the test case: out is empty tensor

* [dicp][ascend] infer op resinfo (part 2) (#491)

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* finish res_op_infer for more simple operators

* Update operator.py

delete some unnecessary print()

* Update operator.py

clean code

* finish operators' info inference except for those having trouble testing solely without inference and operators involving Reshape still have problems

* clean code format

* Update warning message output in operator.py

* extract common function for general binary and unary operators, add op bmm's inference

* Update ascend_op.py

delete unused param

* update DIOPI submodule (#485)

* update DIOPI submodule

* update submodule

* temporarily forbid resnet50

* move the testing code to dir under torch_dipu (#465)

* move the testing code to dir under torch_dipu

* fix a little bug

* create two soft links to avoid importing torch_dipu too early.

* add one more soft link file to solve bugs.

* support dev fork ci (#496)

* support dev fork ci

* [dipu] add markdownlint and update most markdown files (#493)

* doc: update docs and add markdownlint

* doc: rename readme.md to README.md

* fix: remove MD013

* doc: format

* [dicp][tops] Support some ops for stable-diffusion. (#467)

* Add sin, cos, erf, split.

1. Generalize MakeTuple in tops_op.
2. Generalize make_const in enflame codegen.
3. Add sin, cos, erf, split for tops.
4. Format Python code in dicp tops.

* refine code

* fix abs test path

* clean up code of split.

* adjust const op generation.

* fix nullptr case in const generation.

---------

Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>

* [DIPU] Wx/modify maximum schema due to the case in the inference of internlm (#494)

* improve maximum schema due to the case in the inference of internlm

* fix bug according to comments

* fix bug

* [both] fix, format and remove spaces in README.md (#497)

* doc(readme): fix, format and remove spaces

* fix: typo and try auto-correct

* feat(ci): add autocorrect into ci

* fix: remove autocorrect from ci as it's not ready

* update env python 3.10 (#503)

* fix clang tidy

* [dicp][ascend] get soc_version from aclrt (#505)

* fix clang tidy

* fix format

* fix format

---------

Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>

* Speedup dumpOnArgLevel by using lazy initialization (#524)

* [dicp][ascend] fuse transpose/mm in ascendgraph (#523)

* [dicp][ascend] remove unnecessary broadcast (#527)

* update kineto (#530)

* [dicp][ascend] opt inplace copy (#533)

* opt copy inplace

* optimize load_and_run

* remove check return value if (#534)

* [dipu] Optimize `getAllocator` by adopting lookup table (#532)

* [dipu] Optimize `getAllocator` by adopting lookup table

* fix typos & clean includes

* resolve comments

* shrink lookup table & speedup devproxy::getDeviceCount

* Op preference mem format (#525)

* add memory preference in op for camb.
This change adds a TAG in diopi_functions.yaml; the autogen replaces it with the preferred memory format according to the device's convert_config.yaml

* fix bug found in ci running

* improve the code according to the comment.

* improve code format.

* improve CMakeLists.txt code.

* lyp_clang_tidy: warning uint64_t->int (#518)

* clang_tidy: torch_dipu/csrc_dipu/profiler/CorrelationIDManager.cpp, CorrelationIDManager.h

* clang_tidy dipu/torch_dipu/csrc_dipu/profiler/DIPUDeviceActivity.cpp .h

* clang_tidy:torch_dipu/csrc_dipu/profiler/profiler.cpp

* clang_tidy:torch_dipu/csrc_dipu/profiler/patch.cpp

* clang_tidy:torch_dipu/csrc_dipu/profiler/patch.cpp --v2

* clang_tidy:dipu/torch_dipu/csrc_dipu/runtime/core/allocator/DIPUBFCachingAllocator.cpp

* clang_tidy:dipu/torch_dipu/csrc_dipu/runtime/core/allocator/DIPUBFCachingAllocator.cpp -v2

* clang_tidy: dipu/torch_dipu/csrc_dipu/runtime/core/DIPUEvent.h

* clang_tidy: torch_dipu/csrc_dipu/profiler/profiler.h --v2

* clang_tidy: torch_dipu/csrc_dipu/profiler/DIPUDeviceActivity.cpp --v2

* clang_tidy: torch_dipu/csrc_dipu/profiler/CorrelationIDManager.cpp .h --v2

* clang_tidy: magic number; const_cast

* clang_tidy: fix some review issues

* clang_tidy: modify format by using run_format.sh

* [dipu] fix: `torch.prod` int type promotion (#541)

`prod` (and other reduction ops) should promote int type (including `bool`) to `int64` when `dtype` is not explicitly provided.

Only `prod` (without `dim`) should be taken care of, because the other cases are already correctly handled in PyTorch.
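
The promotion rule described above can be sketched in plain Python. This is a minimal illustration only, not the code from #541: the helper name `reduction_result_dtype`, the `INTEGRAL_DTYPES` set, and the use of strings for dtypes are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical stand-ins for tensor dtypes, modeled as plain strings.
INTEGRAL_DTYPES = {"bool", "uint8", "int8", "int16", "int32", "int64"}


def reduction_result_dtype(input_dtype: str, dtype: Optional[str] = None) -> str:
    """Return the result dtype of a prod-like reduction (illustrative only)."""
    # An explicitly requested dtype always wins.
    if dtype is not None:
        return dtype
    # Integral inputs, including bool, are promoted to int64.
    if input_dtype in INTEGRAL_DTYPES:
        return "int64"
    # Floating-point and other inputs keep their dtype.
    return input_dtype


if __name__ == "__main__":
    for src in ("bool", "int32", "float32"):
        print(src, "->", reduction_result_dtype(src))
```

Passing an explicit `dtype` bypasses the promotion, which matches the "when `dtype` is not explicitly provided" condition in the commit message.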

* [dipu] fix typo PREFERED -> PREFERRED (#545)

* [dicp][ascend] add dicp ci for ascend (#540)

* disable autocompare for _amp_foreach_non_finite_check_and_unscale_ (#543)

* Update QuickStart.md

* revert unnecessary changes

* fix linter errors and implement getRuntimeVersion and getDriverVersion for kunlunxin

* change device from XPU to KLX

* fix build

* remove unused code

* use DIPU_LOG instead of printf

* change kunlunxin device key from xpu to klx

---------

Co-authored-by: Chengyuan Li <[email protected]>
Co-authored-by: Aaron <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: ustclight-sls <[email protected]>
Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
Co-authored-by: lyp-liuyipeng <[email protected]>
Co-authored-by: zhaochaoxing <[email protected]>