[dipu]add ascend profiler #476

Merged: 7 commits into main on Nov 30, 2023

Conversation

caikun-pjlab (Contributor) commented Nov 29, 2023

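Integrates Huawei's AscendCL Profiling API to collect performance trace points.

In some scenarios the collected data is currently incomplete; this is being discussed with the community: https://gitee.com/ascend/modelzoo/issues/I8KEBB?from=project-issue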

caikun-pjlab added the DIPU label (DIPU related) Nov 29, 2023
caikun-pjlab changed the title from "Caikun/add ascend profiler" to "[DIPU]add ascend profiler" Nov 29, 2023
caikun-pjlab changed the title from "[DIPU]add ascend profiler" to "[dipu]add ascend profiler" Nov 29, 2023
lljbash (Collaborator) left a comment

The test part looks fine.

caikun-pjlab merged commit 84ee546 into main Nov 30, 2023
19 checks passed
KevinfromTJ pushed a commit that referenced this pull request Dec 4, 2023
* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming
ustclight-sls pushed a commit to DeepLink-org/deeplink.framework.dev that referenced this pull request Dec 8, 2023
* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming
mrdanielw pushed a commit that referenced this pull request Dec 13, 2023
* Create main readme

* Update readme.md

* Update readme.md

* Update readme.md

* add clone kineto for dicp (#457)

add clone kineto for dicp

* [dicp][ascend] infer op result_info (#448)

* finish res_op_infer for softmax+log_softmax+add+amax(keepdim=True) pass static test

* repeal modification to diopi

* modify operator logic in /DIPU/dicp/dicp/dynamo_bridge/operator.py to support test of'infer_result'

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* fix gettupleelem in topsgraph

---------

Co-authored-by: jinminxi104 <[email protected]>

* Fdy/enhance copy (#430)

* mv vopy file path

* add new copy

* fix static param err

* fix copy err

* fix direct copy bug

* rm unused bcast template name

* change clang format

* change name hpp

* rm unused header file

* remove unused header 2

* change override behavior

* change comment

* change cudacopy

* fix d2d copy err

* change register to use autogen

* revert incorrect format

* config fallback

* fix link err

* fix comment wanglei

* add newline

* fix cpu copy err

* add camb vendor copy

* fix copy err

* fix copy err 2

* fix compile err

* fix lingjie comment1

* fix caikun comment

* fix camb ci

* fix camb ci

* fix device switch err

* fix ling jie caikun comment 2

* fix comment incorrect local  ref

* change init copy

* update DIOPI submodule (#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)

* [dipu] Fix copy_ fallback of topsrider. (#477)

* [dicp][tops] Add dicp ci of tops. (#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (#474)

* Fdy/fix copy tidy (#471)

* fix tidy 0

* fix clang tidy copy

* fix lingjie comment

* add tidy msg

* fix lint comment

* fix format

* add copy right

* fuj/ add ceil.out (#480)

* add ceil.out

* add floor_ and cases for floor_, ceil and ceil_

* [dipu] tidy some source files and update nv build script (#453)

* fix: tidy some source files
- and also update build nv script

* fix: make clang-format v16 happy

* fix: make clang-format v16 happy

* fix: remove usings and simplify some code

* fix: remove index

* fix: remove initialized_

* fix: add keyword VERSION

* fix: remove VERSION 3.25 as CI is using CMake 3.22

* add 910B CI && remove 910 CI && update DIOPI (#481)

* add 910b

* add 910b

* add 910b

* add 910b

* add resnet50

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* rm nouse code

* update DIOPI submodule (#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (#428)

* [dipu] Fix copy_ fallback of topsrider. (#477)

* [dicp][tops] Add dicp ci of tops. (#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (#474)

* rm 910 ci

* update diopi

* rm 910

---------

Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: wugeshui <[email protected]>

* [dipu]add ascend profiler (#476)

* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming

* add dipu ci on dicp (#488)

* [dicp][ascend] fix ascend mm/bmm on 910B (#482)

* mock torch.cuda.XXXTensor (#462)

* mock torch.cuda.XXXTensor

* add newline at end of file

* fix conflict

* fix format

* fix format

* fix comment

* Fix `multiprocessing.Process` tests not collected by coverage and gcov (#486)

* Fix `multiprocessing.Process` tests not collected by coverage and gcov

* fix --concurrency=multiprocessing

* [dipu] update tidy configuration and remove if-constexpr in C++14 (#470)

* fix: update tidy config and remove if-constexpr

* fix: it should be a list instead of bool value

* feat: update clangd config

* fix: move the comment out of yaml scalar

* docs: add comments

* fix: add DeviceIndex

* fix: add some checks for headers

* feat: update .clang-tidy

* add profiler readme (#489)

* add profiler readme

* Update readme.md

* update

* Update readme.md

* Update readme.md

* Update readme.md

---------

Co-authored-by: caikun-pjlab <[email protected]>

* [dicp][tops] support outputs with inplace copy (#440)

* add dipu stream synchronize.

* adjust some ops.

* fix some paras error and rename device name.

* unset keep_inference_input_mutations.

* fix paras error in conversion.

* fix para dtype conversion.

* fix empty output and inplace copy of input paras in optimizer case.

* remove inplace output gen_empty_tensor.

* Ywt/fix autocompare compile error (#492)

* pass string to python

* disable _amp_foreach_non_finite_check_and_unscale_ autocompare

* [dipu] Wx/support the test for llm inference (#454)

* add one iter for llm

* add bert ci using the correct transformers repository

* add test for the inference of llama 7b using the transformers repository

* one iter test for traditional models by default

* fix bug

* add test for the inference of internlm 7b using the transformers repository

* test for torch_dipu

* set device check args other for maximum.out

* fix the partition arg parsing bug on cuda

* test the setting of CUDA_PARTITION

* fix the bug of setting CUDA_PARTATION

* add llm

* add llm

* optimize the selection of model list

* set pythonpath for torch_dipu

* test

* fix bug in the command of setting pythonpath

---------

Co-authored-by: wugeshui <[email protected]>

* [DIPU]Wx/check the status of build dipu (#490)

* check the status of build dipu on camb and nv

* add check for ascend

* fix the bug of pipe

* [DIPU] Wx/add schema for logical or and logical not ops (#484)

* add schema for logical or and logical not ops

* fix bug and add test cases for these ops

* add the test case: out is empty tensor

* [dicp][ascend] infer op resinfo (part 2) (#491)

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* finish res_op_infer for more simple operators

* Update operator.py

delete some unnecessary print()

* Update operator.py

clean code

* finish operators' info inference except for those having trouble testing solely without inference and operators involving Reshape still have problems

* clean code format

* Update warning message output in operator.py

* extract common function for general binary and unary operator ,add op bmm's inference

* Update ascend_op.py

delete unuse param

* update DIOPI submodule (#485)

* update DIOPI submodule

* update submodule

* temporily forbid resnet50

* move the testing code to dir under torch_dipu (#465)

* move the testing code to dir under torch_dipu

* fix a little bug

* create two soft link to avoid import torch_dipu  too early.

* add one more soft link file to solve bugs.

* support dev fork ci (#496)

* support dev fork ci

* [dipu] add markdownlint and update most markdown files (#493)

* doc: update docs and add markdownlint

* doc: rename readme.md to README.md

* fix: remove MD013

* doc: format

* [dicp][tops] Support some ops for stable-diffusion. (#467)

* Add sin, cos, erf, split.

1. Generalize MakeTuple in tops_op.
2. Generalize make_const in enflame codegen.
3. Add sin, cos, erf, split for tops.
4. Format Python code in dicp tops.

* refine code

* fix abs test path

* clean up code of split.

* adjust const op generation.

* fix nullptr case in const generation.

---------

Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>

* [DIPU] Wx/modify maximum schema due to the case in the inference of internlm (#494)

* improve maximum schema due to the case in the inference of internlm

* fix bug according to comments

* fix bug

* [both] fix, format and remove spaces in README.md (#497)

* doc(readme): fix, format and remove spaces

* fix: typo and try auto-correct

* feat(ci): add autocorrect into ci

* fix: remove autocorrect form ci as it's not ready

* update env python 3.10 (#503)

* fix clang tidy

* [dicp][ascend] get soc_version from aclrt (#505)

* fix clang tidy

* fix format

* fix format

---------

Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
brianlcy123 pushed a commit to brianlcy123/deeplink.framework that referenced this pull request Dec 21, 2023
* Create main readme

* Update readme.md

* Update readme.md

* Update readme.md

* add clone kineto for dicp (DeepLink-org#457)

add clone kineto for dicp

* [dicp][ascend] infer op result_info (DeepLink-org#448)

* finish res_op_infer for softmax+log_softmax+add+amax(keepdim=True) pass static test

* repeal modification to diopi

* modify operator logic in /DIPU/dicp/dicp/dynamo_bridge/operator.py to support test of'infer_result'

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* fix gettupleelem in topsgraph

---------

Co-authored-by: jinminxi104 <[email protected]>

* Fdy/enhance copy (DeepLink-org#430)

* mv vopy file path

* add new copy

* fix static param err

* fix copy err

* fix direct copy bug

* rm unused bcast template name

* change clang format

* change name hpp

* rm unused header file

* remove unused header 2

* change override behavior

* change comment

* change cudacopy

* fix d2d copy err

* change register to use autogen

* revert incorrect format

* config fallback

* fix link err

* fix comment wanglei

* add newline

* fix cpu copy err

* add camb vendor copy

* fix copy err

* fix copy err 2

* fix compile err

* fix lingjie comment1

* fix caikun comment

* fix camb ci

* fix camb ci

* fix device switch err

* fix ling jie caikun comment 2

* fix comment incorrect local  ref

* change init copy

* update DIOPI submodule (DeepLink-org#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (DeepLink-org#428)

* [dipu] Fix copy_ fallback of topsrider. (DeepLink-org#477)

* [dicp][tops] Add dicp ci of tops. (DeepLink-org#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (DeepLink-org#474)

* Fdy/fix copy tidy (DeepLink-org#471)

* fix tidy 0

* fix clang tidy copy

* fix lingjie comment

* add tidy msg

* fix lint comment

* fix format

* add copy right

* fuj/ add ceil.out (DeepLink-org#480)

* add ceil.out

* add floor_ and cases for floor_, ceil and ceil_

* [dipu] tidy some source files and update nv build script (DeepLink-org#453)

* fix: tidy some source files
- and also update build nv script

* fix: make clang-format v16 happy

* fix: make clang-format v16 happy

* fix: remove usings and simplify some code

* fix: remove index

* fix: remove initialized_

* fix: add keyword VERSION

* fix: remove VERSION 3.25 as CI is using CMake 3.22

* add 910B CI && remove 910 CI && update DIOPI (DeepLink-org#481)

* add 910b

* add 910b

* add 910b

* add 910b

* add resnet50

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* rm nouse code

* update DIOPI submodule (DeepLink-org#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (DeepLink-org#428)

* [dipu] Fix copy_ fallback of topsrider. (DeepLink-org#477)

* [dicp][tops] Add dicp ci of tops. (DeepLink-org#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (DeepLink-org#474)

* rm 910 ci

* update diopi

* rm 910

---------

Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: wugeshui <[email protected]>

* [dipu]add ascend profiler (DeepLink-org#476)

* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming

* add dipu ci on dicp (DeepLink-org#488)

* [dicp][ascend] fix ascend mm/bmm on 910B (DeepLink-org#482)

* mock torch.cuda.XXXTensor (DeepLink-org#462)

* mock torch.cuda.XXXTensor

* add newline at end of file

* fix conflict

* fix format

* fix format

* fix comment

* Fix `multiprocessing.Process` tests not collected by coverage and gcov (DeepLink-org#486)

* Fix `multiprocessing.Process` tests not collected by coverage and gcov

* fix --concurrency=multiprocessing

* [dipu] update tidy configuration and remove if-constexpr in C++14 (DeepLink-org#470)

* fix: update tidy config and remove if-constexpr

* fix: it should be a list instead of bool value

* feat: update clangd config

* fix: move the comment out of yaml scalar

* docs: add comments

* fix: add DeviceIndex

* fix: add some checks for headers

* feat: update .clang-tidy

* add profiler readme (DeepLink-org#489)

* add profiler readme

* Update readme.md

* update

* Update readme.md

* Update readme.md

* Update readme.md

---------

Co-authored-by: caikun-pjlab <[email protected]>

* [dicp][tops] support outputs with inplace copy (DeepLink-org#440)

* add dipu stream synchronize.

* adjust some ops.

* fix some paras error and rename device name.

* unset keep_inference_input_mutations.

* fix paras error in conversion.

* fix para dtype conversion.

* fix empty output and inplace copy of input paras in optimizer case.

* remove inplace output gen_empty_tensor.

* Ywt/fix autocompare compile error (DeepLink-org#492)

* pass string to python

* disable _amp_foreach_non_finite_check_and_unscale_ autocompare

* [dipu] Wx/support the test for llm inference (DeepLink-org#454)

* add one iter for llm

* add bert ci using the correct transformers repository

* add test for the inference of llama 7b using the transformers repository

* one iter test for traditional models by default

* fix bug

* add test for the inference of internlm 7b using the transformers repository

* test for torch_dipu

* set device check args other for maximum.out

* fix the partition arg parsing bug on cuda

* test the setting of CUDA_PARTITION

* fix the bug of setting CUDA_PARTATION

* add llm

* add llm

* optimize the selection of model list

* set pythonpath for torch_dipu

* test

* fix bug in the command of setting pythonpath

---------

Co-authored-by: wugeshui <[email protected]>

* [DIPU]Wx/check the status of build dipu (DeepLink-org#490)

* check the status of build dipu on camb and nv

* add check for ascend

* fix the bug of pipe

* [DIPU] Wx/add schema for logical or and logical not ops (DeepLink-org#484)

* add schema for logical or and logical not ops

* fix bug and add test cases for these ops

* add the test case: out is empty tensor

* [dicp][ascend] infer op resinfo (part 2) (DeepLink-org#491)

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* finish res_op_infer for more simple operators

* Update operator.py

delete some unnecessary print()

* Update operator.py

clean code

* finish operators' info inference except for those having trouble testing solely without inference and operators involving Reshape still have problems

* clean code format

* Update warning message output in operator.py

* extract common function for general binary and unary operator ,add op bmm's inference

* Update ascend_op.py

delete unuse param

* update DIOPI submodule (DeepLink-org#485)

* update DIOPI submodule

* update submodule

* temporily forbid resnet50

* move the testing code to dir under torch_dipu (DeepLink-org#465)

* move the testing code to dir under torch_dipu

* fix a little bug

* create two soft link to avoid import torch_dipu  too early.

* add one more soft link file to solve bugs.

* support dev fork ci (DeepLink-org#496)

* support dev fork ci

* [dipu] add markdownlint and update most markdown files (DeepLink-org#493)

* doc: update docs and add markdownlint

* doc: rename readme.md to README.md

* fix: remove MD013

* doc: format

* [dicp][tops] Support some ops for stable-diffusion. (DeepLink-org#467)

* Add sin, cos, erf, split.

1. Generalize MakeTuple in tops_op.
2. Generalize make_const in enflame codegen.
3. Add sin, cos, erf, split for tops.
4. Format Python code in dicp tops.

* refine code

* fix abs test path

* clean up code of split.

* adjust const op generation.

* fix nullptr case in const generation.

---------

Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>

* [DIPU] Wx/modify maximum schema due to the case in the inference of internlm (DeepLink-org#494)

* improve maximum schema due to the case in the inference of internlm

* fix bug according to comments

* fix bug

* [both] fix, format and remove spaces in README.md (DeepLink-org#497)

* doc(readme): fix, format and remove spaces

* fix: typo and try auto-correct

* feat(ci): add autocorrect into ci

* fix: remove autocorrect form ci as it's not ready

* update env python 3.10 (DeepLink-org#503)

* fix clang tidy

* [dicp][ascend] get soc_version from aclrt (DeepLink-org#505)

* fix clang tidy

* fix format

* fix format

---------

Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
brianlcy123 pushed a commit to brianlcy123/deeplink.framework that referenced this pull request Dec 21, 2023
* Create main readme

* Update readme.md

* Update readme.md

* Update readme.md

* add clone kineto for dicp (DeepLink-org#457)

add clone kineto for dicp

* [dicp][ascend] infer op result_info (DeepLink-org#448)

* finish res_op_infer for softmax+log_softmax+add+amax(keepdim=True) pass static test

* repeal modification to diopi

* modify operator logic in /DIPU/dicp/dicp/dynamo_bridge/operator.py to support test of'infer_result'

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* fix gettupleelem in topsgraph

---------

Co-authored-by: jinminxi104 <[email protected]>

* Fdy/enhance copy (DeepLink-org#430)

* mv vopy file path

* add new copy

* fix static param err

* fix copy err

* fix direct copy bug

* rm unused bcast template name

* change clang format

* change name hpp

* rm unused header file

* remove unused header 2

* change override behavior

* change comment

* change cudacopy

* fix d2d copy err

* change register to use autogen

* revert incorrect format

* config fallback

* fix link err

* fix comment wanglei

* add newline

* fix cpu copy err

* add camb vendor copy

* fix copy err

* fix copy err 2

* fix compile err

* fix lingjie comment1

* fix caikun comment

* fix camb ci

* fix camb ci

* fix device switch err

* fix ling jie caikun comment 2

* fix comment incorrect local  ref

* change init copy

* update DIOPI submodule (DeepLink-org#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (DeepLink-org#428)

* [dipu] Fix copy_ fallback of topsrider. (DeepLink-org#477)

* [dicp][tops] Add dicp ci of tops. (DeepLink-org#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (DeepLink-org#474)

* Fdy/fix copy tidy (DeepLink-org#471)

* fix tidy 0

* fix clang tidy copy

* fix lingjie comment

* add tidy msg

* fix lint comment

* fix format

* add copy right

* fuj/ add ceil.out (DeepLink-org#480)

* add ceil.out

* add floor_ and cases for floor_, ceil and ceil_

* [dipu] tidy some source files and update nv build script (DeepLink-org#453)

* fix: tidy some source files
- and also update build nv script

* fix: make clang-format v16 happy

* fix: make clang-format v16 happy

* fix: remove usings and simplify some code

* fix: remove index

* fix: remove initialized_

* fix: add keyword VERSION

* fix: remove VERSION 3.25 as CI is using CMake 3.22

* add 910B CI && remove 910 CI && update DIOPI (DeepLink-org#481)

* add 910b

* add 910b

* add 910b

* add 910b

* add resnet50

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* fix bugs

* rm nouse code

* update DIOPI submodule (DeepLink-org#458)

* update DIOPI submodule

* diopi update to main

* update mmcv version

* update submodule

* update mmcv commit id

* feat: pass CMAKE_BUILD_TYPE into DIOPI (DeepLink-org#428)

* [dipu] Fix copy_ fallback of topsrider. (DeepLink-org#477)

* [dicp][tops] Add dicp ci of tops. (DeepLink-org#469)

* Add dicp ci of tops.

* Fix dicp ci of tops.

* fix recycle dep (DeepLink-org#474)

* rm 910 ci

* update diopi

* rm 910

---------

Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: wugeshui <[email protected]>

* [dipu]add ascend profiler (DeepLink-org#476)

* add ascend profiler

* support with_stack

* code format

* fix clang tidy

* optimize naming

* optimize naming

* add dipu ci on dicp (DeepLink-org#488)

* [dicp][ascend] fix ascend mm/bmm on 910B (DeepLink-org#482)

* mock torch.cuda.XXXTensor (DeepLink-org#462)

* mock torch.cuda.XXXTensor

* add newline at end of file

* fix conflict

* fix format

* fix format

* fix comment

* Fix `multiprocessing.Process` tests not collected by coverage and gcov (DeepLink-org#486)

* Fix `multiprocessing.Process` tests not collected by coverage and gcov

* fix --concurrency=multiprocessing

* [dipu] update tidy configuration and remove if-constexpr in C++14 (DeepLink-org#470)

* fix: update tidy config and remove if-constexpr

* fix: it should be a list instead of bool value

* feat: update clangd config

* fix: move the comment out of yaml scalar

* docs: add comments

* fix: add DeviceIndex

* fix: add some checks for headers

* feat: update .clang-tidy

* add profiler readme (DeepLink-org#489)

* add profiler readme

* Update readme.md

* update

* Update readme.md

* Update readme.md

* Update readme.md

---------

Co-authored-by: caikun-pjlab <[email protected]>

* [dicp][tops] support outputs with inplace copy (DeepLink-org#440)

* add dipu stream synchronize.

* adjust some ops.

* fix some paras error and rename device name.

* unset keep_inference_input_mutations.

* fix paras error in conversion.

* fix para dtype conversion.

* fix empty output and inplace copy of input paras in optimizer case.

* remove inplace output gen_empty_tensor.

* Ywt/fix autocompare compile error (DeepLink-org#492)

* pass string to python

* disable _amp_foreach_non_finite_check_and_unscale_ autocompare

* [dipu] Wx/support the test for llm inference (DeepLink-org#454)

* add one iter for llm

* add bert ci using the correct transformers repository

* add test for the inference of llama 7b using the transformers repository

* one iter test for traditional models by default

* fix bug

* add test for the inference of internlm 7b using the transformers repository

* test for torch_dipu

* set device check args other for maximum.out

* fix the partition arg parsing bug on cuda

* test the setting of CUDA_PARTITION

* fix the bug of setting CUDA_PARTATION

* add llm

* add llm

* optimize the selection of model list

* set pythonpath for torch_dipu

* test

* fix bug in the command of setting pythonpath

---------

Co-authored-by: wugeshui <[email protected]>

* [DIPU]Wx/check the status of build dipu (DeepLink-org#490)

* check the status of build dipu on camb and nv

* add check for ascend

* fix the bug of pipe

* [DIPU] Wx/add schema for logical or and logical not ops (DeepLink-org#484)

* add schema for logical or and logical not ops

* fix bug and add test cases for these ops

* add the test case: out is empty tensor

* [dicp][ascend] infer op resinfo (part 2) (DeepLink-org#491)

* fix a bug in get_cast_dtype: type(int+bool) should be int

* clean code format

* finish res_op_infer for more simple operators

* Update operator.py

delete some unnecessary print()

* Update operator.py

clean code

* finish operators' info inference except for those having trouble testing solely without inference and operators involving Reshape still have problems

* clean code format

* Update warning message output in operator.py

* extract common function for general binary and unary operator ,add op bmm's inference

* Update ascend_op.py

delete unuse param

* update DIOPI submodule (DeepLink-org#485)

* update DIOPI submodule

* update submodule

* temporily forbid resnet50

* move the testing code to dir under torch_dipu (DeepLink-org#465)

* move the testing code to dir under torch_dipu

* fix a little bug

* create two soft link to avoid import torch_dipu  too early.

* add one more soft link file to solve bugs.

* support dev fork ci (DeepLink-org#496)

* support dev fork ci

* [dipu] add markdownlint and update most markdown files (DeepLink-org#493)

* doc: update docs and add markdownlint

* doc: rename readme.md to README.md

* fix: remove MD013

* doc: format

* [dicp][tops] Support some ops for stable-diffusion. (DeepLink-org#467)

* Add sin, cos, erf, split.

1. Generalize MakeTuple in tops_op.
2. Generalize make_const in enflame codegen.
3. Add sin, cos, erf, split for tops.
4. Format Python code in dicp tops.

* refine code

* fix abs test path

* clean up code of split.

* adjust const op generation.

* fix nullptr case in const generation.

---------

Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>

* [DIPU] Wx/modify maximum schema due to the case in the inference of internlm (DeepLink-org#494)

* improve maximum schema due to the case in the inference of internlm

* fix bug according to comments

* fix bug

* [both] fix, format and remove spaces in README.md (DeepLink-org#497)

* doc(readme): fix, format and remove spaces

* fix: typo and try auto-correct

* feat(ci): add autocorrect into ci

* fix: remove autocorrect form ci as it's not ready

* update env python 3.10 (DeepLink-org#503)

* fix clang tidy

* [dicp][ascend] get soc_version from aclrt (DeepLink-org#505)

* fix clang tidy

* fix format

* fix format

---------

Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
brianlcy123 pushed a commit to brianlcy123/deeplink.framework that referenced this pull request Dec 21, 2023
caikun-pjlab added a commit that referenced this pull request Dec 22, 2023
* add kunlunxin backend

* add kunlunxin device

* update copy_ for kunlunxin

* lcy/clang-tidy (#483)

* fix namespace declaration format

* update diopi_functions.yaml

* update clang-tidy

* update clang-tidy

* change tab into spaces

* allow const_cast

* fix bug

* fix comment

* fix comments

* fix comments

* [FIX] fix virtual memory error of using SUPA (#468)

* [FIX] fix virtual memory of SUPA

* [FIX] fix incorrect copy

* [FIX] remove useless copy and add missing 'supa' in CMakeLists.txt

* make conv2d out at right memory-format (#502)

* [dicp][ascend] add fusion switch file for ascend (#512)

* [dipu] Speedup profiler ctor when not enabled (#526)

* speedup profiler ctor

* clean & format include

* [DIPU]clang-tidy_shanhang (#516)

* Speedup dumpOnArgLevel by using lazy initialization (#524)

* [dicp][ascend] fuse transpose/mm in ascendgraph (#523)

* [dicp][ascend] remove unnecessary broadcast (#527)

* update kineto (#530)

* [dicp][ascend] opt inplace copy (#533)

* opt copy inplace

* optimize load_and_run

* remove check return value if (#534)

* [dipu] Optimize `getAllocator` by adopting lookup table (#532)

* [dipu] Optimize `getAllocator` by adopting lookup table

* fix typos & clean includes

* resolve comments

* shrink lookup table & speedup devproxy::getDeviceCount

* Op preference mem format (#525)

* add memory preference in op for camb.
This change adds a TAG in diopi_functions.yaml; the autogen replaces it with the preferred memory format, depending on the device's convert_config.yaml

* fix bug found in ci running

* improve the code according to the comment.

* improve code format.

* improve CMakeLists.txt code.
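
The tag substitution described in this commit can be sketched roughly as below. This is only an illustration under stated assumptions: the tag text, the `preferred_memory_format` key, and the `fill_memory_format` helper are all hypothetical, since the commit message does not give the real names used in diopi_functions.yaml or convert_config.yaml.

```python
# Hypothetical placeholder tag; the real tag name is not given in the commit message.
TAG = "${PREFERRED_MEMORY_FORMAT_PLACEHOLDER}"


def fill_memory_format(template, device_config, default="contiguous"):
    """Replace the tag in a template string with the device's preferred format.

    `device_config` stands in for values loaded from a per-device
    convert_config.yaml; the key name used here is an assumption.
    """
    preferred = device_config.get("preferred_memory_format", default)
    return template.replace(TAG, preferred)


# Example: a device that prefers channels_last, and one with no preference.
template = "memory_format: " + TAG
camb_like = fill_memory_format(template, {"preferred_memory_format": "channels_last"})
no_pref = fill_memory_format(template, {})
```

Doing the replacement at code-generation time means one yaml entry can serve several devices without per-device branches in the generated code.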

* lyp_clang_tidy: warning uint64_t->int (#518)

* clang_tidy: torch_dipu/csrc_dipu/profiler/CorrelationIDManager.cpp, CorrelationIDManager.h

* clang_tidy dipu/torch_dipu/csrc_dipu/profiler/DIPUDeviceActivity.cpp .h

* clang_tidy:torch_dipu/csrc_dipu/profiler/profiler.cpp

* clang_tidy:torch_dipu/csrc_dipu/profiler/patch.cpp

* clang_tidy:torch_dipu/csrc_dipu/profiler/patch.cpp --v2

* clang_tidy:dipu/torch_dipu/csrc_dipu/runtime/core/allocator/DIPUBFCachingAllocator.cpp

* clang_tidy:dipu/torch_dipu/csrc_dipu/runtime/core/allocator/DIPUBFCachingAllocator.cpp -v2

* clang_tidy: dipu/torch_dipu/csrc_dipu/runtime/core/DIPUEvent.h

* clang_tidy: torch_dipu/csrc_dipu/profiler/profiler.h --v2

* clang_tidy: torch_dipu/csrc_dipu/profiler/DIPUDeviceActivity.cpp --v2

* clang_tidy: torch_dipu/csrc_dipu/profiler/CorrelationIDManager.cpp .h --v2

* clang_tidy: magic number; const_cast

* clang_tidy: fix some review issues

* clang_tidy: modify format by using run_format.sh

* [dipu] fix: `torch.prod` int type promotion (#541)

`prod` (and other reduction ops) should promote int type (including `bool`) to `int64` when `dtype` is not explicitly provided.

Only `prod` (without `dim`) should be taken care of, because the other cases are already correctly handled in PyTorch.
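
The promotion rule described above can be illustrated with a small pure-Python sketch. The helper name `prod_result_dtype` and the use of plain strings for dtypes are assumptions made for illustration; this is not the DIPU implementation.

```python
# Dtypes modelled as plain strings; this set is an illustrative assumption.
INTEGRAL_DTYPES = {"bool", "uint8", "int8", "int16", "int32", "int64"}


def prod_result_dtype(input_dtype, dtype=None):
    """Return the result dtype of a reduction such as prod.

    An explicitly requested dtype always wins; otherwise integral inputs
    (including bool) are promoted to int64 and other dtypes are kept.
    """
    if dtype is not None:
        return dtype
    if input_dtype in INTEGRAL_DTYPES:
        return "int64"
    return input_dtype
```

For instance, `prod_result_dtype("bool")` gives `"int64"`, while `prod_result_dtype("int32", dtype="int32")` keeps `"int32"` because the dtype was given explicitly.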

* [dipu] fix typo PREFERED -> PREFERRED (#545)

* [dicp][ascend] add dicp ci for ascend (#540)

* disable autocompare for _amp_foreach_non_finite_check_and_unscale_ (#543)

* Update QuickStart.md

* revert unnecessary changes

* fix linter errors and implement getRuntimeVersion and getDriverVersion for kunlunxin

* change device from XPU to KLX

* fix build

* remove unused code

* use DIPU_LOG instead of printf

* change kunlunxin device key from xpu to klx

---------

Co-authored-by: Chengyuan Li <[email protected]>
Co-authored-by: Aaron <[email protected]>
Co-authored-by: wyz5864 <[email protected]>
Co-authored-by: tangzhiyi11 <[email protected]>
Co-authored-by: Lingjie <[email protected]>
Co-authored-by: ustclight-sls <[email protected]>
Co-authored-by: MiaoYYu <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: Juntao Chen <[email protected]>
Co-authored-by: jinminxi104 <[email protected]>
Co-authored-by: fandaoyi <[email protected]>
Co-authored-by: Peter Ye <[email protected]>
Co-authored-by: wiryls <[email protected]>
Co-authored-by: yaofengchen <[email protected]>
Co-authored-by: Fu Jingguo <[email protected]>
Co-authored-by: hellozmz <[email protected]>
Co-authored-by: wugeshui <[email protected]>
Co-authored-by: CyCle1024 <[email protected]>
Co-authored-by: caikun-pjlab <[email protected]>
Co-authored-by: Joyce YU <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: POI-WX <[email protected]>
Co-authored-by: HuayiL <[email protected]>
Co-authored-by: Reinerzhou <[email protected]>
Co-authored-by: liwenjian-sensetime <[email protected]>
Co-authored-by: shanhang <[email protected]>
Co-authored-by: lyp-liuyipeng <[email protected]>
Co-authored-by: zhaochaoxing <[email protected]>
Labels
DIPU DIPU related
3 participants