Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fdy/support hf lora amp #308

Merged
merged 13 commits into from
Oct 10, 2023
Merged

Fdy/support hf lora amp #308

merged 13 commits into from
Oct 10, 2023

Conversation

fandaoyi
Copy link
Collaborator

@fandaoyi fandaoyi commented Sep 22, 2023

  1. 重写了部分 torch/csrc/device 的逻辑,dipu 可以在python层显示为 ‘cuda’ (包括 device, print tensor, generator, storage 都会显示 ‘device=cuda’)。 并fix了部分受影响的ci 和其他部分(主要是storage) (camb 的test 受影响比较多,主要是目前test 的设备还是 ’dipu‘导致的, 另外 改为cuda 会多跑一些测试。 后续计划先使用 DIPU_PYTHON_DEVICE_AS_CUDA= false 暂时绕过。 等凌劼的 ci梳理后一起处理。)
  2. 新增了一个弱符号接口 getDeviceStatus() 用于获取 mem_info 等设备实时信息, 另外对原来的 getDevicePropertiesFromCache 和新接口做了一定程度的统一。
  3. 删除了部分多余的 generator 逻辑(已经和蔡坤同步过)。

Copy link
Collaborator

@lljbash lljbash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个这两天会合吗,我最近会把 test_op 整个目录改掉

lljbash
lljbash previously requested changes Oct 9, 2023
dipu/torch_dipu/csrc_dipu/binding/patchCsrcDevice.cpp Outdated Show resolved Hide resolved
return THPUtils_packString(oss.str().c_str());
}

static struct PyGetSetDef DIPU_THPDevice_properties[] = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

constexpr

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

应该不能用 const, 后面要赋值给 tp_getset, 那个不是 const的。

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这居然是个数组- -看起来更可怕了...
好像也没啥好改法,要不然加个注释说这玩意儿传给 pytorch 随他用去了

dipu/torch_dipu/csrc_dipu/runtime/devproxy/deviceproxy.cpp Outdated Show resolved Hide resolved
dipu/torch_dipu/csrc_dipu/runtime/devproxy/deviceproxy.cpp Outdated Show resolved Hide resolved
dipu/torch_dipu/csrc_dipu/runtime/devproxy/deviceproxy.cpp Outdated Show resolved Hide resolved
using dipu::devapis::DIPUDeviceProperties;
using c10::DeviceIndex;

DeviceIndex num_gpus = -1;
c10::once_flag init_flag;
std::deque<c10::once_flag> device_flags;
std::vector<DIPUDeviceProperties> device_properties;
std::vector<shared_ptr<DIPUDeviceProperties>> device_properties;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里相关的修改的目的是什么

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

原来的逻辑会把裸指针传到 python层。用 shared_ptr pybind11 可以和cpp层共同管理 生命周期。

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

所以这个对象在 python 层可能会活得比 cpp 层久嘛

Copy link
Collaborator Author

@fandaoyi fandaoyi Oct 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

应该不会。但是之前传一个裸指针过去, 还要改变 pybind 默认的析构行为 py::return_value_policy::reference,还是有些奇怪的。也不太符合 pybind 推荐的模式

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@fandaoyi fandaoyi requested a review from lljbash October 10, 2023 04:21
@fandaoyi fandaoyi dismissed lljbash’s stale review October 10, 2023 07:02

seems remaining comments are enhanced suggestions, not mandatory。 they shouldn’t be ‘request’ but ‘comments’

@fandaoyi fandaoyi merged commit 0a371a8 into main Oct 10, 2023
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants