1. One build error that should be fixed

When installing Apex, the build fails with four errors about "conversion from unsigned long to long". You need to edit apex_22.01_pp/csrc/mlp.cpp:

(1) Line 65:
auto reserved_space = at::empty({reserved_size}, inputs[0].type());
change to:
auto reserved_space = at::empty({static_cast<long>(reserved_size)}, inputs[0].type());

(2) Line 138:
auto work_space = at::empty({work_size / sizeof(scalar_t)}, inputs[0].type());
change to:
auto work_space = at::empty({static_cast<long>(work_size / sizeof(scalar_t))}, inputs[0].type());

Alternatively, change the compile options so that this conversion is not treated as an error.
2. An improvement that reduces CUDA memory usage

When launching owl_demo.py on a GPU with 16 GB of memory, I ran into a CUDA out-of-memory error. I edited lines 33 and 34 in interface.py:

model = model.to(device)
model = model.to(dtype)

change to:

model = model.to(dtype)
model = model.to(device)

After the demo starts, memory usage is about 14 GB, and it runs well on a 16 GB GPU.
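The ordering matters because calling .to(dtype) while the model is still in CPU RAM means only the smaller, already-converted weights are ever copied to the GPU; in the original order, the full-precision weights land on the GPU first. A back-of-envelope sketch with illustrative numbers (the parameter count and dtypes are assumptions, not measurements of owl_demo.py):

```python
# Why dtype-first lowers peak GPU memory: compare the bytes that must
# reside on the GPU under each ordering of .to(device) and .to(dtype).
PARAMS = 7_000_000_000          # assumed parameter count, for illustration only
BYTES_FP32, BYTES_FP16 = 4, 2   # bytes per parameter in float32 vs float16

# Order A: model.to(device) then model.to(dtype)
# fp32 weights are transferred first, then converted on the GPU.
peak_a_gib = PARAMS * BYTES_FP32 / 2**30

# Order B: model.to(dtype) then model.to(device)
# the cast happens in CPU RAM, so only fp16 weights ever touch the GPU.
peak_b_gib = PARAMS * BYTES_FP16 / 2**30

print(f"device-first peak: ~{peak_a_gib:.0f} GiB; dtype-first peak: ~{peak_b_gib:.0f} GiB")
```

Under these assumed numbers, dtype-first stays under a 16 GiB budget while device-first does not, matching the behavior reported above.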