Can't get Deepspeed to work on Ubuntu 22.04 #3531
AlpsAficionado started this conversation in General
Replies: 2 comments
-
Here is the output of 'ds_report' in my working conda environment in the VM.
-
Just the same error...
-
Hello esteemed Deepspeed community.
I've spent several hours bashing my head against getting deepspeed to function properly on my system. I run oobabooga/text-generation-webui inside an Ubuntu 22.04 VM on my server (Nvidia Quadro RTX 8000 with 48GB of VRAM; 128GB of system RAM; the VM is given total control of the RTX 8000 via PCI passthrough/IOMMU, and has 96GB of system RAM allocated to it).
After hours of frustration, I was unable to get deepspeed to function with oobabooga on Ubuntu 22.04.
I was able to get it to run on Ubuntu 23.04; however, 22.04 is the current long-term support (LTS) release, and even on 23.04 it took significant manual intervention to get it going.
A rundown of my issues with 22.04:
Error:
ModuleNotFoundError: No module named 'deepspeed'
Solution:
pip install deepspeed
Error: ModuleNotFoundError: No module named 'mpi4py'
Solution:
sudo apt install libopenmpi-dev ; pip install mpi4py
Error: 'pip install mpi4py' won't work; it crashes like so:
Solution:
env LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/ pip install --no-cache-dir mpi4py
Solution: conda install -c conda-forge gcc=12.1.0 (this rebuilds/reinstalls a whole bunch of crap; see below):
Full error spew:
Solution: I don't know; this is where I am stuck. #1037 suggests that I just need to 'apt install libaio-dev', but I've done that and it doesn't help.
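For anyone trying to reproduce this, here is a minimal sketch of a check that can be run with the same Python interpreter that launches the webui. It reports whether `deepspeed` and `mpi4py` are visible to that interpreter and whether the dynamic loader can locate a libaio runtime library. The helper names `missing_modules` and `find_shared_library` are made up for this example. The check does not import or build anything, and finding the runtime library does not prove the libaio-dev headers are present, so treat a clean result as a hint rather than a confirmation.

```python
import ctypes.util
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot locate.

    Uses importlib's finder machinery only; nothing is actually imported.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


def find_shared_library(name):
    """Return the library file name the loader resolves (or None if not found).

    `name` is given without the 'lib' prefix or suffix, e.g. 'aio' for libaio.
    """
    return ctypes.util.find_library(name)


if __name__ == "__main__":
    # Shows which environment is really being used (conda env vs. system Python).
    print("interpreter:", sys.executable)
    print("missing modules:", missing_modules(["deepspeed", "mpi4py"]))
    print("libaio runtime:", find_shared_library("aio"))
```

If the interpreter path printed is not the one inside the conda environment, the `pip install` commands above may have installed into a different Python than the one being run.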
I'm still stuck and cannot for the life of me get the --deepspeed option working on 22.04. I'd truly appreciate any help. Thanks to all in advance!