Could you provide instructions on how to run experiments in a multi-node, multi-GPU setting without using submitit? For example, I have 2 nodes, each with 16 GPUs. How should I modify the scripts you provide to reproduce the reported results?
Thanks!
This is the part where we set up distributed training.
If you use MPI to launch multi-node training, you can replace --submit with --summit in *.sh.
Depending on how you launch multi-node training, you might need to use this part or this part.
They differ in how they set up environment variables.
It is hard for me to tell exactly which modification you need from the information you have given.
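To illustrate how launchers differ in the environment variables they set, here is a minimal sketch, not the repository's actual code. The helper name `resolve_dist_env` is hypothetical. The variable names are the ones exported by `torchrun`, Open MPI's `mpirun`, and SLURM's `srun`; which of these applies to your cluster is an assumption you would need to check.

```python
import os


def resolve_dist_env(env=None):
    """Return (rank, world_size, local_rank) from launcher-set variables.

    Hypothetical helper for illustration; not part of the repository.
    """
    env = os.environ if env is None else env
    # Each tuple: (global rank, world size, local rank on the node).
    candidates = [
        ("RANK", "WORLD_SIZE", "LOCAL_RANK"),  # torchrun
        ("OMPI_COMM_WORLD_RANK", "OMPI_COMM_WORLD_SIZE",
         "OMPI_COMM_WORLD_LOCAL_RANK"),  # Open MPI (mpirun)
        ("SLURM_PROCID", "SLURM_NTASKS", "SLURM_LOCALID"),  # SLURM (srun)
    ]
    for rank_key, size_key, local_key in candidates:
        if rank_key in env and size_key in env:
            return (
                int(env[rank_key]),
                int(env[size_key]),
                int(env.get(local_key, 0)),
            )
    # No launcher variables found: fall back to a single process.
    return 0, 1, 0


if __name__ == "__main__":
    rank, world_size, local_rank = resolve_dist_env()
    print(f"rank={rank} world_size={world_size} local_rank={local_rank}")
```

With 2 nodes of 16 GPUs each, you would expect a world size of 32 and local ranks from 0 to 15 on each node. If the training code uses PyTorch's `env://` initialization, `MASTER_ADDR` and `MASTER_PORT` also need to be set to the same values on every process.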