I tried to use mim to submit training tasks asynchronously on Slurm, using the following command:
mim train mmcls resnet101_b16x8_cifar10.py --launcher slurm --gpus 1 --gpus-per-node 1 --partition aide_dev --work-dir tmp --srun-args "--async -o /mnt/petrelfs/gaoshiqi/"
To submit asynchronously on Slurm and redirect the log to /mnt/petrelfs/gaoshiqi/, I added the parameter --srun-args "--async -o /mnt/petrelfs/gaoshiqi/".
However, the command executes successfully, but the task is not submitted to the Slurm cluster, and I can't find my log /mnt/petrelfs/gaoshiqi/phoenix-slurm-5181985.out.
The log is as follows:
I tried to find my log, but it does not exist:
I found the cause of the problem: with asynchronous submission, a batch script is generated automatically, and its content is wrong.
If the job-name parameter is not passed, mim generates one automatically, which misplaces the content of the batch script and causes the task submission to fail.
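Based on that diagnosis, a possible workaround is to pass the job name explicitly inside --srun-args, so that mim has no reason to generate one. The sketch below is untested on this cluster and rests on assumptions: that mim forwards --srun-args to srun unchanged, that it skips its own name when --job-name is already present, and that the job name shown (a hypothetical placeholder) is acceptable. It only prints the command so it can be checked before submitting.

```shell
# Hypothetical job name; replace with any name Slurm accepts.
JOB_NAME="resnet101_cifar10_train"
# Log location taken from the original report.
LOG_TARGET="/mnt/petrelfs/gaoshiqi/"

# Pass --job-name explicitly (assumption: mim then does not generate one).
SRUN_ARGS="--async -o ${LOG_TARGET} --job-name=${JOB_NAME}"

# Dry run: print the full command for inspection instead of submitting it.
echo "mim train mmcls resnet101_b16x8_cifar10.py --launcher slurm --gpus 1 --gpus-per-node 1 --partition aide_dev --work-dir tmp --srun-args \"${SRUN_ARGS}\""
```

If the printed command looks right, it can be copied and run as is. Whether the generated batch script is then well formed still needs to be confirmed on the cluster.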