Merge pull request #195 from university-of-york/bugfix/quickFixes
add FAQ about jobs not starting
nd996 authored Apr 2, 2024
2 parents f4f10a8 + a59ce77 commit b56af63
Showing 2 changed files with 16 additions and 0 deletions.
14 changes: 14 additions & 0 deletions docs/source/faq/faq.rst
@@ -57,3 +57,17 @@ Or if you're using the `srun <https://slurm.schedmd.com/srun.html>`_ or `salloc
--exclude=node123
-x node123
Why hasn't my job started?
^^^^^^^^^^^^^^^^^^^^^^^^^^

Jobs submitted to the Slurm job scheduler can take some time to start running. There are a number of possible reasons for this, for example how busy the partition you submitted the job to is, or how many resources the job is requesting. It's always a good idea to request only the :ref:`resources <job_resources>` your job requires. You can check the jobs you have in the queue with the following command:

.. code-block:: console

    $ squeue -u $USER

If ``squeue`` shows ``QOSGrpGRES`` as the reason a job is still waiting, a resource has reached its limit for that partition. For example, on the ``gpu_week`` partition a total of only three GPUs can be in use by all users at the same time. Once this limit is reached, any new jobs submitted to that partition are kept waiting in the queue with that reason code.
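
If you have many jobs queued, it can help to see which reason codes are keeping them waiting. The snippet below is a minimal sketch, not part of Slurm or Viking: ``pending_reasons`` is a hypothetical helper name, and it assumes ``awk`` and ``sort`` are available in your shell. It reads one reason code per line from standard input and prints a count for each code, most frequent first.

.. code-block:: bash

    # Hypothetical helper (not provided by Slurm): count reason codes.
    # Input:  one reason code per line on stdin.
    # Output: "<count> <reason>" lines, highest count first.
    pending_reasons() {
        awk 'NF { count[$0]++ }
             END { for (r in count) printf "%d %s\n", count[r], r }' \
            | sort -k1,1nr -k2
    }

One way to feed it is ``squeue -u $USER -h -t PENDING -o "%r" | pending_reasons``, where ``-h`` suppresses the header line, ``-t PENDING`` selects only pending jobs and ``-o "%r"`` prints just the reason column.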

For more information, see the full list of `reason codes <https://slurm.schedmd.com/resource_limits.html#reasons>`_.
2 changes: 2 additions & 0 deletions docs/source/using_viking/submitting_jobs.rst
@@ -73,6 +73,8 @@ If you filled out the ``--mail-user`` option you will get an email when the job
Tips and best practices
-----------------------

.. _job_resources:

Resource requests
^^^^^^^^^^^^^^^^^
