workDir and jobStore now default to (shared) tmp-outdir-prefix (#5154)
* Let workDir and jobStore default to tmp-outdir-prefix

This commit changes the defaults of the following command-line options:
* `tmp-outdir-prefix` defaults to `tmpdir-prefix`, unless given on the command line
* `workDir` defaults to `tmp-outdir-prefix`, unless given on the command line
* `jobStore` defaults to `tmp-outdir-prefix`, unless given on the command line
* `coordinationDir` defaults to the default `tmpdir-prefix`, ignoring any `tmpdir-prefix` given on the command line (rationale: this is a bookkeeping location that must be on a local, fully POSIX-compliant file system, because it uses file locks).
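
The fallback chain described above can be sketched roughly as follows. This is only an illustration, not the code in `cwltoil.py`: the function `resolve_defaults`, its parameter names, and the `DEFAULT_PREFIX` value are hypothetical stand-ins for the parsed options and for `DEFAULT_TMPDIR_PREFIX`.

```python
from typing import Optional

# Hypothetical stand-in for the built-in default prefix
# (DEFAULT_TMPDIR_PREFIX in the diff); the real value may differ.
DEFAULT_PREFIX = "/tmp/example-prefix"


def resolve_defaults(
    tmpdir_prefix: Optional[str] = None,
    tmp_outdir_prefix: Optional[str] = None,
    work_dir: Optional[str] = None,
    job_store: Optional[str] = None,
) -> dict[str, str]:
    """Apply the fallback chain: each unset option inherits from the previous one."""
    tmpdir_prefix = tmpdir_prefix or DEFAULT_PREFIX
    tmp_outdir_prefix = tmp_outdir_prefix or tmpdir_prefix
    work_dir = work_dir or tmp_outdir_prefix
    # Assumption for illustration: when no job store is given, we only record
    # the prefix under which a fresh job store directory would be created.
    job_store_location = job_store or tmp_outdir_prefix
    return {
        "tmpdir_prefix": tmpdir_prefix,
        "tmp_outdir_prefix": tmp_outdir_prefix,
        "work_dir": work_dir,
        "job_store_location": job_store_location,
        # The coordination directory deliberately does not follow
        # tmpdir_prefix, so it is not part of this chain.
    }
```

With only `tmp_outdir_prefix="/scratch/out"` set, both `work_dir` and `job_store_location` resolve to `/scratch/out`, while `tmpdir_prefix` stays at the built-in default.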

* Update documentation

Updated the CLI documentation for the `--workDir` option.

* Jobstore needs its own directory

The job store cannot be put inside the working directory, because it may need to be retained (e.g. when `--stats` is set).
It now gets its own (temporary) directory, unless one is specified with the `--jobStore` option.
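
A minimal sketch of the "reserve a unique path, then remove it" pattern this refers to, using only the standard library. The helper name `reserve_job_store_path` is hypothetical, and splitting the prefix into a parent directory and a name prefix is an assumption about how a prefix-based temporary directory helper would typically behave.

```python
import os
import tempfile


def reserve_job_store_path(prefix: str) -> str:
    """Return a unique, not-yet-existing path that starts with ``prefix``.

    The directory is created to claim a unique name and then removed again,
    so that whatever initialises the job store can create it itself.
    """
    parent, name_prefix = os.path.split(prefix)
    # An empty parent falls back to the system default temporary directory.
    path = tempfile.mkdtemp(prefix=name_prefix, dir=parent or None)
    os.rmdir(path)
    return path
```

Note that the name is only reserved momentarily: between `os.rmdir` and the later creation of the job store, another process could in principle claim the same path.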

* Do not create working directory on head node

We do not need to create a working directory on the head node. We only need to create our jobstore here.

* No need to set options.tmp_outdir_prefix

There's no need to set `options.tmp_outdir_prefix` here. It is not done in the current `master` branch either.

* Improve documentation of workDir option

Improved the documentation of the `--workDir` option by explaining that, for CWL workflows, `--tmp-outdir-prefix` is used instead.

* Update comment

---------

Co-authored-by: stxue1 <[email protected]>
3 people authored Dec 2, 2024
1 parent 701f721 commit 89d8e53
Showing 2 changed files with 9 additions and 27 deletions.
4 changes: 3 additions & 1 deletion docs/running/cliOptions.rst
@@ -90,7 +90,9 @@ about the performance of jobs.
 directory ``toil-<workflowID>`` within workDir. The
 workflowID is generated by Toil and will be reported in
 the workflow logs. Default is determined by the
-variables (TMPDIR, TEMP, TMP) via mkdtemp. This
+variables (TMPDIR, TEMP, TMP) via mkdtemp. For CWL,
+the temporary output directory is used instead
+(see CWL option ``--tmp-outdir-prefix``). This
 directory needs to exist on all machines running jobs;
 if capturing standard output and error from batch
 system jobs is desired, it will generally need to be on
32 changes: 6 additions & 26 deletions src/toil/cwl/cwltoil.py
@@ -3918,42 +3918,25 @@ def main(args: Optional[list[str]] = None, stdout: TextIO = sys.stdout) -> int:
     tmpdir_prefix = options.tmpdir_prefix = (
         options.tmpdir_prefix or DEFAULT_TMPDIR_PREFIX
     )
-
-    # We need a workdir for the CWL runtime contexts.
-    if tmpdir_prefix != DEFAULT_TMPDIR_PREFIX:
-        # if tmpdir_prefix is not the default value, move
-        # workdir and the default job store under it
-        workdir = cwltool.utils.create_tmp_dir(tmpdir_prefix)
-    else:
-        # Use a directory in the default tmpdir
-        workdir = mkdtemp()
-    # Make sure workdir doesn't exist so it can be a job store
-    os.rmdir(workdir)
+    tmp_outdir_prefix = options.tmp_outdir_prefix or tmpdir_prefix
+    workdir = options.workDir or tmp_outdir_prefix

     if options.jobStore is None:
+        jobstore = cwltool.utils.create_tmp_dir(tmp_outdir_prefix)
+        # Make sure directory doesn't exist so it can be a job store
+        os.rmdir(jobstore)
         # Pick a default job store specifier appropriate to our choice of batch
         # system and provisioner and installed modules, given this available
         # local directory name. Fail if no good default can be used.
         options.jobStore = generate_default_job_store(
-            options.batchSystem, options.provisioner, workdir
+            options.batchSystem, options.provisioner, jobstore
         )

     options.doc_cache = True
     options.disable_js_validation = False
     options.do_validate = True
     options.pack = False
     options.print_subgraph = False
-    if tmpdir_prefix != DEFAULT_TMPDIR_PREFIX and options.workDir is None:
-        # We need to override workDir because by default Toil will pick
-        # somewhere under the system temp directory if unset, ignoring
-        # --tmpdir-prefix.
-        #
-        # If set, workDir needs to exist, so we directly use the prefix
-        options.workDir = cwltool.utils.create_tmp_dir(tmpdir_prefix)
-    if tmpdir_prefix != DEFAULT_TMPDIR_PREFIX and options.coordination_dir is None:
-        # override coordination_dir as default Toil will pick somewhere else
-        # ignoring --tmpdir_prefix
-        options.coordination_dir = cwltool.utils.create_tmp_dir(tmpdir_prefix)

     if options.batchSystem == "kubernetes":
         # Containers under Kubernetes can only run in Singularity
@@ -3971,9 +3954,6 @@ def main(args: Optional[list[str]] = None, stdout: TextIO = sys.stdout) -> int:
     logger.debug(f"Final job store {options.jobStore} and workDir {options.workDir}")

     outdir = os.path.abspath(options.outdir or os.getcwd())
-    tmp_outdir_prefix = os.path.abspath(
-        options.tmp_outdir_prefix or DEFAULT_TMPDIR_PREFIX
-    )
     conf_file = getattr(options, "beta_dependency_resolvers_configuration", None)
     use_conda_dependencies = getattr(options, "beta_conda_dependencies", None)
     job_script_provider = None
