Slurm processes_per_node #103

Merged Oct 3, 2024 (2 commits)
13 changes: 7 additions & 6 deletions docs/configuration.md
@@ -13,12 +13,13 @@ The following parameters are found in `r/run_stilt.r` and are used to configure

### Parallel simulation settings

-| Arg | Description |
-| --------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `n_nodes` | If using SLURM for job submission, number of nodes to utilize |
-| `n_cores` | Number of cores per node to parallelize simulations by receptor locations and times |
-| `slurm` | Logical indicating the use of rSLURM to submit job(s). When using SLURM, a `<stilt_wd>/_rslurm` directory is created to contain the SLURM submission scripts and node-specific log files. |
-| `slurm_options` | Named list of options passed to `sbatch` using `rslurm::slurm_apply()`. This typically includes `time`, `account`, and `partition` values |
+| Arg | Description |
+| -------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `n_nodes` | If using SLURM for job submission, number of nodes to utilize |
+| `n_cores` | Number of cores per node to parallelize simulations by receptor locations and times |
+| `processes_per_node` | Number of processes to run on each node. Can be set higher than n_cores for nodes which support [hyperthreading](https://scicomp.ethz.ch/wiki/Using_hyperthreading) |
+| `slurm` | Logical indicating the use of rSLURM to submit job(s). When using SLURM, a `<stilt_wd>/_rslurm` directory is created to contain the SLURM submission scripts and node-specific log files. |
+| `slurm_options` | Named list of options passed to `sbatch` using `rslurm::slurm_apply()`. This typically includes `time`, `account`, and `partition` values |
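
As a rough illustration of the parameters in this table, the settings block in `r/run_stilt.r` for a cluster with hyperthreaded nodes might look like the sketch below. The node count, core count, threads-per-core factor, account, and partition are all assumptions chosen for the example, not values from the project.

```r
# Parallel simulation settings (illustrative values only)
n_nodes <- 2                        # assumed: two SLURM nodes
n_cores <- 16                       # assumed: 16 physical cores per node
processes_per_node <- 2 * n_cores   # assumed: 2 hardware threads per core
slurm <- n_nodes > 1                # same rule as in r/run_stilt.r
slurm_options <- list(
  time      = '300:00:00',
  account   = 'my-account',         # hypothetical account name
  partition = 'my-partition'        # hypothetical partition name
)
```

Leaving `processes_per_node` equal to `n_cores` keeps one process per requested core, which is the default this change introduces.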

### Receptor placement

2 changes: 1 addition & 1 deletion docs/execution.md
@@ -48,7 +48,7 @@ Rscript r/run_stilt.r

![Parallel simulations with SLURM](static/img/chart-parallel.png)

-If `slurm = TRUE` STILT will distribute the simulations across `n_nodes` using `n_cores` on each node (total parallel worker count is `n_nodes * n_cores`). This will create a `<stilt_wd>/_rslurm` directory which contains SLURM submission scripts and logs from each node.
+If `slurm = TRUE` STILT will distribute the simulations across `n_nodes` using `n_cores` on each node (total parallel worker count is `n_nodes * n_cores`). This will create a `<stilt_wd>/_rslurm` directory which contains SLURM submission scripts and logs from each node. For nodes which support [hyperthreading](https://scicomp.ethz.ch/wiki/Using_hyperthreading), the job allocation per node can be increased beyond the number of cores per node via `processes_per_node`.
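
The worker counts described in this paragraph reduce to simple arithmetic. The sketch below uses a hypothetical helper, `worker_counts()`, which is not part of STILT, and assumes each node runs exactly `processes_per_node` concurrent workers, as the parameter description suggests.

```r
# Hypothetical helper: total cores requested vs. total concurrent processes.
worker_counts <- function(n_nodes, n_cores, processes_per_node) {
  c(cores = n_nodes * n_cores,
    processes = n_nodes * processes_per_node)
}

# One process per core: 4 nodes x 8 cores
worker_counts(n_nodes = 4, n_cores = 8, processes_per_node = 8)
# cores = 32, processes = 32

# Hyperthreaded nodes running two processes per core
worker_counts(n_nodes = 4, n_cores = 8, processes_per_node = 16)
# cores = 32, processes = 64
```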

```bash
Rscript r/run_stilt.r
```
2 changes: 2 additions & 0 deletions r/run_stilt.r
@@ -12,6 +12,7 @@ lib.loc <- .libPaths()[1]
# Parallel simulation settings
n_cores <- 1
n_nodes <- 1
+processes_per_node <- n_cores
slurm <- n_nodes > 1
slurm_options <- list(
time = '300:00:00',
@@ -188,6 +189,7 @@ stilt_apply(FUN = simulation_step,
slurm_options = slurm_options,
n_cores = n_cores,
n_nodes = n_nodes,
+processes_per_node = processes_per_node,
before_footprint = list(before_footprint),
before_trajec = list(before_trajec),
lib.loc = lib.loc,
6 changes: 5 additions & 1 deletion r/src/stilt_apply.r
@@ -11,6 +11,8 @@
#' passed to rslurm::slurm_apply()
#' @param n_nodes number of nodes to submit SLURM jobs to using \code{sbatch}
#' @param n_cores number of CPUs to utilize per node
+#' @param processes_per_node number of processes to run per node. Can be set
+#' higher than n_cores for nodes which support hyperthreading
#' @param ... arguments to FUN
#'
#' @return if using slurm, returns sjob information. Otherwise, will return a
@@ -19,7 +21,8 @@
#' @export

stilt_apply <- function(FUN, slurm = F, slurm_options = list(),
-n_nodes = 1, n_cores = 1, ...) {
+n_nodes = 1, n_cores = 1, processes_per_node = n_cores,
+...) {

if (!slurm && n_nodes > 1) {
stop('n_nodes > 1 but slurm is disabled. ',
@@ -53,6 +56,7 @@ stilt_apply <- function(FUN, slurm = F, slurm_options = list(),
jobname = basename(getwd()), pkgs = 'base',
nodes = n_nodes,
cpus_per_node = n_cores,
+processes_per_node = processes_per_node,
preschedule_cores = F,
slurm_options = slurm_options)
return(invisible(sjob))
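
Because the new argument defaults to `n_cores`, callers that omit it should keep one process per requested core. A minimal sketch of that default follows, using a hypothetical stand-in function (`apply_sketch()`) rather than the real `stilt_apply()`. R evaluates default arguments lazily, so the default picks up whatever `n_cores` the caller supplied.

```r
# Hypothetical stand-in for stilt_apply(); only the argument handling is shown.
apply_sketch <- function(n_nodes = 1, n_cores = 1,
                         processes_per_node = n_cores) {
  # Collect the values that would be forwarded to the SLURM submission call.
  list(nodes = n_nodes,
       cpus_per_node = n_cores,
       processes_per_node = processes_per_node)
}

# Argument omitted: processes_per_node follows n_cores.
unchanged <- apply_sketch(n_nodes = 2, n_cores = 8)

# Argument supplied: more processes than cores on each node.
oversubscribed <- apply_sketch(n_nodes = 2, n_cores = 8,
                               processes_per_node = 16)
```

Here `unchanged$processes_per_node` equals the supplied `n_cores`, while `oversubscribed$processes_per_node` is the explicit override.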