Commit

Merge branch 'main' into v2.19.0
rjmello authored Apr 25, 2024
2 parents d178102 + f43e024 commit d779b69
Showing 13 changed files with 47 additions and 231 deletions.
Binary file removed docs/_static/images/31174D02-Cooley800.jpg
Binary file not shown.
Binary file removed docs/_static/images/ALCF-Theta_111016-1000px.jpg
Binary file not shown.
Binary file removed docs/_static/images/stampede2.jpg
Binary file not shown.
34 changes: 0 additions & 34 deletions docs/configs/cooley.yaml

This file was deleted.

11 changes: 8 additions & 3 deletions docs/configs/expanse.yaml
@@ -1,6 +1,7 @@
display_name: SDSC Expanse
display_name: Expanse@SDSC

engine:
type: HighThroughputEngine
type: GlobusComputeEngine
max_workers_per_node: 2
worker_debug: False

@@ -25,7 +26,11 @@ engine:
# e.g., "module load anaconda3; source activate gce_env"
worker_init: {{ COMMAND }}

init_blocks: 1
# Command to be run before starting a worker
# e.g., "module load anaconda3; source activate gce_env"
worker_init: "source ~/setup.sh"

init_blocks: 0
min_blocks: 0
max_blocks: 1

4 changes: 3 additions & 1 deletion docs/configs/frontera.yaml
@@ -1,5 +1,7 @@
display_name: Frontera@TACC

engine:
type: HighThroughputEngine
type: GlobusComputeEngine
max_workers_per_node: 2
worker_debug: False

2 changes: 2 additions & 0 deletions docs/configs/midway.yaml
@@ -1,3 +1,5 @@
display_name: [email protected]

engine:
type: HighThroughputEngine
max_workers_per_node: 2
11 changes: 5 additions & 6 deletions docs/configs/perlmutter.yaml
@@ -1,17 +1,15 @@
display_name: Permutter@NERSC
engine:
type: HighThroughputEngine
type: GlobusComputeEngine
worker_debug: False

strategy:
type: SimpleStrategy
max_idletime: 300

address:
type: address_by_interface
ifname: hsn0

provider:
type: SlurmProvider
partition: debug

# We request all hyperthreads on a node.
# GPU nodes have 128 threads, CPU nodes have 256 threads
@@ -21,8 +19,9 @@ engine:

# string to prepend to #SBATCH blocks in the submit
# script to the scheduler
# For GPUs in the debug qos eg: "#SBATCH --constraint=gpu -q debug"
# For GPUs in the debug qos eg: "#SBATCH --constraint=gpu"
scheduler_options: {{ OPTIONS }}

# Your NERSC account, eg: "m0000"
account: {{ NERSC_ACCOUNT }}

60 changes: 29 additions & 31 deletions docs/configs/polaris.yaml
@@ -1,40 +1,38 @@
engine:
type: HighThroughputEngine
max_workers_per_node: 1
display_name: Polaris@ALCF

# Un-comment to give each worker exclusive access to a single GPU
# available_accelerators: 4
engine:
type: GlobusComputeEngine
max_workers_per_node: 4

strategy:
type: SimpleStrategy
max_idletime: 300
# Un-comment to give each worker exclusive access to a single GPU
# available_accelerators: 4

address:
type: address_by_interface
ifname: bond0
address:
type: address_by_interface
ifname: bond0

provider:
type: PBSProProvider
provider:
type: PBSProProvider

launcher:
type: MpiExecLauncher
# Ensures 1 manger per node, work on all 64 cores
bind_cmd: --cpu-bind
overrides: --depth=64 --ppn 1
launcher:
type: MpiExecLauncher
# Ensures 1 manger per node, work on all 64 cores
bind_cmd: --cpu-bind
overrides: --depth=64 --ppn 1

account: {{ YOUR_POLARIS_ACCOUNT }}
queue: preemptable
cpus_per_node: 32
select_options: ngpus=4
account: {{ YOUR_POLARIS_ACCOUNT }}
queue: debug-scaling
cpus_per_node: 32
select_options: ngpus=4

# e.g., "#PBS -l filesystems=home:grand:eagle\n#PBS -k doe"
scheduler_options: "#PBS -l filesystems=home:grand:eagle"
# e.g., "#PBS -l filesystems=home:grand:eagle\n#PBS -k doe"
scheduler_options: "#PBS -l filesystems=home:grand:eagle"

# Node setup: activate necessary conda environment and such
worker_init: {{ COMMAND }}
# Node setup: activate necessary conda environment and such
worker_init: {{ COMMAND }}

walltime: 01:00:00
nodes_per_block: 1
init_blocks: 0
min_blocks: 0
max_blocks: 2
walltime: 01:00:00
nodes_per_block: 1
init_blocks: 0
min_blocks: 0
max_blocks: 2
35 changes: 0 additions & 35 deletions docs/configs/stampede2.yaml

This file was deleted.

37 changes: 0 additions & 37 deletions docs/configs/theta.yaml

This file was deleted.

42 changes: 0 additions & 42 deletions docs/configs/theta_singularity.yaml

This file was deleted.

42 changes: 0 additions & 42 deletions docs/configuring.rst
@@ -108,37 +108,6 @@ The KubernetesProvider exploits the Python Kubernetes API, which assumes that yo
:language: yaml


Theta (ALCF)
^^^^^^^^^^^^

.. image:: _static/images/ALCF-Theta_111016-1000px.jpg

The following snippet shows an example configuration for executing on Argonne Leadership Computing Facility's
**Theta** supercomputer. This example uses the ``HighThroughputEngine`` and connects to Theta's Cobalt scheduler
using the ``CobaltProvider``. This configuration assumes that the script is being executed on the login nodes of Theta.

.. literalinclude:: configs/theta.yaml
:language: yaml

The following configuration is an example to use singularity container on Theta.

.. literalinclude:: configs/theta_singularity.yaml
:language: yaml


Cooley (ALCF)
^^^^^^^^^^^^^

.. image:: _static/images/31174D02-Cooley800.jpg

The following snippet shows an example configuration for executing on Argonne Leadership Computing Facility's
**Cooley** cluster. This example uses the ``HighThroughputEngine`` and connects to Cooley's Cobalt scheduler
using the ``CobaltProvider``. This configuration assumes that the script is being executed on the login nodes of Cooley.

.. literalinclude:: configs/cooley.yaml
:language: yaml


Polaris (ALCF)
^^^^^^^^^^^^^^

@@ -199,17 +168,6 @@ running on a login node, uses the ``SlurmProvider`` to interface with the schedu
.. literalinclude:: configs/bridges-2.yaml
:language: yaml

Stampede2 (TACC)
^^^^^^^^^^^^^^^^

.. image:: _static/images/stampede2.jpg

The following snippet shows an example configuration for accessing the Stampede2 system at the Texas Advanced Computing Center (TACC).
The configuration below assumes that the user is running on a login node, uses the ``SlurmProvider`` to interface with the scheduler,
and uses the ``SrunLauncher`` to launch workers.

.. literalinclude:: configs/stampede2.yaml
:language: yaml

FASTER (TAMU)
^^^^^^^^^^^^^
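Review note (a sketch, not part of the commit): several configs in this diff change the engine type from HighThroughputEngine to GlobusComputeEngine, while the midway.yaml hunk still shows HighThroughputEngine. A small script along the following lines could list configs that still name the old engine. The function names and the abbreviated `SAMPLE` texts are hypothetical and exist only for illustration. The regex scan is a deliberate shortcut that avoids a YAML parser; it assumes each engine type sits alone on a `type:` line, as it does in the files above.

```python
import re

# Matches a line such as "    type: HighThroughputEngine".
# Only names ending in "Engine" are captured, so provider, launcher,
# strategy, and address "type:" lines are ignored.
ENGINE_TYPE = re.compile(r"^\s*type:\s*(\w+Engine)\s*$", re.MULTILINE)


def engine_types(yaml_text):
    """Return every engine type named in a config text, in order of appearance."""
    return ENGINE_TYPE.findall(yaml_text)


def configs_on_old_engine(configs, old="HighThroughputEngine"):
    """Return the sorted names of configs whose engine type is still `old`."""
    return sorted(name for name, text in configs.items() if old in engine_types(text))


# Hypothetical, abbreviated stand-ins for files under docs/configs/.
SAMPLE = {
    "expanse.yaml": "engine:\n    type: GlobusComputeEngine\n    max_workers_per_node: 2\n",
    "midway.yaml": "engine:\n    type: HighThroughputEngine\n    max_workers_per_node: 2\n",
}

if __name__ == "__main__":
    print(configs_on_old_engine(SAMPLE))
```

To run it against a real checkout, `SAMPLE` could be replaced with a dictionary built by reading each `*.yaml` file in the configs directory.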
