alefisico committed Nov 14, 2024
1 parent ec10cc1 commit 37028e2
Showing 1 changed file with 87 additions and 3 deletions.
90 changes: 87 additions & 3 deletions episodes/REANA.md
@@ -26,7 +26,7 @@ While REANA is primarily a tool for reproducible analysis, it effectively functi
REANA offers flexibility in workflow management by supporting multiple systems like [CWL](https://www.commonwl.org/), [Serial](https://docs.reana.io/running-workflows/supported-systems/serial/), [Yadage](https://yadage.readthedocs.io/en/latest/), and Snakemake. While there's a growing adoption of Snakemake within the LHC community due to its large external user base and strong support, REANA remains agnostic to the chosen workflow system. However, due to its popularity and powerful features, we will primarily focus on Snakemake throughout this tutorial.


### A REANA workflow
### Defining a workflow for REANA

While a Snakemake workflow is defined in a Snakefile, REANA additionally needs a REANA specification file (`reana.yaml`) that lists all the parameters the Snakemake workflow will need.

@@ -66,9 +66,93 @@ The `reana.yaml` file acts as a blueprint for your REANA workflow, defining esse
- `type`: we are using `snakemake`, but REANA supports CWL, Serial or Yadage. ([More here](https://docs.reana.io/running-workflows/supported-systems/)).
- `file`: This defines the location of your workflow script.
- `resources`: Define any **global** resources required for your workflow execution (see the [REANA advanced usage documentation](https://docs.reana.io/advanced-usage/) for details). Remember that you can also define dedicated rule resources in the Snakefile.
- `workspace`: some useful options can be included here, like how many days a user wants a specific folder to be retain (`retention_days`)
- `outputs`: this is mandatory, and tells REANA which files or folders can be directly download once the workflow runs. It can be files or directories.
- `workspace` (optional): Here, you can set options like `retention_days` to specify how long specific folders should be retained after workflow completion.
- `outputs`: This section informs REANA which files or folders should be made available for download after successful workflow execution. These can be individual files or entire directories.
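
To make the fields above concrete, here is a minimal sketch of a REANA specification written from the shell. It is only an illustration: the file names (`Snakefile`, `inputs.yaml`, `results`), the CVMFS repository, and the retention rule are placeholders assumed for this example, not the values used in this lesson. It is written to `reana_example.yaml` so that an existing `reana.yaml` is not overwritten.

```BASH
# Write an illustrative REANA specification to a separate file.
# Every file name and value below is a placeholder, not the lesson's setup.
cat > reana_example.yaml <<'EOF'
inputs:
  files:
    - Snakefile            # workflow definition uploaded to REANA
    - inputs.yaml          # hypothetical parameter file
  parameters:
    input: inputs.yaml     # parameter file handed to the workflow
workflow:
  type: snakemake          # could also be cwl, serial or yadage
  file: Snakefile          # location of the workflow script
  resources:
    cvmfs:
      - cms.cern.ch        # example of a global resource
workspace:
  retention_days:
    "*.root": 7            # hypothetical retention rule
outputs:
  directories:
    - results              # hypothetical folder offered for download
EOF
```

Comparing this sketch with your own `reana.yaml` section by section is a quick way to see which of the fields listed above you are actually using.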

### Running a workflow in REANA

Let's get familiar with the steps necessary to run our workflow in REANA. First, activate the REANA environment, and then remember to set these variables:

```BASH
export REANA_SERVER_URL=https://reana.cern.ch
export REANA_ACCESS_TOKEN=xxxxxxxxxxxxxxxxxxxxxxx
```
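
Forgetting one of these variables is an easy mistake, so a small guard can help. The function below is a sketch, not part of the REANA client: `check_reana_env` is a hypothetical helper name, and it only checks that the two variables are non-empty, not that the token is valid.

```BASH
# Return 0 only if both variables REANA needs are set and non-empty.
check_reana_env() {
  missing=0
  for name in REANA_SERVER_URL REANA_ACCESS_TOKEN; do
    # eval reads the variable whose name is stored in $name (POSIX-portable).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_reana_env && echo "REANA environment looks ready"` prints the message only when both variables are present. One possible pattern is to keep the two `export` lines in a private file, `source` it, and then call the check.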

These variables must be set every time you start a new session. The REANA client offers a validation step similar to Snakemake's dry-run:

```BASH
reana-client validate -f reana.yaml
```

```OUTPUT
Building DAG of jobs...
[WARNING] Building DAG of jobs...
Job stats:
job count
----------- -------
all 1
datacarding 1
skimming 1
total 3
[WARNING] Job stats:
job count
----------- -------
all 1
datacarding 1
skimming 1
total 3
==> Verifying REANA specification file... /srv/reana.yaml
-> SUCCESS: Valid REANA specification file.
==> Verifying REANA specification parameters...
-> SUCCESS: REANA specification parameters appear valid.
==> Verifying workflow parameters and commands...
-> SUCCESS: Workflow parameters and commands appear valid.
==> Verifying dangerous workflow operations...
-> WARNING: Operation "cd /" found in step "skimming" might be dangerous.
-> WARNING: Operation "cd /" found in step "datacarding" might be dangerous.
```

Validation first checks that the Snakefile defines a workflow that can be run, and then that the inputs declared in the `reana.yaml` file are correct. If everything looks fine, we can create a workflow called `test_SUSY` on the platform:

```BASH
reana-client create -w test_SUSY -f reana.yaml
```

Remember that this step only creates the workflow within REANA; it does not run it yet. You can verify this at [https://reana.cern.ch/](https://reana.cern.ch/) or by running:

```BASH
reana-client status -w test_SUSY
```

The next step is to upload the files the workflow needs:

```BASH
reana-client upload -w test_SUSY
```

Finally, we can start the workflow:

```BASH
reana-client start -w test_SUSY
```

Again, you can check the status of your jobs via the REANA website or with `reana-client status -w test_SUSY`.
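
Checking the status by hand gets tedious for long workflows. The following is a rough sketch of a polling loop built on the `reana-client status` command used above. The function name `wait_for_workflow` and the `REANA_CLIENT` override variable are hypothetical, and the loop assumes that the state words `finished`, `failed`, `stopped` or `deleted` appear verbatim in the client's output and do not occur in the workflow name.

```BASH
# Command used to talk to REANA; overridable so the loop can be tried
# against a stand-in command.
REANA_CLIENT="${REANA_CLIENT:-reana-client}"

# Poll the workflow status until it reaches a final state.
# Usage: wait_for_workflow <workflow-name> [seconds-between-checks]
wait_for_workflow() {
  workflow="$1"
  interval="${2:-60}"
  while true; do
    # Assumption: the state word appears verbatim in the status output.
    report="$("$REANA_CLIENT" status -w "$workflow")" || return 2
    case "$report" in
      *finished*)
        echo "$workflow finished"
        return 0 ;;
      *failed*|*stopped*|*deleted*)
        echo "$workflow did not finish successfully" >&2
        return 1 ;;
    esac
    sleep "$interval"
  done
}
```

For example, `wait_for_workflow test_SUSY 120` would check every two minutes and return a non-zero exit code if the workflow ends in anything other than `finished`.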


::::::::::::: challenge

### There must be a better way!

Absolutely there is!
While there can be specific circumstances in which you want to run these steps separately, a single REANA command creates the workflow, uploads its files, and starts it. You can try:

```BASH
reana-client run -w test_SUSY -f reana.yaml
```
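
If you still want the safety of the validation step, it can be combined with the single command. This is only a sketch: `validate_and_run` and the `REANA_CLIENT` override variable are hypothetical names, and the function assumes that `reana-client validate` exits with a non-zero status when the specification is invalid.

```BASH
# Command used to talk to REANA; overridable for experimentation.
REANA_CLIENT="${REANA_CLIENT:-reana-client}"

# Validate the specification and submit only if validation succeeds.
# Usage: validate_and_run <workflow-name> [specification-file]
validate_and_run() {
  workflow="$1"
  spec="${2:-reana.yaml}"
  if ! "$REANA_CLIENT" validate -f "$spec"; then
    echo "Validation of $spec failed; workflow not submitted." >&2
    return 1
  fi
  "$REANA_CLIENT" run -w "$workflow" -f "$spec"
}
```

With this in place, `validate_and_run test_SUSY` would behave like the command above, but stop early when the specification does not validate.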

::::::::::::::::::::::::::::::::

:::::: keypoints
- keypoint 1