Lab 03: Parsing Nextflow Output
What happened after running the exercise pipeline? We should have seen output in the shell that looks something like the following:
N E X T F L O W ~ version 21.04.3
Launching `./pipeline/main.nf` [fabulous_bernard] - revision: 5dfc860211
executor > local (4)
[3e/138820] process > WRITE_ODD (4) [100%] 4 of 4 ✔
This gives us:
- Nextflow version information
- The pipeline script being launched, an auto-generated name for the run (fabulous_bernard), and a revision identifier
- The executor being used. In this case the pipeline is run directly in the current interactive environment, so it is local.
- The most recent task directory inside the "work" output directory for a given process, shown as a shortened hash (here 3e/138820). Here we see that WRITE_ODD has finished running on all 4 of the elements in the ch_odd queue channel.
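The bracketed value on the process line is a shortened form of the task's work directory name, so it can be used to locate that directory. A minimal sketch in POSIX shell, where task_prefix is a hypothetical helper name and the sample line is copied from the output above:

```shell
# Hypothetical helper: pull the "3e/138820"-style prefix out of a progress line.
task_prefix() {
    rest=${1#"["}            # drop the leading "["
    prefix=${rest%%"]"*}     # keep everything before the first "]"
    printf '%s\n' "$prefix"
}

line='[3e/138820] process > WRITE_ODD (4) [100%] 4 of 4 ✔'
task_prefix "$line"          # prints: 3e/138820

# The prefix can then be expanded to the full directory name with a glob:
#   ls -d work/"$(task_prefix "$line")"*
```

Only the start of the hash is displayed, which is why the glob at the end is needed to reach the full directory name.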
We can check to see what's changed in the output by running:
ls -a
...which should yield
. .. .nextflow .nextflow.log pipeline report.html run_01.sh timeline.html work
The ".nextflow" directory contains cached information and history about previous runs of this pipeline. The log file ".nextflow.log" has detailed information about the run and is one of a few critical files for debugging. Both "report.html" and "timeline.html" contain the summary information we requested in the run command earlier. Finally, the "work" directory contains all of the intermediate files and output from the pipeline. It will be good to understand what is contained in the work directory, but in general it is not always a fun place to explore.
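Since ".nextflow.log" is usually the first stop when debugging, it can help to filter it down to the lines that report problems. A rough sketch, assuming the log marks each entry with a level word such as WARN or ERROR; log_problems is a hypothetical helper name, not part of Nextflow:

```shell
# Hypothetical helper: show numbered WARN/ERROR lines from a Nextflow log.
# Assumes the level names appear as separate words in each log entry.
#   $1 = path to the log (defaults to ./.nextflow.log)
log_problems() {
    grep -nwE 'WARN|ERROR' "${1:-.nextflow.log}"
}

# Only run when a log is present in the current directory.
if [ -f .nextflow.log ]; then
    log_problems .nextflow.log || echo "no WARN or ERROR entries found"
fi
```

For a clean run like this exercise, the filter may well find nothing, which is itself a useful signal.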
As the Nextflow pipeline runs, the work directory is populated with subdirectories whose names are 2 hexadecimal characters (don't worry). Each of those subdirectories contains one or more nested subdirectories named with 30-character hexadecimal strings such as "5d8ebc456a217b5483706f494ff611" (don't worry). Together, the two levels form the unique hash that Nextflow assigns to each task. Within each of these nested subdirectories is the actual output from our previous WRITE_ODD process.
Let's see the files (you can go to exercise 03_parsing_output for matching output):
ls work/*/*
...which yields
work/28/5d8ebc456a217b5483706f494ff611:
1.txt
work/29/d59f5d5bc2f787e50b8442ee7c75c4:
3.txt
work/41/83b3062b0d4389697af9068e0b4fc2:
7.txt
work/f5/bc61f8bb9648daa4b8847cc5b48527:
5.txt
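Because the hashed directory names carry no hint about their contents, it is often easier to search by output file name. A small sketch, assuming the two-level layout shown in the listing above and a find that supports -mindepth and -maxdepth (the GNU and BSD versions do); task_dir_of is a hypothetical helper name:

```shell
# Hypothetical helper: print the task directory that contains a given file.
#   $1 = file name to look for, $2 = work directory (defaults to ./work)
task_dir_of() {
    find "${2:-work}" -mindepth 3 -maxdepth 3 -type f -name "$1" -exec dirname {} \;
}

# Example: which task wrote 7.txt? (skipped when there is no work directory)
if [ -d work ]; then
    task_dir_of 7.txt
fi
```

With the listing above, this would print work/41/83b3062b0d4389697af9068e0b4fc2.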
Based on the naming of your output work directory, pick a nested subdirectory and cd into it. If you are following along in 03_parsing_output, just pick one for now.
cd work/41/83b3062b0d4389697af9068e0b4fc2
ls -a
yielding
. .. 7.txt .command.begin .command.err .command.log .command.out .command.run .command.sh .command.trace .exitcode
In addition to the output ".txt" file we created inside the pipeline, there are also many hidden files Nextflow used to run and log this particular step. Notably, ".command.sh" contains the actual commands defined in our script. Although they are empty in this exercise, the ".command.err", ".command.log", and ".command.out" files may be critical in debugging failures down the road.
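When a task does fail, a quick first check is its exit status and whether any of the log files have content. A minimal sketch, assuming ".exitcode" holds the task's numeric exit status; task_status is a hypothetical helper name:

```shell
# Hypothetical helper: summarize the hidden files in one task directory.
#   $1 = path to a task directory such as work/41/83b3062b0d4389697af9068e0b4fc2
task_status() {
    dir=$1
    if [ -f "$dir/.exitcode" ]; then
        printf 'exit code: %s\n' "$(cat "$dir/.exitcode")"
    else
        printf 'exit code: (no .exitcode file)\n'
    fi
    # Report which of the captured output files actually contain something.
    for f in .command.err .command.log .command.out; do
        if [ -s "$dir/$f" ]; then
            printf '%s: non-empty\n' "$f"
        else
            printf '%s: empty or missing\n' "$f"
        fi
    done
}

# Example: summarize the current directory after the cd shown earlier.
task_status .
```

Any file reported as non-empty can then be read with cat, and ".command.sh" shows exactly what was executed.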