Apps may be able to identify if the input dataset is handled with
+DataLad or Git-Annex, and pull down linked data that has not
+been fetched yet.
+One such application is MRIQC, and all the examples
+on this documentation page refer to it.
+
+
Summary
+
Executing BIDS-Apps leveraging DataLad-controlled datasets
+within containers can be tricky.
+In particular, one of our general recommendations involves mounting
+or binding folders into the container in read-only mode, which
+will disallow DataLad from writing to the dataset tree.
+Similarly, and depending on the specific runtime settings of the
+container framework, DataLad may encounter issues with file ownership too.
+This section guides users through ensuring smooth execution of
+BIDS-Apps on DataLad/Git-annex-managed datasets.
When executing MRIQC within Docker on a DataLad dataset
+(for instance, installed from OpenNeuro),
+we will need to ensure the following settings are observed:
+
+
the user id (uid) who installed the DataLad dataset must match
+ the uid who is executing MRIQC within the container runtime
+
the uid who is executing MRIQC within the container must
+ have sufficient permissions to write in the tree.
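
Both conditions can be checked on the host before launching the container. The following is a minimal sketch, assuming a POSIX shell and that `ls -n` prints the numeric owner id in the third column; `can_run_on_dataset` is a hypothetical helper name, not part of MRIQC or DataLad.

```shell
# Hypothetical pre-flight check: succeeds only if the dataset directory
# is owned by the current uid AND is writable by the current user.
can_run_on_dataset() {
    dataset=$1
    # Numeric uid of the directory owner (empty if the path does not exist).
    owner_uid=$(ls -nd "$dataset" 2>/dev/null | awk '{print $3}')
    [ "$owner_uid" = "$(id -u)" ] && [ -w "$dataset" ]
}

# Example usage (the path is a placeholder):
#   can_run_on_dataset "$HOME/ds002785" || echo "check ownership and permissions"
```

A non-zero exit status only says that one of the two conditions failed; it does not say which one.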
If the uid is not correct, we will likely encounter the following error:
+
datalad.runner.exception.CommandError: CommandError: 'git -c diff.ignoreSubmodules=none -c core.quotepath=false -c annex.merge-annex-branches=false annex find --not --in . --json --json-error-messages -c annex.dotfiles=true -- sub-0001/func/sub-0001_task-restingstate_acq-mb3_bold.nii.gz sub-0002/func/sub-0002_task-emomatching_acq-seq_bold.nii.gz sub-0002/func/sub-0002_task-restingstate_acq-mb3_bold.nii.gz sub-0001/func/sub-0001_task-emomatching_acq-seq_bold.nii.gz sub-0001/func/sub-0001_task-faces_acq-mb3_bold.nii.gz sub-0001/dwi/sub-0001_dwi.nii.gz sub-0002/func/sub-0002_task-workingmemory_acq-seq_bold.nii.gz sub-0001/anat/sub-0001_T1w.nii.gz sub-0002/anat/sub-0002_T1w.nii.gz sub-0001/func/sub-0001_task-gstroop_acq-seq_bold.nii.gz sub-0002/func/sub-0002_task-faces_acq-mb3_bold.nii.gz sub-0002/func/sub-0002_task-anticipation_acq-seq_bold.nii.gz sub-0002/dwi/sub-0002_dwi.nii.gz sub-0001/func/sub-0001_task-anticipation_acq-seq_bold.nii.gz sub-0001/func/sub-0001_task-workingmemory_acq-seq_bold.nii.gz sub-0002/func/sub-0002_task-gstroop_acq-seq_bold.nii.gz' failed with exitcode 1 under /data [info keys: stdout_json] [err: 'git-annex: Git refuses to operate in this repository, probably because it is owned by someone else.
+
+To add an exception for this directory, call:
+git config --global --add safe.directory /data
+
+git-annex: automatic initialization failed due to above problems']
+
+
Confusingly, following the suggestion from DataLad directly on the host
+(git config --global --add safe.directory /data) will not work in this
+case, because this line must be executed within the container.
+
Instead, we can override the default user executing within the container
+(which is root, or uid = 0).
+This can be achieved with
+Docker's -u/--user option:
We can combine this option with Bash's id command to ensure the current user's uid and group id (gid) are being set.
+Let's update the last example in the previous
+Docker execution section:
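
As a sketch of that update, assuming the dataset lives at `$HOME/ds002785` as in the example further down this page, the invocation could look as follows. The helper name `current_user_spec` is hypothetical, and the command is only printed (note the leading `echo`) so that it can be inspected before running it.

```shell
# Build the uid:gid pair of the current user for Docker's -u/--user option.
current_user_spec() {
    printf '%s:%s' "$(id -u)" "$(id -g)"
}

# Dry run: print the command instead of executing it.
# Paths and the image tag are placeholders, not prescribed values.
echo docker run -ti --rm \
    -v "$HOME/ds002785:/data:ro" \
    -v "$HOME/ds002785/derivatives:/out" \
    -v "$HOME/tmp/ds002785-workdir:/work" \
    -u "$(current_user_spec)" \
    "nipreps/mriqc:<latest-version>" \
    /data "/out/mriqc-<latest-version>" \
    participant \
    -w /work
```

Removing the leading `echo` executes the command.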
The above command line ensures that MRIQC is executed with the current
+uid and gid, which will match the filesystem's permissions if the dataset
+was installed by the same user.
+
+
Match uid and gid with those corresponding to the user who installed the dataset
+
When different users are to install the dataset and
+execute the application, Docker must be executed with the
+uid and gid corresponding to the user who installed the dataset.
+The uid corresponding to a given username (for instance janedoe)
+can be obtained as follows:
+
getent passwd "janedoe" | cut -f 3 -d ":"
+
+
and her gid:
+
getent passwd "janedoe" | cut -f 4 -d ":"
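
The two lookups can be combined into the single `uid:gid` value that Docker's `-u` option expects. This is a minimal sketch that uses the POSIX `id` utility instead of `getent`, on the assumption that the username is known to the host; `user_spec_for` is a hypothetical helper name.

```shell
# Hypothetical helper: print "uid:gid" (primary group) for a given username.
user_spec_for() {
    printf '%s:%s' "$(id -u "$1")" "$(id -g "$1")"
}

# Example usage (janedoe is the placeholder username used above):
#   docker run ... -u "$(user_spec_for janedoe)" ...
```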
+
+
+
Mounting the dataset folder without read-only permissions
+
If the dataset is protected with read-only permissions, then MRIQC
+will hit the following error
+(see nipreps/mriqc#1363):
+
get(error): sub-0001/func/sub-0001_task-restingstate_acq-mb3_bold.nii.gz (file) [git-annex: .git/annex/tmp: createDirectory: permission denied (Read-only file system)]
+action summary:
+ get (error: 1)
+Traceback (most recent call last):
+ File "/opt/conda/bin/mriqc", line 8, in <module>
+ sys.exit(main())
+ ^^^^^^
+ File "/opt/conda/lib/python3.11/site-packages/mriqc/cli/run.py", line 43, in main
+ parse_args(argv)
+ File "/opt/conda/lib/python3.11/site-packages/mriqc/cli/parser.py", line 658, in parse_args
+ initialize_meta_and_data()
+ File "/opt/conda/lib/python3.11/site-packages/mriqc/utils/misc.py", line 447, in initialize_meta_and_data
+ _datalad_get(dataset)
+ File "/opt/conda/lib/python3.11/site-packages/mriqc/utils/misc.py", line 282, in _datalad_get
+ return get(
+ ^^^^
+ File "/opt/conda/lib/python3.11/site-packages/datalad/interface/base.py", line 773, in eval_func
+ return return_func(*args, **kwargs)
+ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ File "/opt/conda/lib/python3.11/site-packages/datalad/interface/base.py", line 763, in return_func
+ results = list(results)
+ ^^^^^^^^^^^^^
+ File "/opt/conda/lib/python3.11/site-packages/datalad_next/patches/interface_utils.py", line 287, in _execute_command_
+ raise IncompleteResultsError(
+datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 1 failed:
+[{'action': 'get',
+ 'annexkey': 'MD5E-s76037251--344f061a3165c71e36b98ad1649c3c8c.nii.gz',
+ 'error_message': 'git-annex: .git/annex/tmp: createDirectory: permission '
+ 'denied (Read-only file system)',
+ 'path': '/data/sub-0001/func/sub-0001_task-restingstate_acq-mb3_bold.nii.gz',
+ 'refds': '/data',
+ 'status': 'error',
+ 'type': 'file'}]
+
+
This error indicates that the container is executed with
+the appropriate uid and gid pair.
+In this case, we will need to ensure DataLad can write
+to the dataset installation when obtaining new data.
+This is easily achieved by removing the read-only parameters of the
+mount option:
+
$ docker run -ti --rm \
+    -v $HOME/ds002785:/data \               # mount data WITHOUT :ro
+    -v $HOME/ds002785/derivatives:/out \
+    -v $HOME/tmp/ds002785-workdir:/work \
+    -u $(id -u):$(id -g) \                  # set execution uid:gid
+    nipreps/mriqc:<latest-version> \
+    \
+    /data /out/mriqc-<latest-version> \
+    participant \
+    -w /work
+
In the case of Singularity and Apptainer, controlling the uid that
+executes the container involves user namespace mappings.
+Therefore, you will need to contact your system administrator to work
+out a convenient solution to the problem.
+
Since most Singularity/Apptainer deployments automatically bind
+the user's $HOME directory, the fix suggested in the error message
+(adding the dataset path to Git's safe.directory list) may
+work:
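
A minimal sketch of that approach, assuming the dataset is bound at `/data` inside the container (the path reported in the error message above) and that the host's `$HOME`, and with it `~/.gitconfig`, is visible inside the container. `allow_dataset_path` is a hypothetical helper name; the `git config` call is the one suggested by the error message.

```shell
# Hypothetical helper: register a path as safe for Git (and hence git-annex).
# Run it on the host; the path must be the one seen INSIDE the container.
allow_dataset_path() {
    git config --global --add safe.directory "$1"
}

# Example usage (assumes the dataset is bound at /data):
#   allow_dataset_path /data
```

Whether this setting takes effect depends on the container actually reading the same global Git configuration, which is worth confirming with `git config --global --get-all safe.directory` from a shell inside the container.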
Allowing the container to write to the dataset's tree is straightforward
+and analogous to Docker: remove the :ro setting in the binding
+option (-B).
+
\ No newline at end of file
diff --git a/apps/docker/index.html b/apps/docker/index.html
index ed94410..a8c4bfe 100644
--- a/apps/docker/index.html
+++ b/apps/docker/index.html
@@ -691,6 +691,27 @@
NiPreps augment the scanner to produce data directly consumable by analyses.
We refer to data directly consumable by analyses as analysis-grade data by analogy with the concept of \"sushi-grade (or sashimi-grade) fish\" in that both are products that have been:
minimally preprocessed, but are
safe to consume directly.
"},{"location":"#building-on-the-success-story-of-fmriprep","title":"Building on the success story of fMRIPrep","text":"
NiPreps were conceived as a generalization of fMRIPrep across new modalities, populations, cohorts, and species. fMRIPrep is widely adopted, as our telemetry with Sentry (and now, in-house with migas) shows:
fMRIPrep is executed an average of 9,500 times every week, of which around 7,000 finish successfully (72.9% success rate). The number of executions started includes debug and dry runs where researchers do not intend to actually process data. Therefore, the effective (that is, discarding test runs) success ratio of fMRIPrep is likely higher."},{"location":"apps/docker/","title":"Executing with Docker","text":"
Summary
Here, we describe how to run NiPreps with Docker containers. To illustrate the process, we will show the execution of fMRIPrep, but these guidelines extend to any other end-user NiPrep.
"},{"location":"apps/docker/#before-you-start-install-docker","title":"Before you start: install Docker","text":"
Probably, the most popular framework to execute containers is Docker. If you are to run a NiPrep on your PC/laptop, this is the RECOMMENDED way of execution. Please make sure you follow the Docker installation instructions. You can check your Docker Runtime installation running their hello-world image:
$ docker run --rm hello-world\n
If you have a functional installation, then you should obtain the following output:
Hello from Docker!\nThis message shows that your installation appears to be working correctly.\n\nTo generate this message, Docker took the following steps:\n 1. The Docker client contacted the Docker daemon.\n 2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n (amd64)\n 3. The Docker daemon created a new container from that image which runs the\n executable that produces the output you are currently reading.\n 4. The Docker daemon streamed that output to the Docker client, which sent it\n to your terminal.\n\nTo try something more ambitious, you can run an Ubuntu container with:\n $ docker run -it ubuntu bash\n\nShare images, automate workflows, and more with a free Docker ID:\n https://hub.docker.com/\n\nFor more examples and ideas, visit:\n https://docs.docker.com/get-started/\n
After checking your Docker Engine is capable of running Docker images, you are ready to pull your first NiPreps container image.
For every new version of the particular NiPrep app that is released, a corresponding Docker image is generated. The Docker image becomes a container when the execution engine loads the image and adds an extra layer that makes it runnable. In order to run NiPreps Docker images, the Docker Runtime must be installed.
Taking fMRIPrep to illustrate the usage, first you might want to make sure of the exact version of the tool to be used:
$ docker pull nipreps/fmriprep:<latest-version>\n
You can run NiPreps interacting directly with the Docker Engine via the docker run interface.
"},{"location":"apps/docker/#running-a-niprep-with-a-lightweight-wrapper","title":"Running a NiPrep with a lightweight wrapper","text":"
Some NiPreps include a lightweight wrapper script for convenience. That is the case of fMRIPrep and its fmriprep-docker wrapper. Before starting, make sure you have the wrapper installed. When you run fmriprep-docker, it will generate a Docker command line for you, print it out for reporting purposes, and then execute it without further action needed, e.g.:
fmriprep-docker implements the unified command-line interface of BIDS Apps, and automatically translates directories into Docker mount points for you.
We have published a step-by-step tutorial illustrating how to run fmriprep-docker. This tutorial also provides valuable troubleshooting insights and advice on what to do after fMRIPrep has run.
"},{"location":"apps/docker/#running-a-niprep-directly-interacting-with-the-docker-engine","title":"Running a NiPrep directly interacting with the Docker Engine","text":"
If you need finer control over the container execution, or you feel comfortable with the Docker Engine, avoiding the extra software layer of the wrapper might be a good decision.
Accessing filesystems in the host within the container: Containers are confined in a sandbox, so they can't access the host in any way unless you explicitly grant access to it. The Docker Engine supports mounting filesystems into the container with the -v argument and the following syntax: -v some/path/in/host:/absolute/path/within/container:ro, where the trailing :ro specifies that the mount is read-only. The mount permission modifiers can be omitted, which means the mount will have read-write permissions. In general, you'll want to provide at least two mount-points: one set in read-only mode for the input data and one read/write to store the outputs. Potentially, you'll want to provide one or two more mount-points: one for the working directory, in case you need to debug some issue or reuse pre-cached results; and a TemplateFlow folder to preempt the download of your favorite templates in every run.
Running containers as a user: By default, Docker will run the container as root. Some shared systems may limit this feature and only allow running containers as a user. When the container is run as root, files written out to filesystems mounted from the host will have the user id 1000 by default. In other words, you'll need to be able to run as root in the host to change permissions or manage these files. Alternatively, running as a user allows preempting these permission issues. It is possible to run as a user with the -u argument. In general, we will want to use the same user ID as the running user in the host to ensure the ownership of files written during the container execution. Therefore, you will generally run the container with -u $( id -u ).
Once the Docker Engine arguments are written, the remainder of the command line follows the usage. In other words, the first section of the command line is equivalent to the fmriprep executable in a bare-metal installation:
$ docker run -ti --rm \\ # These lines\n -v $HOME/ds005:/data:ro \\ # are equivalent to\n -v $HOME/ds005/derivatives:/out \\ # a call to the App's\n -v $HOME/tmp/ds005-workdir:/work \\ # entry-point.\n nipreps/fmriprep:<latest-version> \\ #\n \\\n /data /out/fmriprep-<latest-version> \\ # These lines correspond\n participant \\ # to the particular BIDS\n -w /work # App arguments.\n
"},{"location":"apps/framework/","title":"Introduction","text":""},{"location":"apps/framework/#what-is-bids","title":"What is BIDS?","text":"
The Brain Imaging Data Structure (BIDS) is a standard for organizing and describing brain datasets, including MRI. The common naming convention and folder structure allow researchers to easily reuse BIDS datasets, re-apply analysis protocols, and run standardized automatic data preprocessing pipelines (and particularly, BIDS Apps). The BIDS starter-kit contains a wide collection of educational resources. Validity of the structure can be assessed with the online BIDS-Validator. The tree of a typical, valid (BIDS-compliant) dataset is shown below:
"},{"location":"apps/framework/#what-is-a-bids-app","title":"What is a BIDS App?","text":"
(Taken from the BIDS Apps paper)
A BIDS App is a container image capturing a neuroimaging pipeline that takes a BIDS-formatted dataset as input. Since the input is a whole dataset, apps are able to combine multiple modalities, sessions, and/or subjects, but at the same time need to implement ways to query input datasets. Each BIDS App has the same core set of command-line arguments, making them easy to run and integrate into automated platforms. BIDS Apps are constructed in a way that does not depend on any software outside of the container image other than the container engine.
BIDS Apps rely upon two technologies for container computing:
Docker \u2014 for building, hosting as well as running containers on local hardware (running Windows, Mac OS X or Linux) or in the cloud.
Singularity \u2014 for running containers on HPCs (high-performance computing).
BIDS Apps are deposited in the Docker Hub repository, making them openly accessible. Each app is versioned and all of the historical versions are available to download. By reporting the BIDS App name and version in a manuscript, authors can provide others with the ability to exactly replicate their analysis workflow.
Docker is used for its excellent documentation, maturity, and the Docker Hub service for storage and distribution of the images. Docker containers are easily run on personal computers and cloud services. However, the Docker Runtime was originally designed to run different components of web services (HTTP servers, databases etc.) using cloud resources. Docker thus requires root or root-like permissions, as well as modern versions of Linux kernel (to perform user mapping and management of network resources); though this is not a problem in context of renting cloud resources (which are not shared with other users), it makes it difficult or impossible to use in a multi-tenant environment such as an HPC system, which is often the most cost-effective computational resource available to researchers.
Singularity, on the other hand, is a unique container technology designed from the ground up with the encapsulation of binary dependencies and HPC use in mind. Its main advantage over Docker is that it does not require root access for container execution and thus is safe to use on multi-tenant systems. In addition, it does not require recent Linux kernel functionalities (such as namespaces, cgroups and capabilities), making it easy to install on legacy systems.
BIDS Apps decouple the individual level analysis (processing of independent subjects) from group-level analyses aggregating participants. For the analysis of individual subjects, Apps need to understand the BIDS structure of the input dataset, so that the required inputs for the designated subject are found. Apps are designed to easily process derivatives generated by the participant-level or other Apps. The overall workflow has an entry-point and an end-point responsible of setting-up the map-reduce tasks and the tear-down including organizing the outputs for its archiving, respectively. Each App may implement multiple map and reduce steps.
To improve user experience and ability to integrate BIDS Apps into various computational platforms, each App follows a set of core command-line arguments:
In this case, we have selected to run the participant level (to process individual subjects). fMRIPrep does not have a group level, but other BIDS Apps may have. For instance, MRIQC generates group-level reports with the following command-line:
"},{"location":"apps/framework/#what-are-bids-derivatives","title":"What are BIDS Derivatives?","text":"
NiPreps generate derivatives of the original data, and they fulfill the BIDS specification for the results of Apps that are created for subsequent consumption by other BIDS-Apps. These derivatives must follow the BIDS Derivatives specification (draft). An example of BIDS Derivatives filesystem tree, generated with fMRIPrep 1.5:
"},{"location":"apps/singularity/","title":"Executing with Singularity","text":"
Summary
Here, we describe how to run NiPreps with Singularity containers. To illustrate the process, we will show the execution of fMRIPrep, but these guidelines extend to any other end-user NiPrep.
"},{"location":"apps/singularity/#preparing-a-singularity-image","title":"Preparing a Singularity image","text":"
Singularity version >= 2.5: If the version of Singularity installed on your HPC (High-Performance Computing) system is modern enough, you can create a Singularity image directly on the system. This is as simple as:
where <version> should be replaced with the desired version of fMRIPrep that you want to download.
Singularity version < 2.5: In this case, start with a machine (e.g., your personal computer) with Docker installed. Use docker2singularity to create a singularity image. You will need an active internet connection and some time:
Singularity by default exposes all environment variables from the host inside the container. Because of this, your host libraries (e.g., NiPype or a Python environment) could be accidentally used instead of the ones inside the container. To avoid such a situation, we strongly recommend using the --cleanenv argument in all scenarios. For example:
Alternatively, conflicts might be preempted and some problems mitigated by unsetting potentially problematic settings, such as the PYTHONPATH variable, before running:
It is possible to define environment variables scoped within the container by using the SINGULARITYENV_* magic, in combination with --cleanenv. For example, we can set the FreeSurfer license variable (see fMRIPrep's documentation on this) as follows:
As we can see, the export in the first line tells Singularity to set a corresponding environment variable of the same name after dropping the prefix SINGULARITYENV_.
"},{"location":"apps/singularity/#accessing-the-hosts-filesystem","title":"Accessing the host's filesystem","text":"
Depending on how Singularity is configured on your cluster it might or might not automatically bind (mount or expose) host's folders to the container (e.g., /scratch, or $HOME). This is particularly relevant because, if you can't run Singularity in privileged mode (which is almost certainly true in all the scenarios), Singularity containers are read only. This is to say that you won't be able to write anything unless Singularity can access the host's filesystem in write mode.
By default, Singularity automatically binds (mounts) the user's home directory and a scratch directory. In addition, Singularity generally allows binding the necessary folders with the -B <host_folder>:<container_folder>[:<permissions>] Singularity argument. For example:
If your Singularity installation doesn't allow you to bind non-existent bind points, you'll get an error saying WARNING: Skipping user bind, non existent bind point (directory) in container. In this scenario, you can either try to bind things onto some other bind point you know exists in the image or rebuild your Singularity image with docker2singularity as follows:
In the example above, the following bind points are created: /gpfs, /scratch, /work, /share, /opt/templateflow.
Important
One great feature of containers is their confinement or isolation from the host system. Binding mount points breaks this principle, as the container can now make changes in the host. Therefore, it is generally recommended to use binding sparingly and to grant very limited access to the minimum necessary resources. In other words, it is preferable to bind just one subdirectory of $HOME rather than the full $HOME directory of the host (see nipreps/fmriprep#1778 (comment)).
Relevant aspects of the $HOME directory within the container: By default, Singularity will bind the user's $HOME directory in the host into /home/$USER (or equivalent) in the container. Most of the time, it will also redefine the $HOME environment variable and update it to point to the corresponding mount point in /home/$USER. However, these defaults can be overridden in your system. It is recommended to check your settings with your system's administrators. If your Singularity installation allows it, you can work around the $HOME specification by combining the bind mounts argument (-B) with the home overwrite argument (--home) as follows:
"},{"location":"apps/singularity/#templateflow-and-singularity","title":"TemplateFlow and Singularity","text":"
TemplateFlow is a helper tool that allows neuroimaging workflows to programmatically access a repository of standard neuroimaging templates. In other words, TemplateFlow allows NiPreps to dynamically change the templates that are used, e.g., in the atlas-based brain extraction step or spatial normalization.
Default settings in the Singularity image should get along with the Singularity installation of your system. However, deviations from the default configurations of your installation may break this compatibility. A particularly problematic case arises when the home directory is mounted in the container, but the $HOME environment variable is not correspondingly updated. Typically, you will experience errors like OSError: [Errno 30] Read-only file system or FileNotFoundError: [Errno 2] No such file or directory: '/home/fmriprep/.cache'.
If it is not explicitly forbidden in your installation, the first attempt to overcome this issue is manually setting the $HOME directory as follows:
$ singularity run --home $HOME --cleanenv fmriprep.simg <fmriprep arguments>\n
If the user's home directory is not automatically bound, then the second step would include manually binding it as in the section above:
Finally, if the --home argument cannot be used, you'll need to provide the container with writable filesystems where TemplateFlow's files can be downloaded. In addition, you will need to tell fMRIPrep to update the default paths with the new mount points by setting the SINGULARITYENV_TEMPLATEFLOW_HOME variable:
# Tell the NiPrep where TemplateFlow will place downloads\n$ export SINGULARITYENV_TEMPLATEFLOW_HOME=/opt/templateflow\n$ singularity run -B <writable-path-on-host>:/opt/templateflow \\\n --cleanenv fmriprep.simg <fmriprep arguments>\n
"},{"location":"apps/singularity/#restricted-internet-access","title":"Restricted Internet access","text":"
We have identified several conditions in which running NiPreps might fail because of spotty or impossible access to Internet.
If your compute node cannot have access to Internet, then you'll need to pull down from TemplateFlow all the resources that will be necessary ahead of run-time.
If that is not the case (i.e., you should be able to hit HTTP/s endpoints), then you can try the following:
VerifiedHTTPSConnection ... Failed to establish a new connection: [Errno 110] Connection timed out. If you encounter an error like this, you will probably need to set up an http proxy exporting SINGULARITYENV_http_proxy (see nipreps/fmriprep#1778 (comment)). For example:
$ export SINGULARITYENV_https_proxy=http://<ip or proxy name>:<port>\n
requests.exceptions.SSLError: HTTPSConnectionPool .... In this case, your container seems to be able to reach the Internet, but unable to use SSL encryption. There are two potential solutions to the issue. The recommended one is setting REQUESTS_CA_BUNDLE to the appropriate path, and/or binding the appropriate filesystem:
Setting up a functional execution framework with Singularity might be tricky in some HPC (high-performance computing) systems. Please make sure you have read the relevant documentation of Singularity, and checked all the defaults and configuration in your system. The next step is checking the environment and access to fMRIPrep resources, using singularity shell.
Check access to input data folder, and BIDS validity:
$ singularity shell -B path/to/data:/data fmriprep.simg\nSingularity fmriprep.simg:~> ls /data\nCHANGES README dataset_description.json participants.tsv sub-01 sub-02 sub-03 sub-04 sub-05 sub-06 sub-07 sub-08 sub-09 sub-10 sub-11 sub-12 sub-13 sub-14 sub-15 sub-16 task-balloonanalogrisktask_bold.json\nSingularity fmriprep.simg:~> bids-validator /data\n 1: [WARN] You should define 'SliceTiming' for this file. If you don't provide this information slice time correction will not be possible. (code: 13 - SLICE_TIMING_NOT_DEFINED)\n ./sub-01/func/sub-01_task-balloonanalogrisktask_run-01_bold.nii.gz\n ./sub-01/func/sub-01_task-balloonanalogrisktask_run-02_bold.nii.gz\n ./sub-01/func/sub-01_task-balloonanalogrisktask_run-03_bold.nii.gz\n ./sub-02/func/sub-02_task-balloonanalogrisktask_run-01_bold.nii.gz\n ./sub-02/func/sub-02_task-balloonanalogrisktask_run-02_bold.nii.gz\n ./sub-02/func/sub-02_task-balloonanalogrisktask_run-03_bold.nii.gz\n ./sub-03/func/sub-03_task-balloonanalogrisktask_run-01_bold.nii.gz\n ./sub-03/func/sub-03_task-balloonanalogrisktask_run-02_bold.nii.gz\n ./sub-03/func/sub-03_task-balloonanalogrisktask_run-03_bold.nii.gz\n ./sub-04/func/sub-04_task-balloonanalogrisktask_run-01_bold.nii.gz\n ... and 38 more files having this issue (Use --verbose to see them all).\n Please visit https://neurostars.org/search?q=SLICE_TIMING_NOT_DEFINED for existing conversations about this issue.\n
Check access to output data folder, and whether you have write permissions
"},{"location":"apps/singularity/#running-singularity-on-a-slurm-system","title":"Running Singularity on a SLURM system","text":"
An example of sbatch script to run fMRIPrep on a SLURM system1 is given below. The submission script will generate one task per subject using a job array.
#!/bin/bash\n#\n#SBATCH -J fmriprep\n#SBATCH --time=48:00:00\n#SBATCH -n 1\n#SBATCH --cpus-per-task=16\n#SBATCH --mem-per-cpu=4G\n#SBATCH -p normal,mygroup # Queue names you can submit to\n# Outputs ----------------------------------\n#SBATCH -o log/%x-%A-%a.out\n#SBATCH -e log/%x-%A-%a.err\n#SBATCH --mail-user=%u@domain.tld\n#SBATCH --mail-type=ALL\n# ------------------------------------------\n\nBIDS_DIR=\"$STUDY/data\"\nDERIVS_DIR=\"derivatives/fmriprep-20.2.2\"\nLOCAL_FREESURFER_DIR=\"$STUDY/data/derivatives/freesurfer-6.0.1\"\n\n# Prepare some writeable bind-mount points.\nTEMPLATEFLOW_HOST_HOME=$HOME/.cache/templateflow\nFMRIPREP_HOST_CACHE=$HOME/.cache/fmriprep\nmkdir -p ${TEMPLATEFLOW_HOST_HOME}\nmkdir -p ${FMRIPREP_HOST_CACHE}\n\n# Prepare derivatives folder\nmkdir -p ${BIDS_DIR}/${DERIVS_DIR}\n\n# Make sure FS_LICENSE is defined in the container.\nexport SINGULARITYENV_FS_LICENSE=$HOME/.freesurfer.txt\n\n# Designate a templateflow bind-mount point\nexport SINGULARITYENV_TEMPLATEFLOW_HOME=\"/templateflow\"\nSINGULARITY_CMD=\"singularity run --cleanenv -B $BIDS_DIR:/data -B ${TEMPLATEFLOW_HOST_HOME}:${SINGULARITYENV_TEMPLATEFLOW_HOME} -B $L_SCRATCH:/work -B ${LOCAL_FREESURFER_DIR}:/fsdir $STUDY/images/fmriprep_20.2.2.simg\"\n\n# Parse the participants.tsv file and extract one subject ID from the line corresponding to this SLURM task.\nsubject=$( sed -n -E \"$((${SLURM_ARRAY_TASK_ID} + 1))s/sub-(\\S*)\\>.*/\\1/gp\" ${BIDS_DIR}/participants.tsv )\n\n# Remove IsRunning files from FreeSurfer\nfind ${LOCAL_FREESURFER_DIR}/sub-$subject/ -name \"*IsRunning*\" -type f -delete\n\n# Compose the command line\ncmd=\"${SINGULARITY_CMD} /data /data/${DERIVS_DIR} participant --participant-label $subject -w /work/ -vv --omp-nthreads 8 --nthreads 12 --mem_mb 30000 --output-spaces MNI152NLin2009cAsym:res-2 anat fsnative fsaverage5 --use-aroma --fs-subjects-dir /fsdir\"\n\n# Setup done, run the command\necho Running task ${SLURM_ARRAY_TASK_ID}\necho Commandline: $cmd\neval 
$cmd\nexitcode=$?\n\n# Output results to a table\necho \"sub-$subject ${SLURM_ARRAY_TASK_ID} $exitcode\" \\\n >> ${SLURM_JOB_NAME}.${SLURM_ARRAY_JOB_ID}.tsv\necho Finished tasks ${SLURM_ARRAY_TASK_ID} with exit code $exitcode\nexit $exitcode\n
"},{"location":"assets/ORN-Workshop/presentation/#building-communities-around-reproducible-workflows","title":"Building communities around reproducible workflows","text":""},{"location":"assets/ORN-Workshop/presentation/#o-esteban","title":"O. Esteban","text":""},{"location":"assets/ORN-Workshop/presentation/#chuv-lausanne-university-hospital","title":"CHUV | Lausanne University Hospital","text":""},{"location":"assets/ORN-Workshop/presentation/#wwwniprepsorg","title":"www.nipreps.org","text":"
]
layout: false count: false
.middle.center[
"},{"location":"assets/ORN-Workshop/presentation/#building-communities-around-reproducible-workflows_1","title":"Building communities around reproducible workflows","text":""},{"location":"assets/ORN-Workshop/presentation/#o-esteban_1","title":"O. Esteban","text":""},{"location":"assets/ORN-Workshop/presentation/#chuv-lausanne-university-hospital_1","title":"CHUV | Lausanne University Hospital","text":""},{"location":"assets/ORN-Workshop/presentation/#wwwniprepsorg_1","title":"www.nipreps.org","text":"
]
???
"},{"location":"assets/ORN-Workshop/presentation/#im-going-to-talk-about-how-we-are-building-a-framework-of-preprocessing-pipelines-for-neuroimaging-called-nipreps-based-on-the-fmriprep-experience","title":"I'm going to talk about how we are building a framework of preprocessing pipelines for neuroimaging called NiPreps, based on the fMRIPrep experience.","text":"
although many neuroimaging areas are still in search of methodological breakthroughs,
challenges have moved on to the workflows:
workflows within traditional toolboxes - usually not flexible to adapt to new data
BIDS and BIDS-Apps.
???
researchers have a large portfolio of image processing components readily available
toolboxes with great support and active maintenance:
"},{"location":"assets/ORN-Workshop/presentation/#new-questions-changing-the-focus","title":"New questions changing the focus:","text":""},{"location":"assets/ORN-Workshop/presentation/#-validity-does-the-workflow-actually-work-out","title":"- validity (does the workflow actually work out?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-transparency-is-it-a-black-box-how-precise-is-reporting","title":"- transparency (is it a black-box? how precise is reporting?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-vibration-how-each-tool-choice-parameters-affect-overall","title":"- vibration (how each tool choice & parameters affect overall?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-throughput-how-much-datatime-can-it-possible-take","title":"- throughput (how much data/time can it possible take?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-robustness-can-i-use-it-on-diverse-studies","title":"- robustness (can I use it on diverse studies?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-evaluation-what-is-it-unique-about-the-workflow-wrt-existing-alternatives","title":"- evaluation (what is it unique about the workflow, w.r.t. existing alternatives?)","text":""},{"location":"assets/ORN-Workshop/presentation/#the-garden-of-forking-paths","title":"The garden of forking paths","text":"
(Botvinik-Nezer et al., 2020)
Around 50% of teams used fMRIPrep'ed inputs.
"},{"location":"assets/ORN-Workshop/presentation/#the-fmriprep-story","title":"The fMRIPrep story","text":""},{"location":"assets/ORN-Workshop/presentation/#fmriprep-produces-analysis-ready-data-from-diverse-data","title":"fMRIPrep produces analysis-ready data from diverse data","text":"
minimal requirements (BIDS-compliant);
agnostic to downstream steps of the workflow
produces BIDS-Derivatives;
robust against inhomogeneity of data across studies
???
fMRIPrep takes in a task-based or resting-state functional MRI dataset in BIDS-format and returns preprocessed data ready for analysis.
Preprocessed data can be used for a broad range of analyses, and they are formatted following BIDS-Derivatives to maximize compatibility with major software packages (AFNI, FSL, SPM, etc.), further temporal filtering and denoising tools (fMRIDenoise), and any BIDS-Derivatives compliant tool (e.g., FitLins).
--
"},{"location":"assets/ORN-Workshop/presentation/#fmriprep-is-a-bids-app-gorgolewski-et-al-2017","title":"fMRIPrep is a BIDS-App (Gorgolewski, et al. 2017)","text":"
adheres to modern software-engineering standards (CI/CD, containers)
compatible interface with other BIDS-Apps
optimized for automatic execution
???
fMRIPrep adopts the BIDS-App specifications. That means the software is tested with every change to the codebase; it also means that packaging, containerization, and deployment are automated and require tests to pass. BIDS-Apps are interoperable (via BIDS-Derivatives) and optimized for execution in HPC, Cloud, etc.
--
"},{"location":"assets/ORN-Workshop/presentation/#minimizes-human-intervention","title":"Minimizes human intervention","text":"
avoids error-prone parameter settings (reads them from BIDS)
adapts the workflow to the actual data available
while remaining flexible to some design choices (e.g., whether or not to reconstruct surfaces, or which target standard spaces to normalize to)
???
fMRIPrep minimizes human intervention because the user does not need to fiddle with any parameters - they are obtained from the BIDS structure. However, fMRIPrep does allow some flexibility to ensure the preprocessing meets the requirements of the intended analyses.
"},{"location":"assets/ORN-Workshop/presentation/#fmriprep-was-not-originally-envisioned-as-a-community-project","title":"fMRIPrep was not originally envisioned as a community project ...","text":"
(we just wanted a robust tool to automatically preprocess incoming data of OpenNeuro.org)
--
"},{"location":"assets/ORN-Workshop/presentation/#but-a-community-built-up-quickly-around-it","title":"... but a community built up quickly around it","text":"
Preprocessing of fMRI was in need of a division of labor.
Obsession with transparency made early-adopters confident of the recipes they were applying.
Responsiveness to feedback. ]
.pull-right[
]
???
Preprocessing is a time-consuming effort that requires expertise spanning imaging foundations and CS, and it is typically addressed with legacy in-house pipelines.
On the right-hand side, you'll find the chart of unique visitors to fmriprep.org, which is the documentation website.
"},{"location":"assets/ORN-Workshop/presentation/#key-aspect-credit-all-direct-contributors","title":"Key aspect: credit all direct contributors","text":"
--
"},{"location":"assets/ORN-Workshop/presentation/#and-indirect-citation-boilerplate","title":".. and indirect: citation boilerplate.","text":""},{"location":"assets/ORN-Workshop/presentation/#researchers-want-to-spend-more-time-on-those-areas-most-relevant-to-them","title":"Researchers want to spend more time on those areas most relevant to them","text":"
(probably not preprocessing...)
???
With the development of fMRIPrep we understood that researchers don't want to waste their time on preprocessing (except for researchers developing new preprocessing techniques).
--
"},{"location":"assets/ORN-Workshop/presentation/#writing-fmriprep-required-a-team-of-several-experts-in-processing-methods-for-neuroimaging-with-a-solid-base-on-computer-science","title":"Writing fMRIPrep required a team of several experts in processing methods for neuroimaging, with a solid base on Computer Science.","text":"
(research programs just can't cover the neuroscience and the engineering of the whole workflow - we need to divide the labor)
???
The current neuroimaging workflow requires extensive knowledge in sometimes orthogonal fields such as neuroscience and computer science. Dividing the labor among labs, communities, or individuals with the necessary expertise is fundamental to the advance of the whole field.
--
"},{"location":"assets/ORN-Workshop/presentation/#transparency-helps-against-the-risk-of-super-easy-tools","title":"Transparency helps against the risk of super-easy tools","text":"
(easy-to-use tools are risky because they might get a researcher very far with no idea whatsoever of what they've done)
???
There is an implicit risk in making things too easy to operate:
For instance, imagine someone who runs fMRIPrep on diffusion data by tricking the BIDS naming into an apparently functional MRI dataset. If fMRIPrep reached the end at all, the garbage at the output could be fed into further tools, in a sort of snowballing problem.
When researchers have access to the guts of the software and are given an opportunity to understand what's going on, the risk of misuse dips.
--
"},{"location":"assets/ORN-Workshop/presentation/#established-toolboxes-do-not-have-incentives-for-compatibility","title":"Established toolboxes do not have incentives for compatibility","text":"
(and to some extent this is not necessarily bad, as long as they are kept well-tested and they embrace/help-develop some minimal standards)
???
AFNI, ANTs, FSL, FreeSurfer, SPM, etc. have comprehensive software validation tests, methodological validation tests, stress tests, etc. - which pushed up their quality and made them fundamental for the field.
Therefore, it is better to keep things that way (although some minimal efforts towards convergence in compatibility are of course welcome)
"},{"location":"assets/ORN-Workshop/presentation/#image-processing-possible-guidelines-for-the-standardization-clinical-applications-j-veraart","title":"Image Processing: Possible Guidelines for the Standardization & Clinical Applications (J. Veraart)","text":"
The enormous success of fMRIPrep led us to propose its generalization to other MRI and non-MRI modalities, as well as nonhuman species (for instance, rodents), and particular populations currently unsupported by fMRIPrep such as infants.
"},{"location":"assets/ORN-Workshop/presentation/#augmenting-scanners-to-produce-analysis-grade-data","title":"Augmenting scanners to produce \"analysis-grade\" data","text":""},{"location":"assets/ORN-Workshop/presentation/#data-directly-consumable-by-analyses","title":"(data directly consumable by analyses)","text":"
.pull-left[
Analysis-grade data is an analogy to the concept of \"sushi-grade (or sashimi-grade) fish\" in that both are:
.large[minimally preprocessed,]
and
.large[safe to consume directly.] ]
.pull-right[ ]
???
The goal of NiPreps, therefore, is to extend scanners so that, in a way, they produce data ready for analysis.
We liken these analysis-grade data to sushi-grade fish, because in both cases the product is minimally preprocessed and at the same time safe to consume as is.
For the last two years we've been decomposing the architecture of fMRIPrep, spinning off its constituent parts that are valuable in other applications.
This process of decoupling (to use a proper CS term) has been greatly facilitated by the modular nature of the code since its inception.
???
The processing elements extracted from fMRIPrep can be mapped to three regimes of responsibility:
Software infrastructure, composed of tools that enable collaboration and provide the most basic tooling.
Middleware utilities, which build more advanced tooling based on the foundational infrastructure.
And at the top of the stack end-user applications - namely fMRIPrep, dMRIPrep, sMRIPrep and MRIQC.
As we can see, the boundaries of these three architectural layers are soft and tools such as TemplateFlow may stand in between.
Only projects enclosed in the brain shape pertain to the NiPreps community. NiPype, NiBabel and BIDS are so deeply embedded as dependencies that NiPreps can't be understood without them.
BIDS provides a standard, guaranteeing I/O agreements:
Allows workflows to self-adapt to the inputs
Ensures the shareability of the results
PyBIDS: a Python tool to query BIDS datasets (Yarkoni et al., 2019):
>>> from bids import BIDSLayout\n\n# Point PyBIDS to the dataset's path\n>>> layout = BIDSLayout(\"/data/coolproject\")\n\n# List the participant IDs of present subjects\n>>> layout.get_subjects()\n['01', '02', '03', '04', '05']\n\n# List session identifiers, if present\n>>> layout.get_sessions()\n['01', '02']\n\n# List functional MRI tasks\n>>> layout.get_tasks()\n['rest', 'nback']\n
???
BIDS is one of the keys to success for fMRIPrep and consequently, a strategic element of NiPreps.
Because the tools so far are written in Python, PyBIDS is a powerful tool to index and query inputs and outputs.
The code snippet illustrates how easy it is to find the subject identifiers, sessions, and tasks available in the dataset.
All NiPreps must write out BIDS-Derivatives. As illustrated in the example, the outputs of fMRIPrep are very similar to the BIDS standard for acquired data.
All end-user applications in NiPreps must conform to the BIDS-Apps specifications.
The BIDS-Apps paper identified a common pattern in neuroimaging studies, where individual participants (and runs) are processed first individually, and then based on the outcomes, further levels of data aggregation are executed.
For this reason, BIDS-Apps define two major levels of execution: participant and group level.
Finally, the paper also stresses the importance of containerizing applications to ensure long-term preservation of run-to-run repeatability and proposes a common command line interface as described at the bottom:
first the name of the BIDS-App (fmriprep, in this case)
followed by input and output directories (respectively),
to finally indicate the analysis level (always participant, for the case of fmriprep)
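The command-line pattern described above can be sketched with Python's `argparse`. This is only an illustration under stated assumptions, not fMRIPrep's actual parser: the program name `my-bids-app` and the function `build_parser` are hypothetical, and only the three positional arguments plus a participant-label option are modeled.

```python
import argparse


def build_parser():
    """Build a parser following the positional layout of the BIDS-Apps interface."""
    parser = argparse.ArgumentParser(prog="my-bids-app")  # hypothetical app name
    parser.add_argument("bids_dir", help="root folder of the input BIDS dataset")
    parser.add_argument("output_dir", help="folder where derivatives are written")
    parser.add_argument(
        "analysis_level",
        choices=["participant", "group"],
        help="level of the analysis to run",
    )
    parser.add_argument(
        "--participant-label",
        nargs="+",
        default=None,
        help="restrict processing to these participant identifiers",
    )
    return parser


# Example invocation: input folder, output folder, then the analysis level.
args = build_parser().parse_args(
    ["/data", "/out", "participant", "--participant-label", "01", "02"]
)
print(args.bids_dir, args.output_dir, args.analysis_level, args.participant_label)
```

Keeping the three positional arguments in a fixed order is what lets a scheduler or wrapper script call any BIDS-App the same way, changing only the executable name.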
.pull-left[
from nipype.interfaces.fsl import BET\nbrain_extract = BET(\n in_file=\"/data/coolproject/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii\",\n out_file=\"/out/sub-01/ses-01/anat/sub-01_ses-01_desc-brain_T1w.nii\"\n)\nbrain_extract.run()\n
Nipype is the gateway to mix-and-match from AFNI, ANTs, Dipy, FreeSurfer, FSL, MRTrix, SPM, etc. ]
.pull-right[
]
???
Nipype is the glue stitching together all the underlying neuroimaging toolboxes and provides the execution framework.
The snippet shows how the widely known BET tool from FSL can be executed using NiPype. This is a particular example instance of interfaces - which provide uniform access to the tooling with Python.
Finally, combining these interfaces we generate processing workflows to fulfill higher level processing tasks.
???
For instance, we may have a look into fMRIPrep's functional processing block.
Nipype helps understand the workflow (and opens windows into the black box) by generating these graph representations.
\"\"\"Fix the affine of a rodent dataset, imposing 0.2x0.2x0.2 [mm].\"\"\"\nimport numpy as np\nimport nibabel as nb\n\n# Open the file\nimg = nb.load(\"sub-25_MGE_MouseBrain_3D_MGE_150.nii.gz\")\n\n# New (correct) affine\naff = np.diag((-0.2, -0.2, 0.2, 1.0))\n\n# Use nibabel to reorient to canonical\ncard = nb.as_closest_canonical(nb.Nifti1Image(\n img.dataobj,\n np.diag((-0.2, -0.2, 0.2, 1.0)),\n None\n))\n\n# Save to disk\ncard.to_filename(\"sub-25_T2star.nii.gz\")\n
???
NiBabel allows Python to easily access neuroimaging data formats such as NIfTI, GIFTI and CIFTI2.
Although this might be a trivial task, the proliferation of neuroimaging software has led to some sort of Wild West of formats, and sometimes interoperation is not ensured.
"},{"location":"assets/ORN-Workshop/presentation/#in-the-snippet-we-can-see-how-we-can-manipulate-the-orientation-headers-of-a-nifti-volume-in-particular-a-rodent-image-with-incorrect-affine-information","title":"In the snippet, we can see how we can manipulate the orientation headers of a NIfTI volume, in particular a rodent image with incorrect affine information.","text":"
.pull-left[
Transforms typically are the outcome of image registration methodologies
The proliferation of software implementations of image registration methodologies has resulted in a spread of data structures and file formats used to preserve and communicate transforms.
(Esteban et al., 2020) ]
.pull-right[
]
???
NiTransforms is a super-interesting toy project where we are exercising our finest coding skills. It complements NiBabel in the effort of making spatial transforms calculated by neuroimaging software tools interoperable.
When it goes beyond the alpha state, it is expected to be merged into NiBabel.
At the moment, NiTransforms is already integrated in fMRIPrep (20.1 and later) to concatenate LTA (linear transform array) transforms obtained with FreeSurfer, ITK transforms obtained with ANTs, and motion parameters estimated with FSL.
Compatibility across formats is hard due to the many arbitrary decisions in establishing the mathematical framework of the transform and the intrinsic confusion of applying a transform.
While intuitively we understand applying a transform as \"transforming the moving image so that I can represent it overlaid or fused with the reference image and both should look aligned\", in reality, we only transform coordinates from the reference image into the moving image's space (step 1 on the right).
Once we know where the center of every voxel of the reference image falls in the moving image coordinate system, we read in the information (in other words, a value) from the moving image. Because the location will probably be off-grid, we interpolate such a value from the neighboring voxels (step 2).
Finally (step 3) we generate a new image object with the structure of the reference image and the data interpolated from the moving information. This new image object is the moving image \"moved\" on to the reference image space and thus, both look aligned.
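The three steps above can be sketched with NumPy alone. This is a minimal illustration and not the NiTransforms implementation: the function `resample` and its arguments are hypothetical names, the transform is assumed to be a 4x4 affine mapping reference world coordinates to moving world coordinates, and nearest-neighbor lookup stands in for a proper interpolator.

```python
import numpy as np


def resample(moving, moving_affine, reference_shape, reference_affine, ref_to_mov=None):
    """Resample `moving` onto the reference grid using nearest-neighbor lookup."""
    if ref_to_mov is None:
        ref_to_mov = np.eye(4)  # identity transform by default

    # Step 1: map the center of every reference voxel into the moving image.
    ijk = np.indices(reference_shape).reshape(3, -1)
    ijk_h = np.vstack([ijk, np.ones((1, ijk.shape[1]))])  # homogeneous coordinates
    # reference index -> reference world -> moving world -> moving index
    to_moving_index = np.linalg.inv(moving_affine) @ ref_to_mov @ reference_affine
    moving_ijk = (to_moving_index @ ijk_h)[:3]

    # Step 2: locations are generally off-grid, so pick the nearest moving voxel
    # (the simplest possible interpolation); points outside the image stay at zero.
    nearest = np.rint(moving_ijk).astype(int)
    upper = np.array(moving.shape)[:, None]
    inside = np.all((nearest >= 0) & (nearest < upper), axis=0)
    values = np.zeros(ijk.shape[1], dtype=moving.dtype)
    values[inside] = moving[tuple(nearest[:, inside])]

    # Step 3: arrange the sampled values on the reference grid.
    return values.reshape(reference_shape)


moving = np.arange(27.0).reshape(3, 3, 3)
shift = np.eye(4)
shift[0, 3] = 1.0  # reference point x maps to moving point x + 1
moved = resample(moving, np.eye(4), (3, 3, 3), np.eye(4), shift)
```

Note that the data of the moving image are never pushed forward: every output voxel pulls its value from the moving image, which is why the transform is expressed from reference to moving coordinates.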
.pull-left[
The Archive (right) is a repository of templates and atlases
The Python Client (bottom) provides easy access (with lazy-loading) to the Archive
>>> from templateflow import api as tflow\n>>> tflow.get(\n... 'MNI152NLin6Asym',\n... desc=None,\n... resolution=1,\n... suffix='T1w',\n... extension='nii.gz'\n... )\nPosixPath('/templateflow_home/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_T1w.nii.gz')\n
.large[www.templateflow.org] ]
.pull-right[
]
???
One of the oldest feature requests received from fMRIPrep early adopters was improving the flexibility of spatial normalization to standard templates other than fMRIPrep's default.
For instance, infant templates.
TemplateFlow offers an Archive of templates where they are stored, maintained and re-distributed;
and a Python client that helps access them.
On the right-hand side, a screenshot of the TemplateFlow browser shows some of the templates currently available in the repository. The browser can be reached at www.templateflow.org.
The tool is based on PyBIDS, and the snippet will surely remind you of it. In this case the example shows how to obtain the T1w template corresponding to FSL's MNI space, at the highest resolution.
If the requested files are not in TemplateFlow's cache, they will be pulled down and kept for later use.
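The caching behavior described here follows a common fetch-on-miss pattern, which can be sketched as follows. This is not TemplateFlow's code: `get_cached`, `fake_fetch`, and the example file name are hypothetical, and the download is replaced by a stand-in function so the example runs offline.

```python
import tempfile
from pathlib import Path


def get_cached(relative_path, cache_dir, fetch):
    """Return the local path of a file, fetching it only if it is not cached yet."""
    local = Path(cache_dir) / relative_path
    if not local.exists():
        local.parent.mkdir(parents=True, exist_ok=True)
        local.write_bytes(fetch(relative_path))  # pull down once, keep for later
    return local


calls = []


def fake_fetch(relative_path):
    """Stand-in for a download: records each call and returns dummy bytes."""
    calls.append(relative_path)
    return b"dummy template data"


cache = tempfile.mkdtemp()
first = get_cached("tpl-Example/tpl-Example_T1w.nii.gz", cache, fake_fetch)
second = get_cached("tpl-Example/tpl-Example_T1w.nii.gz", cache, fake_fetch)
print(first == second, len(calls))
```

The second request is served from the cache folder, so the stand-in download runs only once.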
The Archive allows a rich range of data and metadata to be stored with the template.
Datatypes in the repository cover:
images containing population-average templates,
masks (for instance brain masks),
atlases (including parcellations and segmentations)
transform files between templates
Metadata can be stored with the usual BIDS options.
Finally, templates allow having multiple cohorts, in a similar encoding to that of multi-session BIDS datasets.
Multiple cohorts are useful, for instance, in infant templates with averages at several gestational ages.
NiWorkflows is a miscellaneous mixture of tooling used by downstream NiPreps:
???
NiWorkflows is, historically, the first component detached from fMRIPrep.
For that reason, its scope and vision have very fuzzy boundaries compared to the other tools.
The most relevant utilities incorporated within NiWorkflows are:
--
The reportlet aggregation and individual report generation system
???
First, the individual report system, which aggregates the visual elements of the reports (which we call \"reportlets\") and generates the final HTML document.
Also, most of the engineering behind the generation of these reportlets and their integration within NiPype is part of NiWorkflows.
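The aggregation idea can be sketched in a few lines of Python. This is a simplified illustration and not NiWorkflows' report system: `build_report` is a hypothetical function, and reportlets are assumed to arrive as (title, SVG markup) pairs.

```python
from html import escape


def build_report(subject, reportlets):
    """Assemble (title, svg_markup) pairs into a single HTML document string."""
    sections = [
        f"<section><h2>{escape(title)}</h2>{svg}</section>"
        for title, svg in reportlets
    ]
    return (
        "<!DOCTYPE html><html><head>"
        f"<title>Report for sub-{escape(subject)}</title></head><body>"
        + "".join(sections)
        + "</body></html>"
    )


report = build_report(
    "01",
    [
        ("Brain mask", "<svg><!-- placeholder figure --></svg>"),
        ("Distortion correction", "<svg><!-- placeholder figure --></svg>"),
    ],
)
```

Each reportlet stays an independent figure, so individual workflow steps can produce theirs in isolation and the report is assembled only at the end.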
--
Custom extensions to NiPype interfaces
???
Beyond the extension of NiPype to generate a reportlet from any given interface, NiWorkflows is the test bed for many utilities that are then upstreamed to nipype.
Also, special interfaces with a limited scope that should not be included in nipype are maintained here.
--
Workflows useful across applications
???
Finally, NiWorkflows indeed offers workflows that can be used by end-user NiPreps. For instance, atlas-based brain extraction of anatomical images, based on ANTs.
???
Echo-planar imaging (EPI) scans are typically affected by distortions along the phase encoding axis, caused by the perturbation of the magnetic field at tissue interfaces.
Looking at the reportlet, we can see how in the \"before\" panel, the image is warped.
The distortion is most obvious in the coronal view (middle row) because this image has posterior-anterior phase encoding.
Focusing on the changes between \"before\" and \"after\" correction in this coronal view, we can see how the blue contours delineating the corpus callosum better fit the dark shade in the data after correction.
"},{"location":"assets/ORN-Workshop/presentation/#upcoming-new-utilities","title":"Upcoming new utilities","text":""},{"location":"assets/ORN-Workshop/presentation/#nibabies-fmriprep-babies","title":"NiBabies | fMRIPrep-babies","text":"
NiBabies is a sort of NiWorkflows equivalent for the preprocessing of infant imaging. At the moment, only atlas-based brain extraction using ANTs (adapted from NiWorkflows) is in active development.
Next steps include brain tissue segmentation.
Similarly, NiRodents is the NiWorkflows parallel for the preprocessing of rodent preclinical imaging. Again, only atlas-based brain extraction adapted from NiWorkflows is being developed.
"},{"location":"assets/ORN-Workshop/presentation/#nipreps-is-a-framework-for-the-development-of-preprocessing-workflows","title":"NiPreps is a framework for the development of preprocessing workflows","text":"
Principled design, with BIDS as a strategic component
Leveraging existing, widely used software
Using NiPype as a foundation
???
To wrap up, I've presented NiPreps, a framework for developing preprocessing workflows inspired by fMRIPrep.
The framework is heavily principled and relies on BIDS as a foundational component.
NiPreps should not reinvent any wheel, and should reuse as much as possible of the widely used and tested existing software.
Nipype serves as the glue component that orchestrates workflows.
We propose to consider preprocessing as part of the image acquisition and reconstruction
When setting the boundaries that way, it seems sensible to pursue some standardization in the preprocessing:
Fewer experimental degrees of freedom for the researcher
Researchers can focus on the analysis
More homogeneous data at the output (e.g., for machine learning)
How:
Transparency is key to success: individual reports and documentation (open source is implicit).
Best engineering practices (e.g., containers and CI/CD)
???
But why just preprocessing, with a very strict scope?
We propose to think about preprocessing as part of the image acquisition and reconstruction process (in other words, scanning), rather than part of the analysis workflow.
This decoupling from analysis comes with several upshots:
First, there are fewer moving parts for researchers to play with in the attempt to fit their methods to the data (instead of fitting data with their methods).
Second, such division of labor allows the researcher to use their time in the analysis.
Finally, two preprocessed datasets from two different studies and scanning sites should be more homogeneous when processed with the same instruments, in comparison to processing them with idiosyncratic, lab-managed, preprocessing workflows.
However, for NiPreps to work we need to make sure the tools are transparent.
Transparency comes not just from the individual reports and thorough documentation, but also from the community-driven development. For instance, the peer-review process that accompanies large incremental changes is fundamental to ensure the quality of the tool.
In addition, best engineering practices suggested in the BIDS-Apps paper, along with those we have been including with fMRIPrep, are necessary to ensure the quality of the final product.
As an open problem, validating the results of the tool remains extremely challenging because of the lack of gold-standard datasets that can tell us the best possible outcome.
NMiND = NeverMIND, this Neuroimaging Method Is Not Duplicated
"},{"location":"assets/ORN-Workshop/presentation/#pis-worried-about-methodological-duplicity","title":"PIs worried about methodological duplicity","text":"
M. Milham, D. Fair, T. Satterthwaite, S. Ghosh, R. Poldrack, etc.
"},{"location":"assets/bhd2020/presentation/#nipreps-neuroimaging-preprocessing-tools","title":"NiPreps | NeuroImaging PREProcessing toolS","text":""},{"location":"assets/bhd2020/presentation/#o-esteban","title":"O. Esteban","text":""},{"location":"assets/bhd2020/presentation/#chuv-lausanne-university-hospital","title":"CHUV | Lausanne University Hospital","text":""},{"location":"assets/bhd2020/presentation/#wwwniprepsorgassetsbhd2020","title":"www.nipreps.org/assets/bhd2020","text":"
]
layout: false count: false
.middle.center[
"},{"location":"assets/bhd2020/presentation/#nipreps-neuroimaging-preprocessing-tools_1","title":"NiPreps | NeuroImaging PREProcessing toolS","text":""},{"location":"assets/bhd2020/presentation/#o-esteban_1","title":"O. Esteban","text":""},{"location":"assets/bhd2020/presentation/#chuv-lausanne-university-hospital_1","title":"CHUV | Lausanne University Hospital","text":""},{"location":"assets/bhd2020/presentation/#wwwniprepsorgassetsbhd2020_1","title":"www.nipreps.org/assets/bhd2020","text":"
]
???
"},{"location":"assets/bhd2020/presentation/#im-going-to-talk-about-how-we-are-building-a-framework-of-preprocessing-pipelines-for-neuroimaging-called-nipreps-based-on-the-fmriprep-experience","title":"I'm going to talk about how we are building a framework of preprocessing pipelines for neuroimaging called NiPreps, based on the fMRIPrep experience.","text":"
"},{"location":"assets/bhd2020/presentation/#outlook","title":"Outlook","text":""},{"location":"assets/bhd2020/presentation/#1-understand-what-preprocessing-is-from-fmri","title":"1. Understand what preprocessing is - from fMRI","text":""},{"location":"assets/bhd2020/presentation/#2-the-fmriprep-experience","title":"2. The fMRIPrep experience","text":""},{"location":"assets/bhd2020/presentation/#3-the-dmriprep-experience","title":"3. The dMRIPrep experience","text":""},{"location":"assets/bhd2020/presentation/#4-importance-of-the-visual-reports","title":"4. Importance of the visual reports","text":""},{"location":"assets/bhd2020/presentation/#5-introducing-nipreps","title":"5. Introducing NiPreps","text":""},{"location":"assets/bhd2020/presentation/#6-open-forum-first-steps-and-contributing","title":"6. Open forum: first steps and contributing","text":""},{"location":"assets/bhd2020/presentation/#the-research-workflow-of-functional-mri-nowadays","title":"The research workflow of functional MRI (nowadays)","text":"
(source: next slide)
"},{"location":"assets/bhd2020/presentation/#the-research-workflow-of-functional-mri-2006","title":"The research workflow of functional MRI (2006)","text":"
(Strother, 2006; 10.1109/MEMB.2006.1607667)
"},{"location":"assets/bhd2020/presentation/#the-research-workflow-of-functional-mri-ab","title":"The research workflow of functional MRI (a.B.*)","text":"
Adapted (Strother, 2006)
*a.B. = after BIDS (Brain Imaging Data Structure; Gorgolewski et al. (2016))
"},{"location":"assets/bhd2020/presentation/#neuroimaging-is-now-mature","title":"Neuroimaging is now mature","text":"
many excellent tools available (from specialized to foundational)
large toolboxes (AFNI, ANTs/ITK, FreeSurfer, FSL, Nilearn, SPM, etc.)
"},{"location":"assets/bhd2020/presentation/#bids-a-thrust-of-technology-driven-development","title":"BIDS - A thrust of technology-driven development","text":"
A uniform and complete interface to data:
Uniform: enables the workflow to adapt to the data
Complete: enables validation and minimizes human intervention
Extensible reproducibility:
BIDS-Derivatives
BIDS-Apps (Gorgolewski et al., 2017)
???
researchers have a large portfolio of image processing components readily available
toolboxes with great support and active maintenance:
"},{"location":"assets/bhd2020/presentation/#new-questions-changing-the-focus","title":"New questions changing the focus:","text":""},{"location":"assets/bhd2020/presentation/#-validity-does-the-workflow-actually-work-out","title":"- validity (does the workflow actually work out?)","text":""},{"location":"assets/bhd2020/presentation/#-transparency-is-it-a-black-box-how-precise-is-reporting","title":"- transparency (is it a black-box? how precise is reporting?)","text":""},{"location":"assets/bhd2020/presentation/#-vibration-how-each-tool-choice-parameters-affect-overall","title":"- vibration (how each tool choice & parameters affect overall?)","text":""},{"location":"assets/bhd2020/presentation/#-throughput-how-much-datatime-can-it-possible-take","title":"- throughput (how much data/time can it possible take?)","text":""},{"location":"assets/bhd2020/presentation/#-robustness-can-i-use-it-on-diverse-studies","title":"- robustness (can I use it on diverse studies?)","text":""},{"location":"assets/bhd2020/presentation/#-evaluation-what-is-it-unique-about-the-workflow-wrt-existing-alternatives","title":"- evaluation (what is it unique about the workflow, w.r.t. existing alternatives?)","text":""},{"location":"assets/bhd2020/presentation/#the-garden-of-forking-paths","title":"The garden of forking paths","text":"
(Botvinik-Nezer et al., 2020)
Around 50% of teams used fMRIPrep'ed inputs.
"},{"location":"assets/bhd2020/presentation/#the-fmriprep-story","title":"The fMRIPrep story","text":""},{"location":"assets/bhd2020/presentation/#fmriprep-produces-analysis-ready-data-from-diverse-data","title":"fMRIPrep produces analysis-ready data from diverse data","text":"
minimal requirements (BIDS-compliant);
agnostic to downstream steps of the workflow
produces BIDS-Derivatives;
robust against inhomogeneity of data across studies
???
fMRIPrep takes in a task-based or resting-state functional MRI dataset in BIDS-format and returns preprocessed data ready for analysis.
Preprocessed data can be used for a broad range of analyses, and they are formatted following BIDS-Derivatives to maximize compatibility with major software packages (AFNI, FSL, SPM, etc.), further temporal filtering and denoising tools (fMRIDenoise), and any BIDS-Derivatives compliant tool (e.g., FitLins).
--
"},{"location":"assets/bhd2020/presentation/#fmriprep-is-a-bids-app-gorgolewski-et-al-2017","title":"fMRIPrep is a BIDS-App (Gorgolewski, et al. 2017)","text":"
adheres to modern software-engineering standards (CI/CD, containers)
compatible interface with other BIDS-Apps
optimized for automatic execution
???
fMRIPrep adopts the BIDS-App specifications. That means the software is tested with every change to the codebase; it also means that packaging, containerization, and deployment are automated and require tests to pass. BIDS-Apps are interoperable (via BIDS-Derivatives) and optimized for execution in HPC, Cloud, etc.
--
"},{"location":"assets/bhd2020/presentation/#minimizes-human-intervention","title":"Minimizes human intervention","text":"
avoids error-prone parameter settings (reads them from BIDS)
adapts the workflow to the actual data available
while remaining flexible to some design choices (e.g., whether or not to reconstruct surfaces, or which target standard spaces to normalize to)
???
fMRIPrep minimizes human intervention because the user does not need to fiddle with any parameters - they are obtained from the BIDS structure. However, fMRIPrep does allow some flexibility to ensure the preprocessing meets the requirements of the intended analyses.
"},{"location":"assets/bhd2020/presentation/#fmriprep-was-not-originally-envisioned-as-a-community-project","title":"fMRIPrep was not originally envisioned as a community project ...","text":"
(we just wanted a robust tool to automatically preprocess incoming data of OpenNeuro.org)
--
"},{"location":"assets/bhd2020/presentation/#but-a-community-built-up-quickly-around-it","title":"... but a community built up quickly around it","text":"
Preprocessing of fMRI was in need of a division of labor.
Obsession with transparency made early-adopters confident of the recipes they were applying.
Responsiveness to feedback. ]
.pull-right[
]
???
Preprocessing is a time-consuming effort that requires expertise spanning imaging foundations and CS, and it is typically addressed with legacy in-house pipelines.
On the right-hand side, you'll find the chart of unique visitors to fmriprep.org, which is the documentation website.
"},{"location":"assets/bhd2020/presentation/#key-aspect-credit-all-direct-contributors","title":"Key aspect: credit all direct contributors","text":"
--
"},{"location":"assets/bhd2020/presentation/#and-indirect-citation-boilerplate","title":".. and indirect: citation boilerplate.","text":""},{"location":"assets/bhd2020/presentation/#researchers-want-to-spend-more-time-on-those-areas-most-relevant-to-them","title":"Researchers want to spend more time on those areas most relevant to them","text":"
(probably not preprocessing...)
???
With the development of fMRIPrep we understood that researchers don't want to waste their time on preprocessing (except for researchers developing new preprocessing techniques).
--
"},{"location":"assets/bhd2020/presentation/#writing-fmriprep-required-a-team-of-several-experts-in-processing-methods-for-neuroimaging-with-a-solid-base-on-computer-science","title":"Writing fMRIPrep required a team of several experts in processing methods for neuroimaging, with a solid base on Computer Science.","text":"
(research programs just can't cover the neuroscience and the engineering of the whole workflow - we need to divide the labor)
???
The current neuroimaging workflow requires extensive knowledge in sometimes orthogonal fields such as neuroscience and computer science. Dividing the labor among labs, communities or individuals with the necessary expertise is fundamental for the advancement of the whole field.
--
"},{"location":"assets/bhd2020/presentation/#transparency-helps-against-the-risk-of-super-easy-tools","title":"Transparency helps against the risk of super-easy tools","text":"
(easy-to-use tools are risky because they might get a researcher very far with no idea whatsoever of what they've done)
???
There is an implicit risk in making things too easy to operate:
For instance, imagine someone who runs fMRIPrep on diffusion data by tricking the BIDS naming into an apparently functional MRI dataset. If fMRIPrep reached the end at all, the garbage at the output could be fed into further tools, in a sort of a snowballing problem.
When researchers have access to the guts of the software and are given an opportunity to understand what's going on, the risk of misuse dips.
--
"},{"location":"assets/bhd2020/presentation/#established-toolboxes-do-not-have-incentives-for-compatibility","title":"Established toolboxes do not have incentives for compatibility","text":"
(and to some extent this is not necessarily bad, as long as they are kept well-tested and they embrace/help-develop some minimal standards)
???
AFNI, ANTs, FSL, FreeSurfer, SPM, etc. have comprehensive software validation tests, methodological validation tests, stress tests, etc. - which pushed up their quality and made them fundamental for the field.
Therefore, it is better to keep things that way (although some minimal efforts towards convergence in compatibility are of course welcome)
Joseph, M.; Pisner, D.; Richie-Halford, A.; Lerma-Usabiaga, G.; Keshavan, A.; Kent, JD.; Veraart, J.; Cieslak, M.; Poldrack, RA.; Rokem, A.; Esteban, O.
template: newsection layout: false
.middle.center[
"},{"location":"assets/bhd2020/presentation/#understanding-what-preprocessing-is-with-visual-reports","title":"Understanding what preprocessing is with visual reports","text":"
Let's walk through one example report. Reports have several sections, starting with a summary indicating the particularities of this dataset and workflow choices made based on the input data.
The anatomical section follows with several visualizations to assess the anatomical processing steps mentioned before, spatial normalization to template spaces (the flickering panel helps assess alignment) and finally surface reconstruction.
Then, all functional runs are concatenated, and all show the same structure. After an initial summary of this particular run, the alignment to the same subject's anatomical image is presented, with contours of the white and pial surfaces as cues. Next panel shows the brain mask and ROIs utilized by the CompCor denoising. For each run we then find some visualizations to assess the generated confounding signals.
After all functional runs are presented, the About section keeps information to aid reproducibility of results, such as the software's version, or the exact command line run.
The boilerplate is found next, with a text version shown by default and tabs to convert to Markdown and LaTeX.
Reports conclude with a list of encountered errors (if any).
"},{"location":"assets/bhd2020/presentation/#reports-are-a-crucial-element-to-ensure-transparency","title":"Reports are a crucial element to ensure transparency","text":"
.pull-left[
]
.pull-right[
.distribute[ fMRIPrep generates one participant-wide report after execution.
Reports describe the data as found, and the steps applied (providing .blue[visual support to look inside the box]):
show researchers their data;
show how fMRIPrep interpreted the data (describing the actual preprocessing steps);
quality control of results, facilitating early error detection. ] ]
???
Therefore, reports have become a fundamental feature of fMRIPrep because they not only allow assessing the quality of the processing, but also provide an insight about the logic supporting such processing.
In other words, reports help respond to the what was done and the why was it done in addition to the how well it did.
The enormous success of fMRIPrep led us to propose its generalization to other MRI and non-MRI modalities, as well as nonhuman species (for instance, rodents), and particular populations currently unsupported by fMRIPrep such as infants.
"},{"location":"assets/bhd2020/presentation/#augmenting-scanners-to-produce-analysis-grade-data","title":"Augmenting scanners to produce \"analysis-grade\" data","text":""},{"location":"assets/bhd2020/presentation/#data-directly-consumable-by-analyses","title":"(data directly consumable by analyses)","text":"
.pull-left[
Analysis-grade data is an analogy to the concept of \"sushi-grade (or sashimi-grade) fish\" in that both are:
.large[minimally preprocessed,]
and
.large[safe to consume directly.] ]
.pull-right[ ]
???
The goal, therefore, of NiPreps is to extend the scanner so that, in a way, they produce data ready for analysis.
We liken these analysis-grade data to sushi-grade fish, because in both cases the product is minimally preprocessed and at the same time safe to consume as is.
For the last two years we've been decomposing the architecture of fMRIPrep, spinning off its constituent parts that are valuable in other applications.
This process of decoupling (to use a proper CS term) has been greatly facilitated by the modular nature of the code since its inception.
???
The processing elements extracted from fMRIPrep can be mapped to three regimes of responsibility:
Software infrastructure composed by tools ensuring the collaboration and the most basic tooling.
Middleware utilities, which build more advanced tooling based on the foundational infrastructure
And at the top of the stack end-user applications - namely fMRIPrep, dMRIPrep, sMRIPrep and MRIQC.
As we can see, the boundaries of these three architectural layers are soft and tools such as TemplateFlow may stand in between.
Only projects enclosed in the brain shape pertain to the NiPreps community. NiPype, NiBabel and BIDS are so deeply embedded as dependencies that NiPreps can't be understood without them.
BIDS provides a standard, guaranteeing I/O agreements:
Allows workflows to self-adapt to the inputs
Ensures the shareability of the results
PyBIDS: a Python tool to query BIDS datasets (Yarkoni et al., 2019):
>>> from bids import BIDSLayout\n\n# Point PyBIDS to the dataset's path\n>>> layout = BIDSLayout(\"/data/coolproject\")\n\n# List the participant IDs of present subjects\n>>> layout.get_subjects()\n['01', '02', '03', '04', '05']\n\n# List session identifiers, if present\n>>> layout.get_sessions()\n['01', '02']\n\n# List functional MRI tasks\n>>> layout.get_tasks()\n['rest', 'nback']\n
???
BIDS is one of the keys to success for fMRIPrep and consequently, a strategic element of NiPreps.
Because the tools so far are written in Python, PyBIDS is a powerful tool to index and query inputs and outputs.
The code snippet illustrates the ease to find out the subject identifiers available in the dataset, sessions, and tasks.
All NiPreps must write out BIDS-Derivatives. As illustrated in the example, the outputs of fMRIPrep are very similar to the BIDS standard for acquired data.
All end-user applications in NiPreps must conform to the BIDS-Apps specifications.
The BIDS-Apps paper identified a common pattern in neuroimaging studies, where individual participants (and runs) are processed first individually, and then based on the outcomes, further levels of data aggregation are executed.
For this reason, BIDS-Apps define two major levels of execution: participant and group level.
Finally, the paper also stresses the importance of containerizing applications to ensure long-term preservation of run-to-run repeatability and proposes a common command line interface as described at the bottom:
first the name of the BIDS-Apps (fmriprep, in this case)
followed by input and output directories (respectively),
to finally indicate the analysis level (always participant, for the case of fmriprep)
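The positional pattern listed above can be sketched with Python's standard argparse module. This is a minimal illustration, not fMRIPrep's actual parser: the program name my_bids_app is hypothetical, and the optional --participant_label flag is included on the assumption that the app follows the usual BIDS-Apps convention.

```python
import argparse


def build_parser():
    '''Build a parser following the BIDS-Apps positional pattern.'''
    # 'my_bids_app' is a hypothetical name used only for illustration
    parser = argparse.ArgumentParser(prog='my_bids_app')
    parser.add_argument('bids_dir', help='root folder of the input BIDS dataset')
    parser.add_argument('output_dir', help='folder where derivatives are written')
    parser.add_argument(
        'analysis_level',
        choices=['participant', 'group'],
        help='level of execution (participant or group)',
    )
    # Optional: restrict the run to a subset of participants
    parser.add_argument('--participant_label', nargs='+', default=None)
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration
args = build_parser().parse_args(
    ['/data/coolproject', '/out', 'participant', '--participant_label', '01', '02']
)
print(args.bids_dir, args.output_dir, args.analysis_level, args.participant_label)
```

Keeping the three positional arguments in a fixed order is what lets the same wrapper scripts launch different BIDS-Apps interchangeably.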
.pull-left[
from nipype.interfaces.fsl import BET\nbrain_extract = BET(\n in_file=\"/data/coolproject/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii\",\n out_file=\"/out/sub-01/ses-01/anat/sub-01_ses-01_desc-brain_T1w.nii\"\n)\nbrain_extract.run()\n
Nipype is the gateway to mix-and-match from AFNI, ANTs, Dipy, FreeSurfer, FSL, MRTrix, SPM, etc. ]
.pull-right[
]
???
Nipype is the glue stitching together all the underlying neuroimaging toolboxes and provides the execution framework.
The snippet shows how the widely known BET tool from FSL can be executed using NiPype. This is a particular example instance of interfaces - which provide uniform access to the tooling with Python.
Finally, combining these interfaces we generate processing workflows to fulfill higher level processing tasks.
???
For instance, we may have a look into fMRIPrep's functional processing block.
Nipype helps understand the workflow (and opens windows in the black box) by generating these graph representations.
\"\"\"Fix the affine of a rodent dataset, imposing 0.2x0.2x0.2 [mm].\"\"\"\nimport numpy as np\nimport nibabel as nb\n\n# Open the file\nimg = nb.load(\"sub-25_MGE_MouseBrain_3D_MGE_150.nii.gz\")\n\n# New (correct) affine\naff = np.diag((-0.2, -0.2, 0.2, 1.0))\n\n# Use nibabel to reorient to canonical\ncard = nb.as_closest_canonical(nb.Nifti1Image(\n img.dataobj,\n np.diag((-0.2, -0.2, 0.2, 1.0)),\n None\n))\n\n# Save to disk\ncard.to_filename(\"sub-25_T2star.nii.gz\")\n
???
NiBabel allows Python to easily access neuroimaging data formats such as NIfTI, GIFTI and CIFTI2.
Although this might be a trivial task, the proliferation of neuroimaging software has led to some sort of Wild West of formats, and sometimes interoperation is not ensured.
"},{"location":"assets/bhd2020/presentation/#in-the-snippet-we-can-see-how-we-can-manipulate-the-orientation-headers-of-a-nifti-volume-in-particular-a-rodent-image-with-incorrect-affine-information","title":"In the snippet, we can see how we can manipulate the orientation headers of a NIfTI volume, in particular a rodent image with incorrect affine information.","text":"
.pull-left[
Transforms typically are the outcome of image registration methodologies
The proliferation of software implementations of image registration methodologies has resulted in a spread of data structures and file formats used to preserve and communicate transforms.
(Esteban et al., 2020) ]
.pull-right[
]
???
NiTransforms is a super-interesting toy project where we are exercising our finest coding skills. It complements NiBabel in the effort of making spatial transforms calculated by neuroimaging software tools interoperable.
When it goes beyond the alpha state, it is expected to be merged into NiBabel.
At the moment, NiTransforms is already integrated in fMRIPrep 20.1 and later to concatenate LTA (linear affine) transforms obtained with FreeSurfer, ITK transforms obtained with ANTs, and motion parameters estimated with FSL.
Compatibility across formats is hard due to the many arbitrary decisions in establishing the mathematical framework of the transform and the intrinsic confusion of applying a transform.
While intuitively we understand applying a transform as \"transforming the moving image so that I can represent it overlaid or fused with the reference image and both should look aligned\", in reality, we only transform coordinates from the reference image into the moving image's space (step 1 on the right).
Once we know where the center of every voxel of the reference image falls in the moving image coordinate system, we read in the information (in other words, a value) from the moving image. Because the location will probably be off-grid, we interpolate such a value from the neighboring voxels (step 2).
Finally (step 3) we generate a new image object with the structure of the reference image and the data interpolated from the moving information. This new image object is the moving image \"moved\" on to the reference image space and thus, both look aligned.
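The three steps just described can be sketched in a few lines. This is a deliberately simplified, dependency-free illustration on 2D lists rather than NiTransforms' implementation: the function names resample and bilinear are hypothetical, the affine is assumed to map reference indices into moving-image indices, and locations falling outside the moving image are filled with zeros.

```python
import math


def bilinear(img, x, y, fill=0.0):
    '''Interpolate img at the off-grid location (x, y).'''
    n_rows, n_cols = len(img), len(img[0])
    if not (0 <= x <= n_rows - 1 and 0 <= y <= n_cols - 1):
        return fill  # outside the moving image's field of view
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, n_rows - 1), min(y0 + 1, n_cols - 1)
    dx, dy = x - x0, y - y0
    top = img[x0][y0] * (1 - dy) + img[x0][y1] * dy
    bottom = img[x1][y0] * (1 - dy) + img[x1][y1] * dy
    return top * (1 - dx) + bottom * dx


def resample(moving, matrix, ref_shape):
    '''Resample moving onto a reference grid of shape ref_shape.

    matrix is a 3x3 homogeneous affine mapping reference (row, column)
    indices into the index space of the moving image.
    '''
    out = []
    for i in range(ref_shape[0]):
        row = []
        for j in range(ref_shape[1]):
            # Step 1: map the reference coordinate into the moving space
            x = matrix[0][0] * i + matrix[0][1] * j + matrix[0][2]
            y = matrix[1][0] * i + matrix[1][1] * j + matrix[1][2]
            # Step 2: interpolate a value from the neighboring samples
            row.append(bilinear(moving, x, y))
        out.append(row)
    # Step 3: the output has the structure of the reference grid
    return out


moving = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
# Shift of half a sample along the column axis
shift = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]]
moved = resample(moving, shift, (3, 3))
print(moved)
```

Note that the loop runs over the reference grid, never over the moving image: this is why the transform is defined from reference to moving coordinates.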
.pull-left[
The Archive (right) is a repository of templates and atlases
The Python Client (bottom) provides easy access (with lazy-loading) to the Archive
>>> from templateflow import api as tflow\n>>> tflow.get(\n... 'MNI152NLin6Asym',\n... desc=None,\n... resolution=1,\n... suffix='T1w',\n... extension='nii.gz'\n... )\nPosixPath('/templateflow_home/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_T1w.nii.gz')\n
.large[www.templateflow.org] ]
.pull-right[
]
???
One of the oldest feature requests received from fMRIPrep early adopters was improving the flexibility of spatial normalization to standard templates other than fMRIPrep's default.
For instance, infant templates.
TemplateFlow offers an Archive of templates where they are stored, maintained and re-distributed;
and a Python client that helps accessing them.
On the right-hand side, a screenshot of the TemplateFlow browser shows some of the templates currently available in the repository. The browser can be reached at www.templateflow.org.
The tool is based on PyBIDS, and the snippet will surely remind you of it. In this case the example shows how to obtain the T1w template corresponding to FSL's MNI space, at the highest resolution.
If the files requested are not in TemplateFlow's cache, they will be pulled down and kept for further utilization.
The Archive allows a rich range of data and metadata to be stored with the template.
Datatypes in the repository cover:
images containing population-average templates,
masks (for instance brain masks),
atlases (including parcellations and segmentations)
transform files between templates
Metadata can be stored with the usual BIDS options.
Finally, templates allow having multiple cohorts, in a similar encoding to that of multi-session BIDS datasets.
Multiple cohorts are useful, for instance, in infant templates with averages at several gestational ages.
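The BIDS-like naming of templates can be illustrated by splitting a file name into its key-value entities. This is a rough sketch using only the standard library, not how TemplateFlow or PyBIDS index files; the helper parse_entities is hypothetical, and the second file name (with a cohort entity) is made up for illustration.

```python
def parse_entities(filename):
    '''Split a BIDS-like file name into entities, suffix and extension.'''
    stem, _, extension = filename.partition('.')
    # Everything but the last underscore-separated chunk is a key-value pair
    *pairs, suffix = stem.split('_')
    entities = dict(pair.split('-', 1) for pair in pairs)
    return entities, suffix, '.' + extension


# File name taken from the TemplateFlow snippet shown earlier
print(parse_entities('tpl-MNI152NLin6Asym_res-01_T1w.nii.gz'))

# Hypothetical file name showing how a cohort could be encoded
cohort_entities, _, _ = parse_entities('tpl-Example_cohort-2_res-1_T1w.nii.gz')
print(cohort_entities['cohort'])
```

Because the cohort travels as one more entity, querying a multi-cohort template looks much like querying a multi-session BIDS dataset.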
NiWorkflows is a miscellaneous mixture of tooling used by downstream NiPreps:
???
NiWorkflows is, historically, the first component detached from fMRIPrep.
For that reason, its scope and vision have very fuzzy boundaries as compared to the other tools.
The most relevant utilities incorporated within NiWorkflows are:
--
The reportlet aggregation and individual report generation system
???
First, the individual report system which aggregates the visual elements or the reports (which we call \"reportlets\") and generates the final HTML document.
Also, most of the engineering behind the generation of these reportlets and their integration within NiPype are part of NiWorkflows
--
Custom extensions to NiPype interfaces
???
Beyond the extension of NiPype to generate a reportlet from any given interface, NiWorkflows is the test bed for many utilities that are then upstreamed to nipype.
Also, special interfaces with a limited scope that should not be included in nipype are maintained here.
--
Workflows useful across applications
???
Finally, NiWorkflows indeed offers workflows that can be used by end-user NiPreps. For instance atlas-based brain extraction of anatomical images, based on ANTs.
???
Echo-planar imaging (EPI) data are typically affected by distortions along the phase encoding axis, caused by the perturbation of the magnetic field at tissue interfaces.
Looking at the reportlet, we can see how in the \"before\" panel, the image is warped.
The distortion is most obvious in the coronal view (middle row) because this image has posterior-anterior phase encoding.
Focusing on the changes between \"before\" and \"after\" correction in this coronal view, we can see how the blue contours delineating the corpus callosum better fit the dark shade in the data after correction.
"},{"location":"assets/bhd2020/presentation/#upcoming-new-utilities","title":"Upcoming new utilities","text":""},{"location":"assets/bhd2020/presentation/#nibabies-fmriprep-babies","title":"NiBabies | fMRIPrep-babies","text":"
NiBabies is some sort of NiWorkflows equivalent for the preprocessing of infant imaging. At the moment, only atlas-based brain extraction using ANTs (adapted from NiWorkflows) is in active development.
Next steps include brain tissue segmentation.
Similarly, NiRodents is the NiWorkflows parallel for the preprocessing of rodent preclinical imaging. Again, only atlas-based brain extraction adapted from NiWorkflows is being developed.
"},{"location":"assets/bhd2020/presentation/#nipreps-is-a-framework-for-the-development-of-preprocessing-workflows","title":"NiPreps is a framework for the development of preprocessing workflows","text":"
Principled design, with BIDS as a strategic component
Leveraging existing, widely used software
Using NiPype as a foundation
???
To wrap-up, I've presented NiPreps, a framework for developing preprocessing workflows inspired by fMRIPrep.
The framework is heavily principled and relies on BIDS as a foundational component.
NiPreps should not reinvent any wheel, trying to reuse as much as possible of the widely used and tested existing software.
Nipype serves as the glue component to orchestrate workflows.
We propose to consider preprocessing as part of the image acquisition and reconstruction
When setting the boundaries that way, it seems sensible to pursue some standardization in the preprocessing:
Less experimental degrees of freedom for the researcher
Researchers can focus on the analysis
More homogeneous data at the output (e.g., for machine learning)
How:
Transparency is key to success: individual reports and documentation (open source is implicit).
Best engineering practices (e.g., containers and CI/CD)
???
But why just preprocessing, with a very strict scope?
We propose to think about preprocessing as part of the image acquisition and reconstruction process (in other words, scanning), rather than part of the analysis workflow.
This decoupling from analysis comes with several upshots:
First, there are fewer moving parts to play with for researchers in the attempt to fit their methods to the data (instead of fitting data with their methods).
Second, such division of labor allows the researcher to use their time in the analysis.
Finally, two preprocessed datasets from two different studies and scanning sites should be more homogeneous when processed with the same instruments, in comparison to processing them with idiosyncratic, lab-managed, preprocessing workflows.
However, for NiPreps to work we need to make sure the tools are transparent.
Transparency comes not just from the individual reports and thorough documentation, but also from community-driven development. For instance, the peer-review process that goes around large incremental changes is fundamental to ensure the quality of the tool.
In addition, best engineering practices suggested in the BIDS-Apps paper, along with those we have been including with fMRIPrep, are necessary to ensure the quality of the final product.
As an open problem, validating the results of the tool remains extremely challenging because of the lack of gold-standard datasets that can tell us the best possible outcome.
template: newsection layout: false
.middle.center[
"},{"location":"assets/bhd2020/presentation/#where-to-start","title":"Where to start?","text":""},{"location":"assets/bhd2020/presentation/#wwwniprepsorg_1","title":"www.nipreps.org","text":""},{"location":"assets/bhd2020/presentation/#githubcomnipreps","title":"github.com/nipreps","text":"
"},{"location":"assets/torw2020/presentation/#im-going-to-talk-about-how-we-are-building-a-framework-of-preprocessing-pipelines-for-neuroimaging-called-nipreps-based-on-the-fmriprep-experience","title":"I'm going to talk about how we are building a framework of preprocessing pipelines for neuroimaging called NiPreps, based on the fMRIPrep experience.","text":"
"},{"location":"assets/torw2020/presentation/#fmriprep-produces-analysis-ready-data-from-acquired-fmri-data","title":"fMRIPrep produces analysis-ready data from acquired (fMRI) data","text":"
minimal requirements (BIDS-compliant);
agnostic to downstream steps of the workflow
produces BIDS-Derivatives;
???
fMRIPrep takes in a task-based or resting-state functional MRI dataset in BIDS-format and returns preprocessed data ready for analysis.
Preprocessed data can be used for a broad range of analyses, and they are formatted following BIDS-Derivatives to maximize compatibility with: major software packages (AFNI, FSL, SPM, etc.); further temporal filtering and denoising (fMRIDenoise); and any BIDS-Derivatives compliant tool (e.g., FitLins).
--
"},{"location":"assets/torw2020/presentation/#fmriprep-is-a-bids-app-gorgolewski-et-al-2017","title":"fMRIPrep is a BIDS-App (Gorgolewski, et al. 2017)","text":"
adheres to modern software-engineering standards (CI/CD, containers)
compatible interface with other BIDS-Apps
optimized for automatic execution
???
fMRIPrep adopts the BIDS-App specifications. That means the software is tested with every change to the codebase, it also means that packaging, containerization, and deployment are also automated and require tests to be passing. BIDS-Apps are inter-operable (via BIDS-Derivatives), and optimized for execution in HPC, Cloud, etc.
--
"},{"location":"assets/torw2020/presentation/#minimizes-human-intervention","title":"Minimizes human intervention","text":"
avoids error-prone parameter settings (reads them from BIDS)
adapts the workflow to the actual data available
while remaining flexible to some design choices (e.g., whether or not to reconstruct surfaces, or customizing the target standard spaces for normalization)
???
fMRIPrep minimizes human intervention because the user does not need to fiddle with any parameters - they are obtained from the BIDS structure. However, fMRIPrep does allow some flexibility to ensure the preprocessing meets the requirements of the intended analyses.
--
"},{"location":"assets/torw2020/presentation/#fmriprep-bundles-many-tools-afni-fsl-freesurfer-nilearn-etc","title":"fMRIPrep bundles many tools (AFNI, FSL, FreeSurfer, Nilearn, etc.)","text":"
(do not reinvent the wheel)
???
Finally, fMRIPrep stands on the shoulders of giants: AFNI, FSL, FreeSurfer, Nilearn, etc. all implement well-established methods and are thoroughly tested on their own.
"},{"location":"assets/torw2020/presentation/#we-started-fmriprep-in-february-2016","title":"We started fMRIPrep in February 2016","text":""},{"location":"assets/torw2020/presentation/#objectives","title":"Objectives:","text":"
Develop an fMRI preprocessing tool enforcing BIDS for the inputs
Automatically executable within OpenNeuro
"},{"location":"assets/torw2020/presentation/#initially-inspired-by-hcp-pipelines","title":"Initially inspired by HCP Pipelines","text":"
Problem: robustness vs. the wide variability of inputs
???
We began working on fMRIPrep back in 2016 with much more humble expectations: - We needed to develop an fMRI preprocessing tool leveraging BIDS - smart enough to adapt the workflow for the input dataset, - and the tool should be executable in OpenNeuro without human intervention.
Please note that at the time, the BIDS-Apps specification didn't exist yet.
We started out with an eye on HCP Pipelines, and soon identified that datasets in OpenNeuro varied extremely in terms of acquisition protocols and imaging parameters. Such variability is not a problem for HCP Pipelines, which has very specific requirements for the inputs.
"},{"location":"assets/torw2020/presentation/#fmriprep-adoption-and-popularization-brought-new-challenges","title":"fMRIPrep adoption and popularization brought new challenges","text":"
.pull-right[
]
???
With the fast adoption and popularization of fMRIPrep, new challenges surfaced.
On the right-hand side, you'll find the chart of unique visitors to fmriprep.org, which is the documentation website.
--
.pull-left[
"},{"location":"assets/torw2020/presentation/#transparency-was-addressed-with","title":"Transparency was addressed with:","text":"
the individual reports;
the thorough documentation; and
the citation boilerplate. ]
???
We realized that transparency is indeed a very hard problem. The first leg of our solution was the creation of a solid report system. fMRIPrep generates one individual report per participant, containing information not just to quality control the results, but also to understand the processing flow.
We also strived for a comprehensive, thorough documentation.
Finally, the so-called citation boilerplate appended to the individual reports describes the actual workflow that has been run, noting all the software that was applied, including versions and references.
--
.pull-left[
"},{"location":"assets/torw2020/presentation/#run-to-run-repeatability-is-an-open-issue","title":"Run-to-run repeatability is an open issue:","text":"
Reproducibility in terms of run-to-run repeatability of results became a more apparent problem, and we are always trying to minimize the vibration caused by computational factors, software versions, etc.
massive amounts of bug reports, questioning the robustness
organic emergence of fMRIPrep enthusiasts (thanks to E. DuPre, JD. Kent) ]
???
We always maintained close attention to all the feedback channels. At some point we were washed over with bug reports that we needed to address. We also started to doubt the robustness against the variability of inputs, and set a thorough stress-test plan using data from OpenNeuro (reported in our Nat Meth paper). Among this feedback flooding, some external friends started to emerge and lent their shoulders in answering questions, fixing bugs, etc.
In particular, I want to thank Elizabeth DuPre (McGill) and James Kent (Univ. of Iowa) for being the earliest adopters and contributors.
"},{"location":"assets/torw2020/presentation/#fmriprep-is-stable-today-although-unfinished","title":"fMRIPrep is stable today, although unfinished","text":"
(Esteban et al., 2019)
???
These developments resulted in the following default processing workflow.
At the highest level, anatomical preprocessing (left-hand block) and functional preprocessing (right-hand block) can be clearly identified as the largest workflow units.
fMRIPrep combines all the anatomical images at the input in one anatomical reference, removes the intensity non-uniformity, delineates brain tissues, reconstructs surfaces, spatially normalizes the anatomical reference to one or more standard spaces.
On the functional pathway, a reference is calculated for further processes, then head-motion parameters are estimated (please note head-motion is accounted for in the last resampling step, in combination with other transforms), slice-timing correction is applied if requested.
Then, susceptibility distortion is estimated, if sufficient information (in terms of acquisition and metadata) is found in the BIDS structure.
Finally, data are mapped to the same individual's anatomical reference and outputs in the several output spaces requested are generated, along with a file gathering time-series of nuisance signals.
Let's walk through one example report. Reports have several sections, starting with a summary indicating the particularities of this dataset and workflow choices made based on the input data.
The anatomical section follows with several visualizations to assess the anatomical processing steps mentioned before, spatial normalization to template spaces (the flickering panel helps assess alignment) and finally surface reconstruction.
Then, all functional runs are concatenated, and all show the same structure. After an initial summary of this particular run, the alignment to the same subject's anatomical image is presented, with contours of the white and pial surfaces as cues. Next panel shows the brain mask and ROIs utilized by the CompCor denoising. For each run we then find some visualizations to assess the generated confounding signals.
After all functional runs are presented, the About section keeps information to aid reproducibility of results, such as the software's version, or the exact command line run.
The boilerplate is found next, with a text version shown by default and tabs to convert to Markdown and LaTeX.
Reports conclude with a list of encountered errors (if any).
"},{"location":"assets/torw2020/presentation/#reports-are-a-crucial-element-to-ensure-transparency","title":"Reports are a crucial element to ensure transparency","text":"
.pull-left[
]
.pull-right[
.distribute[ fMRIPrep generates one participant-wide report after execution.
Reports describe the data as found, and the steps applied (providing .blue[visual support to look inside the box]):
show researchers their data;
show how fMRIPrep interpreted the data (describing the actual preprocessing steps);
quality control of results, facilitating early error detection. ] ]
???
Therefore, reports have become a fundamental feature of fMRIPrep because they not only allow assessing the quality of the processing, but also provide an insight about the logic supporting such processing.
In other words, reports help respond to the what was done and the why was it done in addition to the how well it did.
"},{"location":"assets/torw2020/presentation/#documentation-as-a-second-leg-of-transparency-fmripreporg","title":"Documentation as a second leg of transparency (fmriprep.org)","text":"
Hackathons & docu-sprints
the CompCor documentation example
.large[fmriprep.org]
???
We promptly identified the need for a very comprehensive documentation. The website at fmriprep.org covers a substantial area of how the tool works under the hood and how to best operate it.
The documentation turned out to be a great ice breaker for contributors, who have pushed forward fundamental sections of it.
Most of the largest increments in documentation are the result of discussions in hackathons, docusprints, neurostars, github, etc. A hallmark example was pull request 1877 by Karolina Finc, who gathered together a massive amount of knowledge from many contributors. Now this is up and open in our documentation website.
"},{"location":"assets/torw2020/presentation/#fmriprep-is-more-of-a-community-driven-project-every-day","title":"fMRIPrep is more of a community-driven project every day","text":"
Bug-fixes: we ensured that open feedback channels were attended (GitHub, NeuroStars, mailing list, etc.);
users began also proposing new features (some including code!);
with NiPreps we are working towards handing the project over to the community.
???
To ensure the future sustainability of the project (what some developers call Bus factor), we are transitioning the tool to NiPreps, transferring the large community nurtured over the past four years with it.
--
"},{"location":"assets/torw2020/presentation/#how-does-fmriprep-compensate-its-contributors","title":"How does fMRIPrep compensate its contributors?","text":"
Contributors are invited to coauthor relevant publications about fMRIPrep.
Anyone who helps with documentation, code or relevant discussions is a contributor.
.pull-left[
]
.pull-right[
]
???
In return, beyond the rewards of being part of an open source project, fMRIPrep gives some scientific credit back in the form of publications.
All contributors are invited to coauthor these publications.
Anything that helps the project is considered a sufficient contribution.
"},{"location":"assets/torw2020/presentation/#lessons-learned","title":"Lessons learned","text":""},{"location":"assets/torw2020/presentation/#researchers-want-to-spend-more-time-on-those-areas-most-relevant-to-them","title":"Researchers want to spend more time on those areas most relevant to them","text":"
(probably not preprocessing...)
???
With the development of fMRIPrep we understood that researchers don't want to waste their time on preprocessing (except for researchers developing new preprocessing techniques).
--
"},{"location":"assets/torw2020/presentation/#writing-fmriprep-required-a-team-of-several-experts-in-processing-methods-for-neuroimaging-with-a-solid-base-on-computer-science","title":"Writing fMRIPrep required a team of several experts in processing methods for neuroimaging, with a solid base on Computer Science.","text":"
(research programs just can't cover the neuroscience and the engineering of the whole workflow - we need to divide the labor)
???
The current neuroimaging workflow requires extensive knowledge in sometimes orthogonal fields such as neuroscience and computer science. Dividing the labor among labs, communities or individuals with the necessary expertise is fundamental for the advancement of the whole field.
--
"},{"location":"assets/torw2020/presentation/#transparency-helps-against-the-risk-of-super-easy-tools","title":"Transparency helps against the risk of super-easy tools","text":"
(easy-to-use tools are risky because they might get a researcher very far with no idea whatsoever of what they've done)
???
There is an implicit risk in making things too easy to operate:
For instance, imagine someone who runs fMRIPrep on diffusion data by tricking the BIDS naming into an apparently functional MRI dataset. If fMRIPrep reached the end at all, the garbage at the output could be fed into further tools, in a sort of snowballing problem.
When researchers have access to the guts of the software and are given an opportunity to understand what's going on, the risk of misuse dips.
--
"},{"location":"assets/torw2020/presentation/#established-toolboxes-do-not-have-incentives-for-compatibility","title":"Established toolboxes do not have incentives for compatibility","text":"
(and to some extent this is not necessarily bad, as long as they are kept well-tested and they embrace/help-develop some minimal standards)
???
AFNI, ANTs, FSL, FreeSurfer, SPM, etc. have comprehensive software validation tests, methodological validation tests, stress tests, etc. - which pushed up their quality and made them fundamental for the field.
Therefore, it is better to keep things that way (although some minimal efforts towards convergence in compatibility are of course welcome).
The enormous success of fMRIPrep led us to propose its generalization to other MRI and non-MRI modalities, as well as nonhuman species (for instance, rodents), and particular populations currently unsupported by fMRIPrep such as infants.
"},{"location":"assets/torw2020/presentation/#augmenting-scanners-to-produce-analysis-grade-data","title":"Augmenting scanners to produce \"analysis-grade\" data","text":""},{"location":"assets/torw2020/presentation/#data-directly-consumable-by-analyses","title":"(data directly consumable by analyses)","text":"
.pull-left[
Analysis-grade data is an analogy to the concept of \"sushi-grade (or sashimi-grade) fish\" in that both are:
.large[minimally preprocessed,]
and
.large[safe to consume directly.] ]
.pull-right[ ]
???
The goal of NiPreps, therefore, is to extend scanners so that, in a way, they produce data ready for analysis.
We liken these analysis-grade data to sushi-grade fish, because in both cases the product is minimally preprocessed and at the same time safe to consume as is.
For the last two years we've been decomposing the architecture of fMRIPrep, spinning off its constituent parts that are valuable in other applications.
This process of decoupling (to use a proper CS term) has been greatly facilitated by the modular nature of the code since its inception.
???
The processing elements extracted from fMRIPrep can be mapped to three regimes of responsibility:
Software infrastructure, composed of tools ensuring collaboration and providing the most basic tooling.
Middleware utilities, which build more advanced tooling on top of the foundational infrastructure.
And, at the top of the stack, end-user applications - namely fMRIPrep, dMRIPrep, sMRIPrep and MRIQC.
As we can see, the boundaries of these three architectural layers are soft and tools such as TemplateFlow may stand in between.
Only projects enclosed in the brain shape pertain to the NiPreps community. NiPype, NiBabel and BIDS are so deeply embedded as dependencies that NiPreps can't be understood without them.
BIDS provides a standard, guaranteeing I/O agreements:
Allows workflows to self-adapt to the inputs
Ensures the shareability of the results
PyBIDS: a Python tool to query BIDS datasets (Yarkoni et al., 2019):
>>> from bids import BIDSLayout\n\n# Point PyBIDS to the dataset's path\n>>> layout = BIDSLayout(\"/data/coolproject\")\n\n# List the participant IDs of present subjects\n>>> layout.get_subjects()\n['01', '02', '03', '04', '05']\n\n# List session identifiers, if present\n>>> layout.get_sessions()\n['01', '02']\n\n# List functional MRI tasks\n>>> layout.get_tasks()\n['rest', 'nback']\n
???
BIDS is one of the keys to success for fMRIPrep and consequently, a strategic element of NiPreps.
Because the tools so far are written in Python, PyBIDS is a powerful tool to index and query inputs and outputs.
The code snippet illustrates how easy it is to find the subject identifiers, sessions, and tasks available in the dataset.
All NiPreps must write out BIDS-Derivatives. As illustrated in the example, the outputs of fMRIPrep are very similar to the BIDS standard for acquired data.
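To make the similarity between raw and derivative naming concrete, here is a minimal sketch that splits a BIDS-style filename into its key-value entities. The helper `parse_entities` is hypothetical (it is not part of PyBIDS or fMRIPrep), and the two filenames are the ones used in the Nipype example of this talk. It only assumes the usual `key-value` pairs separated by underscores, followed by a suffix.

```python
def parse_entities(filename):
    """Split a BIDS-like filename into its entities (hypothetical helper)."""
    # Drop every extension (".nii", ".nii.gz", ...)
    stem = filename.split(".", 1)[0]
    # All underscore-separated chunks but the last are key-value pairs
    *pairs, suffix = stem.split("_")
    entities = dict(pair.split("-", 1) for pair in pairs)
    entities["suffix"] = suffix
    return entities


raw = parse_entities("sub-01_ses-01_T1w.nii")
derived = parse_entities("sub-01_ses-01_desc-brain_T1w.nii")

# Entities that only the derivative carries
extra = set(derived) - set(raw)
```

In this toy comparison the derivative differs from the acquired file only by the additional `desc` entity. For real datasets, PyBIDS remains the appropriate tool for indexing and querying.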
All end-user applications in NiPreps must conform to the BIDS-Apps specifications.
The BIDS-Apps paper identified a common pattern in neuroimaging studies, where individual participants (and runs) are processed first individually, and then based on the outcomes, further levels of data aggregation are executed.
For this reason, BIDS-Apps define two major levels of execution: participant and group level.
Finally, the paper also stresses the importance of containerizing applications to ensure long-term preservation of run-to-run repeatability and proposes a common command line interface as described at the bottom:
first the name of the BIDS-App (fmriprep, in this case)
followed by input and output directories (respectively),
to finally indicate the analysis level (always participant, for the case of fmriprep)
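The positional pattern just described can be sketched with Python's standard `argparse` module. This is an illustration of the convention only, not fMRIPrep's actual parser: the function name `build_parser`, the argument names, and the optional `--participant-label` filter as written here are assumptions made for the example.

```python
import argparse


def build_parser(prog="fmriprep"):
    """Build a parser following the BIDS-Apps positional convention."""
    parser = argparse.ArgumentParser(prog=prog)
    # 1. Input: root folder of the BIDS dataset
    parser.add_argument("bids_dir", help="root of the input BIDS dataset")
    # 2. Output: folder where derivatives will be written
    parser.add_argument("output_dir", help="where results are stored")
    # 3. Analysis level: BIDS-Apps define participant and group levels
    parser.add_argument("analysis_level", choices=["participant", "group"])
    # Illustrative optional filter to process a subset of participants
    parser.add_argument("--participant-label", nargs="+", default=None)
    return parser


args = build_parser().parse_args(
    ["/data/coolproject", "/out", "participant", "--participant-label", "01", "02"]
)
```

Because the three positional arguments are shared by all BIDS-Apps, the same invocation pattern carries over from one application to another.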
.pull-left[
from nipype.interfaces.fsl import BET\nbrain_extract = BET(\n in_file=\"/data/coolproject/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii\",\n out_file=\"/out/sub-01/ses-01/anat/sub-01_ses-01_desc-brain_T1w.nii\"\n)\nbrain_extract.run()\n
Nipype is the gateway to mix-and-match from AFNI, ANTs, Dipy, FreeSurfer, FSL, MRTrix, SPM, etc. ]
.pull-right[
]
???
Nipype is the glue stitching together all the underlying neuroimaging toolboxes and provides the execution framework.
The snippet shows how the widely known BET tool from FSL can be executed using NiPype. This is a particular example instance of interfaces - which provide uniform access to the tooling with Python.
Finally, combining these interfaces we generate processing workflows to fulfill higher level processing tasks.
???
For instance, we may have a look into fMRIPrep's functional processing block.
Nipype helps understand the workflow (and opens windows into the black box) by generating graph representations such as this one.
\"\"\"Fix the affine of a rodent dataset, imposing 0.2x0.2x0.2 [mm].\"\"\"\nimport numpy as np\nimport nibabel as nb\n\n# Open the file\nimg = nb.load(\"sub-25_MGE_MouseBrain_3D_MGE_150.nii.gz\")\n\n# New (correct) affine\naff = np.diag((-0.2, -0.2, 0.2, 1.0))\n\n# Use nibabel to reorient to canonical\ncard = nb.as_closest_canonical(nb.Nifti1Image(\n img.dataobj,\n aff,\n None\n))\n\n# Save to disk\ncard.to_filename(\"sub-25_T2star.nii.gz\")\n
???
NiBabel allows Python to easily access neuroimaging data formats such as NIfTI, GIFTI and CIFTI2.
Although this might be a trivial task, the proliferation of neuroimaging software has led to some sort of Wild West of formats, and sometimes interoperation is not ensured.
"},{"location":"assets/torw2020/presentation/#in-the-snippet-we-can-see-how-we-can-manipulate-the-orientation-headers-of-a-nifti-volume-in-particular-a-rodent-image-with-incorrect-affine-information","title":"In the snippet, we can see how we can manipulate the orientation headers of a NIfTI volume, in particular a rodent image with incorrect affine information.","text":"
.pull-left[
Transforms typically are the outcome of image registration methodologies
The proliferation of software implementations of image registration methodologies has resulted in a spread of data structures and file formats used to preserve and communicate transforms.
(Esteban et al., 2020) ]
.pull-right[
]
???
NiTransforms is a super-interesting toy project where we are exercising our finest coding skills. It complements NiBabel in the effort of making spatial transforms calculated by neuroimaging software tools interoperable.
When it goes beyond the alpha state, it is expected to be merged into NiBabel.
At the moment, NiTransforms is already integrated in fMRIPrep 20.1 and later to concatenate LTA (linear affine) transforms obtained with FreeSurfer, ITK transforms obtained with ANTs, and motion parameters estimated with FSL.
Compatibility across formats is hard due to the many arbitrary decisions in establishing the mathematical framework of the transform and the intrinsic confusion of applying a transform.
While intuitively we understand applying a transform as \"transforming the moving image so that I can represent it overlaid or fused with the reference image and both should look aligned\", in reality, we only transform coordinates from the reference image into the moving image's space (step 1 on the right).
Once we know where the center of every voxel of the reference image falls in the moving image coordinate system, we read in the information (in other words, a value) from the moving image. Because the location will probably be off-grid, we interpolate such a value from the neighboring voxels (step 2).
Finally (step 3) we generate a new image object with the structure of the reference image and the data interpolated from the moving information. This new image object is the moving image \"moved\" on to the reference image space and thus, both look aligned.
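The three steps can be sketched in a few lines of dependency-free Python. This is a deliberately simplified 2D toy, not the NiTransforms implementation: the function names are hypothetical, the matrix is assumed to map reference voxel indices directly onto moving voxel indices (real images go through world coordinates and each image's affine), and locations falling outside the moving image are assumed to be filled with zeros.

```python
import math


def apply_affine(matrix, point):
    """Map a 2D point through a 3x3 homogeneous matrix."""
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


def bilinear(image, x, y, fill=0.0):
    """Interpolate image[x][y] at an off-grid location from its neighbors."""
    nx, ny = len(image), len(image[0])
    if not (0 <= x <= nx - 1 and 0 <= y <= ny - 1):
        return fill  # outside the moving image's field of view
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, nx - 1), min(y0 + 1, ny - 1)
    wx, wy = x - x0, y - y0
    return (
        image[x0][y0] * (1 - wx) * (1 - wy)
        + image[x1][y0] * wx * (1 - wy)
        + image[x0][y1] * (1 - wx) * wy
        + image[x1][y1] * wx * wy
    )


def resample(moving, ref_shape, ref_to_moving):
    """Resample the moving image onto the grid of the reference image."""
    moved = []
    for i in range(ref_shape[0]):
        row = []
        for j in range(ref_shape[1]):
            # Step 1: map the reference location into the moving image
            x, y = apply_affine(ref_to_moving, (i, j))
            # Step 2: read (interpolate) the moving image's value there
            row.append(bilinear(moving, x, y))
        moved.append(row)
    # Step 3: new data with the structure of the reference grid
    return moved


moving = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
# Translation of half a voxel along the first axis
shift = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
moved = resample(moving, (3, 3), shift)
```

Note that the loop runs over the reference grid and pulls values from the moving image, which is why the transform is defined from reference to moving coordinates rather than the other way around.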
.pull-left[
The Archive (right) is a repository of templates and atlases
The Python Client (bottom) provides easy access (with lazy-loading) to the Archive
>>> from templateflow import api as tflow\n>>> tflow.get(\n... 'MNI152NLin6Asym',\n... desc=None,\n... resolution=1,\n... suffix='T1w',\n... extension='nii.gz'\n... )\nPosixPath('/templateflow_home/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_T1w.nii.gz')\n
.large[www.templateflow.org] ]
.pull-right[
]
???
One of the oldest feature requests received from fMRIPrep early adopters was improving the flexibility of spatial normalization to standard templates other than fMRIPrep's default.
For instance, infant templates.
TemplateFlow offers an Archive of templates where they are stored, maintained and re-distributed;
and a Python client that helps access them.
On the right-hand side, a screenshot of the TemplateFlow browser shows some of the templates currently available in the repository. The browser can be reached at www.templateflow.org.
The tool is based on PyBIDS, and the snippet will surely remind you of it. In this case the example shows how to obtain the T1w template corresponding to FSL's MNI space, at the highest resolution.
If the files requested are not in TemplateFlow's cache, they will be pulled down and kept for further utilization.
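The fetch-on-demand behavior can be pictured with a generic caching sketch. This is not TemplateFlow's code: `get_cached` and `fake_download` are hypothetical names, and the download is replaced by a stand-in that returns placeholder bytes so the example runs offline.

```python
import tempfile
from pathlib import Path


def get_cached(cache_dir, relative_path, download):
    """Return a local path, downloading the file only on a cache miss."""
    target = Path(cache_dir) / relative_path
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(download(relative_path))
    return target


calls = []


def fake_download(relative_path):
    """Stand-in for a network transfer; records every request."""
    calls.append(relative_path)
    return b"placeholder"


with tempfile.TemporaryDirectory() as cache:
    name = "tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_T1w.nii.gz"
    first = get_cached(cache, name, fake_download)   # cache miss: downloads
    second = get_cached(cache, name, fake_download)  # cache hit: no download
    downloads = len(calls)
```

The second request is served from the cache folder, so the stand-in downloader is called only once.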
The Archive allows a rich range of data and metadata to be stored with the template.
Datatypes in the repository cover:
images containing population-average templates,
masks (for instance brain masks),
atlases (including parcellations and segmentations)
transform files between templates
Metadata can be stored with the usual BIDS options.
Finally, templates allow having multiple cohorts, in a similar encoding to that of multi-session BIDS datasets.
Multiple cohorts are useful, for instance, in infant templates with averages at several gestational ages.
NiWorkflows is a miscellaneous mixture of tooling used by downstream NiPreps:
???
NiWorkflows is, historically, the first component detached from fMRIPrep.
For that reason, its scope and vision have very fuzzy boundaries as compared to the other tools.
The most relevant utilities incorporated within NiWorkflows are:
--
The reportlet aggregation and individual report generation system
???
First, the individual report system which aggregates the visual elements or the reports (which we call \"reportlets\") and generates the final HTML document.
Also, most of the engineering behind the generation of these reportlets and their integration within NiPype is part of NiWorkflows.
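As a rough picture of what aggregating reportlets means, the following sketch stitches a list of titled SVG fragments into a single HTML page. It is an assumption-laden toy, not the NiWorkflows report system: `build_report`, the section layout, and the placeholder SVG strings are all made up for illustration.

```python
from html import escape


def build_report(subject, reportlets):
    """Assemble one HTML document from (title, svg_markup) pairs."""
    sections = [
        f"<section><h2>{escape(title)}</h2>{svg}</section>"
        for title, svg in reportlets
    ]
    return (
        f"<html><head><title>sub-{escape(subject)}</title></head>"
        f"<body>{''.join(sections)}</body></html>"
    )


# Placeholder reportlets; real ones would be figures written by the workflow
reportlets = [
    ("Brain mask", "<svg><!-- mask contours --></svg>"),
    ("Susceptibility distortion correction", "<svg><!-- before/after --></svg>"),
]
report = build_report("01", reportlets)
```

Each reportlet keeps its own section, so the final document follows the order in which the visual elements were collected.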
--
Custom extensions to NiPype interfaces
???
Beyond the extension of NiPype to generate a reportlet from any given interface, NiWorkflows is the test bed for many utilities that are then upstreamed to nipype.
Also, special interfaces with a limited scope that should not be included in nipype are maintained here.
--
Workflows useful across applications
???
Finally, NiWorkflows indeed offers workflows that can be used by end-user NiPreps. For instance atlas-based brain extraction of anatomical images, based on ANTs.
???
Echo-planar imaging (EPI) data are typically affected by distortions along the phase encoding axis, caused by the perturbation of the magnetic field at tissue interfaces.
Looking at the reportlet, we can see how in the \"before\" panel, the image is warped.
The distortion is most obvious in the coronal view (middle row) because this image has posterior-anterior phase encoding.
Focusing on the changes between \"before\" and \"after\" correction in this coronal view, we can see how the blue contours delineating the corpus callosum better fit the dark shade in the data after correction.
"},{"location":"assets/torw2020/presentation/#sdcflows-as-integrated-in-fmriprep","title":"SDCFlows, as integrated in fMRIPrep","text":"
REQUIRES (opts. 1 or 2): setting the IntendedFor metadata field of fieldmaps. ]
???
With SDCFlows, fMRIPrep implements a rather sophisticated pipeline for the estimation of susceptibility distortions.
SDCFlows establishes a hierarchy of corrections depending on what the input dataset offers: EPI images with opposed phase-encoding polarities (the so-called PE-Polar correction), fieldmaps (acquired as Gradient Recalled Echo sequences), or neither, in which case a fieldmap-less estimation can be requested.
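One way to picture such a hierarchy is as a function that returns the first applicable strategy. The sketch below is hypothetical and is not SDCFlows code: the function name, the boolean inputs, and in particular the priority order (taken here simply from the order in which the options are mentioned) are assumptions.

```python
def select_sdc_strategy(has_pepolar, has_fieldmap, fieldmapless_requested):
    """Return the first applicable estimation strategy (assumed order)."""
    candidates = [
        ("pepolar", has_pepolar),                   # opposed PE polarities
        ("fieldmap", has_fieldmap),                 # GRE-based fieldmaps
        ("fieldmap-less", fieldmapless_requested),  # only on request
    ]
    for name, applicable in candidates:
        if applicable:
            return name
    return None  # no correction can be estimated


choice = select_sdc_strategy(
    has_pepolar=False, has_fieldmap=True, fieldmapless_requested=True
)
```

With these inputs the fieldmap-based estimation is chosen, because it ranks above the fieldmap-less alternative in the assumed order.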
After correction, we want to check that low-frequency distortions have been accounted for and that high-frequency distortions (with extreme regions suffering severe drop-outs) are not excessively present.
.pull-left[
] .pull-right[
]
???
sMRIPrep corresponds to the split of the anatomical preprocessing workflow originally proposed with fMRIPrep.
With the support of TemplateFlow, the tool now supports spatial normalization to one or more templates found in the TemplateFlow Archive.
It also supports the use of custom templates, as long as they are correctly installed in TemplateFlow's cache folder.
???
dMRIPrep and fMRIPrep are, of course, the tip of the iceberg.
dMRIPrep is still in an alpha state, steadily progressing through the path fMRIPrep has delineated for NiPreps.
Hopefully, at this point of the talk fMRIPrep doesn't need further description.
template: newsection layout: false
.middle.center[
"},{"location":"assets/torw2020/presentation/#other-components-of-nipreps","title":"Other components of NiPreps","text":"
]
???
Some additional components of NiPreps were never part of fMRIPrep's codebase, or they have been started recently.
???
Such is the case of the quality control tools.
MRIQC produces visual reports for the efficient screening of acquired (meaning, unprocessed) data - in particular anatomical and functional MRI of the human brain.
CrowdMRI is an internet service where anonymized quality control metrics are uploaded automatically as they are computed by MRIQC.
The end goal is to gather enough data to describe the normative distribution of these metrics across image parameters and scanning devices and sites.
Finally, MRIQCnets encloses several machine learning projects regarding the quality of acquired images.
"},{"location":"assets/torw2020/presentation/#upcoming-new-utilities","title":"Upcoming new utilities","text":""},{"location":"assets/torw2020/presentation/#nibabies","title":"NiBabies","text":"
Recently started, covering infant MRI brain-extraction for now (Mathias Goncalves)
NiRodents: recently started, covering rodent MRI brain-extraction for now (Eilidh MacNicol)
???
So, what's coming up next?
NiBabies is some sort of NiWorkflows equivalent for the preprocessing of infant imaging. At the moment, only atlas-based brain extraction using ANTs (and adapted from NiWorkflows) is in active development.
Next steps include brain tissue segmentation.
Similarly, NiRodents is the NiWorkflows parallel for the preprocessing of rodent preclinical imaging. Again, only atlas-based brain extraction adapted from NiWorkflows is being developed.
In the mid-term future, both NiBabies and NiRodents should allow the extension of fMRIPrep to these two new idiosyncratic data families.
In addition, plans for a molecular imaging or PET preprocessing NiPrep are being designed.
"},{"location":"assets/torw2020/presentation/#conclusion","title":"Conclusion","text":""},{"location":"assets/torw2020/presentation/#nipreps-is-a-framework-for-the-development-of-preprocessing-workflows","title":"NiPreps is a framework for the development of preprocessing workflows","text":"
Principled design, with BIDS as a strategic component
Leveraging existing, widely used software
Using NiPype as a foundation
???
To wrap-up, I've presented NiPreps, a framework for developing preprocessing workflows inspired by fMRIPrep.
The framework is heavily principled and relies on BIDS as a foundational component.
NiPreps should not reinvent any wheel, and should reuse as much as possible of the widely used and tested existing software.
Nipype serves as the glue component to orchestrate workflows.
We propose to consider preprocessing as part of the image acquisition and reconstruction
When setting the boundaries that way, it seems sensible to pursue some standardization in the preprocessing:
Less experimental degrees of freedom for the researcher
Researchers can focus on the analysis
More homogeneous data at the output (e.g., for machine learning)
How:
Transparency is key to success: individual reports and documentation (open source is implicit).
Best engineering practices (e.g., containers and CI/CD)
???
But why just preprocessing, with a very strict scope?
We propose to think about preprocessing as part of the image acquisition and reconstruction process (in other words, scanning), rather than part of the analysis workflow.
This decoupling from analysis comes with several upshots:
First, there are fewer moving parts for researchers to play with in the attempt to fit their methods to the data (instead of fitting data with their methods).
Second, such division of labor allows the researcher to use their time in the analysis.
Finally, two preprocessed datasets from two different studies and scanning sites should be more homogeneous when processed with the same instruments, in comparison to processing them with idiosyncratic, lab-managed, preprocessing workflows.
However, for NiPreps to work we need to make sure the tools are transparent.
Transparency comes not just from the individual reports and thorough documentation, but also from the community-driven development. For instance, the peer-review process around large incremental changes is fundamental to ensuring the quality of the tool.
In addition, best engineering practices suggested in the BIDS-Apps paper, along with those we have been including with fMRIPrep, are necessary to ensure the quality of the final product.
As an open problem, validating the results of the tool remains extremely challenging for lack of gold-standard datasets that can tell us the best possible outcome.
"},{"location":"community/","title":"Join the NiPreps Community","text":"
One of the pillars of fMRIPrep, the seed project for NiPreps, has been nurturing an open-source community. Building Welcoming Communities is crucial for open-source software for several reasons:
Engaging users and contributors (in a very liberal sense, not just with code) helps establish a development road-map:
In the case of fMRIPrep, many users have reported bugs via our issue tracker and Neurostars.org. Even though testing is one of the primary focuses for fMRIPrep, without these bug-report contributions the tool would have never reached the dependability level it requires to serve its purpose.
Users identify and propose new features, often illuminating shady areas the most involved developers did not find time or the right context to explore.
The community exposes the software and also increases the externality of the software. The neuroimaging discussion supported by Neurostars.org has been a key factor for the adoption of fMRIPrep.
Users always give back, and it is not uncommon to see elaborate responses to bug-reports and questions about fMRIPrep on Neurostars.org by users who had similar questions previously.
Because of the scientific purpose of NiPreps, there is one more fundamental reason to grow a (scientific) community around the tools: rigor/scrutiny. As one reviews a few of the most discussed pull-requests to fMRIPrep, very soon they realize that we don't just need to get the code right. We strive for integrating high-quality code, but even more importantly, that code must get the scientific method it implements right. This is particularly difficult because in most of the cases there aren't test oracles (in software engineering terms) or gold-standards (in scientific terms) to efficiently evaluate the validity of new features (even to exercise a minuscule area of the domain of inputs). The redundancy of expert eyes looking at our code has only helped make it better.
"},{"location":"community/#current-members-of-the-github-organization","title":"Current members of the GitHub organization","text":"
A total of 100 neuroimagers have already joined us. Becoming a member will give you access to additional forums for discussion, notifications for events and meetings, etc. You can request to be added to the organization by creating a new issue here.
"},{"location":"community/CODE_OF_CONDUCT/","title":"NiPreps Code of Conduct","text":""},{"location":"community/CODE_OF_CONDUCT/#our-pledge","title":"Our Pledge","text":"
In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socioeconomic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting Oscar Esteban at oesteban@stanford.edu or Chris Markiewicz at markiewicz@stanford.edu, two members of the project team. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
For answers to common questions about this code of conduct, see https://www.contributor-covenant.org/faq
Welcome to the NiPreps project! We're excited you're here and want to contribute.
Imposter's syndrome disclaimer: We want your help. No, really.
There may be a little voice inside your head that is telling you that you're not ready to be an open-source contributor; that your skills aren't nearly good enough to contribute. What could you possibly offer a project like this one?
We assure you - the little voice in your head is wrong. If you can write code at all, you can contribute code to open-source. Contributing to open-source projects is a fantastic way to advance one's coding skills. Writing perfect code isn't the measure of a good developer (that would disqualify all of us!); it's trying to create something, making mistakes, and learning from those mistakes. That's how we all improve, and we are happy to help others learn.
Being an open-source contributor doesn't just mean writing code, either. You can help out by writing documentation, tests, or even giving feedback about the project (and yes - that includes giving feedback about the contribution process). Some of these contributions may be the most valuable to the project as a whole, because you're coming to the project with fresh eyes, so you can see the errors and assumptions that seasoned contributors have glossed over.
NiPreps are built around three overarching principles:
Robustness - The pipeline adapts the preprocessing steps depending on the input dataset and should provide results as good as possible independently of scanner make, scanning parameters or presence of additional correction scans (such as fieldmaps).
Ease of use - Thanks to dependence on the BIDS standard, manual parameter input is reduced to a minimum, allowing the pipeline to run in an automatic fashion.
\"Glass box\" philosophy - Automation should not mean that one should not visually inspect the results or understand the methods. Thus, NiPreps provides visual reports for each subject, detailing the accuracy of the most important processing steps. This, combined with the documentation, can help researchers to understand the process and decide which subjects should be kept for the group level analysis.
These principles distill some design and organizational foundations:
NiPreps only and fully support BIDS and BIDS-Derivatives for the input and output data.
NiPreps are packaged as fully compliant BIDS-Apps, not just in their user interface, but also in the continuous integration, testing, and delivery.
The scope of NiPreps is strictly limited to preprocessing tasks.
NiPreps are agnostic to subsequent analysis, i.e., any software supporting BIDS-Derivatives for its inputs should be amenable to analyze data preprocessed with them.
NiPreps are thoroughly and transparently documented (including the generation of individual, visual reports with a consistent format that serve as scaffolds for understanding the underpinnings and design decisions).
NiPreps are community-driven, and contributors (in any sense) always get credited with authorship within relevant publications.
NiPreps are modular, reliant on widely-used tools such as AFNI, ANTs, FreeSurfer, FSL, NiLearn, or DIPY [7-12] and extensible via plug-ins.
"},{"location":"community/CONTRIBUTING/#practical-guide-to-submitting-your-contribution","title":"Practical guide to submitting your contribution","text":"
These guidelines are designed to make it as easy as possible to get involved. If you have any questions that aren't discussed below, please let us know by opening an issue!
Before you start, you'll need to set up a free GitHub account and sign in. Here are some instructions.
Already know what you're looking for in this guide? Jump to the following sections:
Joining the conversation
Contributing through GitHub
Understanding issues
Making a change
Structuring contributions
Licensing
Recognizing contributors
"},{"location":"community/CONTRIBUTING/#joining-the-conversation","title":"Joining the conversation","text":"
NiPreps is maintained by a growing group of enthusiastic developers, and we're excited to have you join! Most of our discussions will take place on open issues.
We also encourage users to report any difficulties they encounter on NeuroStars, a community platform for discussing neuroimaging.
We actively monitor both spaces and look forward to hearing from you in either venue!
"},{"location":"community/CONTRIBUTING/#contributing-through-github","title":"Contributing through GitHub","text":"
git is a really useful tool for version control. GitHub sits on top of git and supports collaborative and distributed working.
If you're not yet familiar with git, there are lots of great resources to help you git started! Some of our favorites include the git Handbook and the Software Carpentry introduction to git.
On GitHub, you'll use Markdown to chat in issues and pull requests. You can think of Markdown as a few little symbols around your text that will allow GitHub to render the text with a little bit of formatting. For example, you could write words as bold (**bold**), or in italics (*italics*), or as a link ([link](https://youtu.be/dQw4w9WgXcQ)) to another webpage.
GitHub has a really helpful page for getting started with writing and formatting Markdown on GitHub.
Every project on GitHub uses issues slightly differently.
The following outlines how the NiPreps developers think about these tools.
Issues are individual pieces of work that need to be completed to move the project forward. A general guideline: if you find yourself tempted to write a great big issue that is difficult to describe as one unit of work, please consider splitting it into two or more issues.
Issues are assigned labels which explain how they relate to the overall project's goals and immediate next steps.
The current list of issue labels is here and includes:
These issues contain a task that is amenable to new contributors because it doesn't entail a steep learning curve.
If you feel that you can contribute to one of these issues, we especially encourage you to do so!
These issues point to problems in the project.
If you find a new bug, please give as much detail as possible in your issue, including steps to recreate the error. If you experience the same bug as one already listed, please add any additional information that you have as a comment.
These issues are asking for new features and improvements to be considered by the project.
Please try to make sure that your requested feature is distinct from any others that have already been requested or implemented. If you find one that's similar but there are subtle differences, please reference the other request in your issue.
In order to define priorities and directions in the development roadmap, we have two sets of special labels:
Label Description Estimation of the downstream impact the proposed feature/bugfix will have. Estimation of effort required to implement the requested feature or fix the reported bug.
One way to understand these labels is to consider how they would apply to an imaginary issue. For example, if -- after a release -- a bug is identified that re-introduces a previously solved issue (i.e., it regresses the code outputs to some undesired behavior), we might assign it the following labels: . Its development priority would then be \"high\", since it is a low-effort, high-impact change.
Long-term goals may be labelled as a combination of: and or since they will have a high-impact on the code-base, but require a medium or high amount of effort. Of note, issues with the labels: or are less likely to be addressed because they are less likely to impact the code-base, or because they will require a very high activation energy to do so.
"},{"location":"community/CONTRIBUTING/#making-a-change","title":"Making a change","text":"
We appreciate all contributions to NiPreps, but those accepted fastest will follow a workflow similar to the following:
Comment on an existing issue or open a new issue referencing your addition. This allows other members of the NiPreps development team to confirm that you aren't overlapping with work that's currently underway and that everyone is on the same page with the goal of the work you're going to carry out. This blog is a nice explanation of why putting this work in up front is so useful to everyone involved.
Fork the particular NiPrep repository (e.g., fMRIPrep) with your GitHub user. This is now your own unique copy of that particular NiPreps component. Changes here won't affect anyone else's work, so it's a safe space to explore edits to the code!
Clone your forked NiPreps repository to your machine/computer. While you can edit files directly on github, sometimes the changes you want to make will be complex and you will want to use a text editor that you have installed on your local machine/computer. (One great text editor is vscode). In order to work on the code locally, you must clone your forked repository. To keep up with changes in the NiPreps repository, add the \"upstream\" NiPreps repository as a remote to your locally cloned repository.
Create a new branch to develop and maintain the proposed code changes. For example:
git fetch upstream # Always start with an updated upstream\ngit checkout -b fix/bug-1222 upstream/master\n
Please consider using appropriate branch names as those listed below, and mind that some of them are special (e.g., doc/ and docs/):
fix/<some-identifier>: for bugfixes
enh/<feature-name>: for new features
doc/<some-identifier>: for documentation improvements. You should name all your documentation branches with the prefix doc/ or docs/ as that will preempt triggering the full battery of continuous integration tests.
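The prefixes above can be checked locally before pushing a branch. The following is a minimal sketch, assuming hypothetical helper names (`branch_kind`, `skips_full_ci`) that are not part of any NiPreps tooling; the actual continuous integration configuration of each project may apply different rules.

```python
# Hypothetical helpers illustrating the branch-name prefixes listed above.
# They are not part of any NiPreps tooling.
PREFIXES = {
    'fix/': 'bugfix',
    'enh/': 'new feature',
    'doc/': 'documentation',
    'docs/': 'documentation',
}


def branch_kind(branch):
    # Return the kind of change a branch name announces, or None if no prefix matches.
    for prefix, kind in PREFIXES.items():
        # Require a non-empty identifier after the prefix.
        if branch.startswith(prefix) and len(branch) > len(prefix):
            return kind
    return None


def skips_full_ci(branch):
    # Documentation branches are the ones described as preempting the full test battery.
    return branch_kind(branch) == 'documentation'


print(branch_kind('fix/bug-1222'))
print(skips_full_ci('docs/update-install'))
```

A branch name that matches none of the prefixes returns `None`, which can serve as a reminder to rename the branch before opening the pull request.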
Make the changes you've discussed, following the NiPreps coding style guide. Try to keep the changes focused: it is generally easy to review changes that address one feature or bug at a time. It can also be helpful to test your changes locally, using a NiPreps development environment. Once you are satisfied with your local changes, add/commit/push them to the branch on your forked repository.
Submit a pull request. A member of the development team will review your changes to confirm that they can be merged into the main code base. Pull request titles should begin with a descriptive prefix (for example, ENH: Support for SB-reference in multi-band datasets):
ENH: enhancements or new features (example)
FIX: bug fixes (example)
TST: new or updated tests (example)
DOC: new or updated documentation (example)
STY: style changes (example)
REF: refactoring existing code (example)
CI: updates to continuous integration infrastructure (example)
MAINT: general maintenance (example)
For works-in-progress, add the WIP tag in addition to the descriptive prefix. Pull-requests tagged with WIP: will not be merged until the tag is removed.
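As an illustration of the title convention, the sketch below extracts the descriptive prefix and the work-in-progress status from a pull request title. The function name `parse_title` is hypothetical, and the sketch assumes that the WIP tag is written as an extra `WIP:` tag next to the descriptive prefix; projects may format it differently.

```python
# Hypothetical checker for the pull-request title prefixes listed above.
KNOWN_PREFIXES = ('ENH', 'FIX', 'TST', 'DOC', 'STY', 'REF', 'CI', 'MAINT')


def parse_title(title):
    # Everything before the last colon is treated as a sequence of tags.
    tags = [part.strip() for part in title.split(':')[:-1]]
    # Assumption: work-in-progress is flagged with its own 'WIP:' tag.
    is_wip = 'WIP' in tags
    # Keep the first tag that is one of the known descriptive prefixes.
    prefixes = [tag for tag in tags if tag in KNOWN_PREFIXES]
    prefix = prefixes[0] if prefixes else None
    return prefix, is_wip


print(parse_title('ENH: Support for SB-reference in multi-band datasets'))
print(parse_title('WIP: FIX: Something still in progress'))
```

A title without a known prefix yields `None` as its first element, which signals that the title should be edited before requesting a review.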
Have your PR reviewed by the development team, and update your changes accordingly in your branch. The reviewers will take special care in assisting you in addressing their comments, as well as in dealing with conflicts and other tricky situations that could emerge from distributed development.
Whenever possible, instances of Nipype Nodes and Workflows should use the same names as the variables they are assigned to. This makes it easier to relate the content of the working directory to the code that generated it when debugging.
Workflow variables should end in _wf to indicate that they refer to Workflows and not Nodes. For instance, a workflow whose basename is myworkflow might be defined as follows:
from nipype.pipeline import engine as pe\n\nmyworkflow_wf = pe.Workflow(name='myworkflow_wf')\n
If a workflow is generated by a function, the name of the function should take the form init_<basename>_wf:
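A minimal sketch of this convention follows. To keep it runnable without Nipype installed, it uses a small stand-in `Workflow` class in place of `nipype.pipeline.engine.Workflow`; the stand-in and the basename `myworkflow` are assumptions made only for illustration.

```python
# Stand-in for nipype.pipeline.engine.Workflow, so the sketch runs without Nipype.
class Workflow:
    def __init__(self, name):
        self.name = name


def init_myworkflow_wf(name='myworkflow_wf'):
    # The function is named init_<basename>_wf and returns the workflow it builds.
    workflow = Workflow(name=name)
    # ... nodes would be created and connected here ...
    return workflow


# The variable is named after the workflow it holds.
myworkflow_wf = init_myworkflow_wf()
print(myworkflow_wf.name)
```

In real code, the stand-in class would be replaced by the `pe.Workflow` class imported as in the previous example.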
If multiple instances of the same workflow might be instantiated in the same namespace, the workflow names and variables should include either a numeric identifier or a one-word description, such as:
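Both naming options can be sketched as follows. As before, the `Workflow` class is a stand-in for `nipype.pipeline.engine.Workflow`, and the names (`init_myworkflow_wf`, the `lh`/`rh` descriptions) are hypothetical choices made for illustration.

```python
# Stand-in for nipype.pipeline.engine.Workflow, so the sketch runs without Nipype.
class Workflow:
    def __init__(self, name):
        self.name = name


def init_myworkflow_wf(name='myworkflow_wf'):
    return Workflow(name=name)


# Option 1: numeric identifiers.
myworkflow0_wf = init_myworkflow_wf(name='myworkflow0_wf')
myworkflow1_wf = init_myworkflow_wf(name='myworkflow1_wf')

# Option 2: one-word descriptions (here, hypothetical left/right hemisphere instances).
myworkflow_lh_wf = init_myworkflow_wf(name='myworkflow_lh_wf')
myworkflow_rh_wf = init_myworkflow_wf(name='myworkflow_rh_wf')

print(myworkflow0_wf.name, myworkflow_rh_wf.name)
```

Either way, each variable keeps the same name as the workflow it refers to, so entries in the working directory can be traced back to the code.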
We welcome and recognize all contributions regardless of their size, content or scope: from documentation to testing and code development. You can see a list of current developers and contributors in our zenodo file. Before every release, a new zenodo file will be generated. The update script will also sort creators and contributors by the relative size of their contributions, as provided by the git-line-summary utility distributed with the git-extras package. Last positions in both the creators and contributors list will be reserved to the project leaders. These special positions can be revised to add names upon specific request, and revised for removal and update of ordering on a scheduled basis every two years. All the authors listed as creators participate in the revision of modifications.
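The ordering rule can be illustrated with a short sketch. This is not the project's update script: the function name `order_authors`, the placeholder names, and the contribution sizes are all hypothetical, and the sketch only assumes that each person has a single number summarizing the size of their contribution.

```python
# Hypothetical sketch of the ordering rule described above; it is not the update script.
def order_authors(line_counts, leaders):
    # line_counts maps each name to a contribution size (for instance, lines of code).
    ranked = sorted(line_counts, key=line_counts.get, reverse=True)
    # Everyone except the project leaders, largest contribution first.
    regular = [name for name in ranked if name not in leaders]
    # Project leaders keep the last positions, in the order given.
    return regular + [name for name in leaders if name in line_counts]


counts = {'Author A': 120, 'Author B': 950, 'Leader Z': 400, 'Author C': 15}
print(order_authors(counts, leaders=['Leader Z']))
```

Here the leader is placed last even though their contribution size would otherwise rank second.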
Anyone listed as a developer or a contributor can start the submission process of a manuscript as first author (please see Membership, where these concepts are described). To compose the author list, all the creators MUST be included (except for those people who opt to drop out) and all the contributors MUST be invited to participate. First authorship(s) is (are) reserved for the authors that originated and kept the initiative of submission and wrote the manuscript. To generate the ordering of your paper, please run python .maint/paper_author_list.py from the root of the repository, on the up-to-date upstream/master branch. Then, please modify this list and place your name first. All developers and contributors are pulled together in a unique list, and last authorships assigned. NiPreps and its community adhere to open science principles, such that a pre-print should be posted on an adequate archive service (e.g., ArXiv or BioRxiv) prior to publication.
NiPreps is licensed under the Apache 2.0 license. By contributing to NiPreps, you acknowledge that any contributions will be licensed under the same terms.
\u2014 Based on contributing guidelines from the STEMMRoleModels project.
The imposter syndrome disclaimer was originally written by Adrienne Lowe for a PyCon talk, and was adapted based on its use in the README file for the MetPy project.
The one bit that worries me is that fMRIPrep may become a Swiss army knife. I think instead it should just be a paring knife (small, efficient, and works for many things).
-- Satra (source)
When projects grow large, many forking paths created by newly implemented features start to open up. To account for this, the NiPreps community was created with the vision of building tools like fMRIPrep and MRIQC covering new imaging modalities, while keeping existing NiPreps tightly within scope. Defining such a scope also aids the implementation of the ease-of-use principle:
The same way the scanner does not offer an immense space of knobs to turn in the acquisition, NiPreps should not add many additional knobs to those for them to be considered a viable augmentation or extension of the scanner hw/sw.
-- Oscar (source)
"},{"location":"community/features/#the-problem-of-feature-creep","title":"The problem of feature creep","text":"
To avert feature creep and to serve each individual NiPrep, we developed the following guidelines, with the hopes of keeping these tools in a healthy state.
I'm worried fMRIPrep is catching a case of featuritis
-- Mathias (source)
These guidelines should also serve the community to transparently drive the process of including proposals into the road-map, set the ground for healthy conversation, and establish some patterns when accepting new-feature contributions. Before proposing new features, please be mindful that a road-map may not exist for a particular NiPrep. Even when a development road-map exists, please understand that it is not always possible to rigorously follow it:
I think something like this is what we tried to start sketching out with the development roadmap. The concern, as I remember it, was that we couldn't guarantee (or rule out) specific features when working with a small development team.
-- Elizabeth (source).
"},{"location":"community/features/#proposing-a-new-feature","title":"Proposing a new feature","text":""},{"location":"community/features/#why-the-new-feature-is-requested","title":"Why the new feature is requested?","text":"
Before going ahead and proposing a new feature, please take some time to learn whether the topic has been covered in the past and what decisions were made and why. This should be reasonably easy to do with the search tool of GitHub on the particular NiPrep repository.
If no previous discussion about the new idea is found, the next step is ensuring the new feature aligns with the vision and the scope of the target tool, as Elizabeth points out. Taking a look into the Development Road-map of the particular project (if it exists) may help find an answer.
If the new feature still seems pertinent after this preliminary work or you are unsure about whether it falls within the scope, then go ahead and post an issue requesting feedback on your proposal. Please make sure to clearly state why the new feature should be considered.
"},{"location":"community/features/#some-questions-will-always-be-asked-about-a-new-feature","title":"Some questions will always be asked about a new feature","text":"
These questions by James will certainly help build up the discourse in support of the new feature, as the NiPreps maintainers will consider them:
Is the user interface affected? Because NiPreps generally expose a command-line interface (CLI) for the interaction with the user, new features involving changes to the CLI must be considered with caution as they may harm the ease-of-use:
It also seems that some new features add more confusion than others. Especially when the CLI is affected, and yet another option is added, that makes the tool more complex to use.
-- Alejandro (source).
Does the new feature substantially increase the internal complexity? Maintainers and developers will attempt to consolidate tools and lower the internal complexity whenever possible. This effort usually competes with the addition of new features as they typically will address particular use-cases rather than general improvements. However, that doesn't need to be the case, as some sections of the code might be objectively improvable and the integration of a new feature revising those might also lower complexity. Lowering the internal complexity will always be considered a great incentive for a new feature to be accepted.
Is there a standard procedure for the proposed feature in the literature?
if so, could we just use that procedure/value?
Is the feature dependent on some attribute of the input data? (e.g., TR, duration, etc.)
if so, can the procedure/value be determined algorithmically?
Does the feature interact with other settings? For instance, fmriprep#1962 interacts with the a/tCompCor implementation.
What is the difficulty of implementing the procedure outside of a NiPrep? In other words, does the NiPrep provide all the necessary outputs for a user to perform the non-standard analysis?
"},{"location":"community/features/#how-the-integration-of-the-new-feature-willcan-be-validated","title":"How the integration of the new feature will/can be validated?","text":"
Please propose ways to validate the new feature in the context of the workflow. Meaning, the objective here is to validate that the new feature works well within the pipeline, rather than validating a specific algorithm. To ensure the sustainability of NiPreps, the onus of this validation should be on the person/group requesting the feature.
"},{"location":"community/licensing/","title":"Licensing and Derived Works","text":"
The NiPreps community believes that software is an integral component of scientific practice, and that any scientific claim must be verifiable by following the chain of reasoning from observation to conclusion. To achieve this, software must be free to use, inspect, and critique. We also believe that you should be free to modify our software to improve it or adapt it to new use cases.
As software development is a dynamic process, code modifications can quickly become confusing as the original and modified versions depart from each other. For the sake of transparency and verification, when you modify our code, we ask that you document both the version of the software that you started with and the changes you make.
We believe these freedoms are best promoted by distributing our software under free/open source software licenses, and the license we feel best promotes these goals is the Apache License, Version 2.0.
This page outlines our commitment to transparent development and our expectations for developers who adapt NiPreps code to use in other projects.
"},{"location":"community/licensing/#licensing-of-nipreps-projects","title":"Licensing of NiPreps projects","text":"
All software packages and tools under the NiPreps umbrella must be licensed under the Apache License 2.0 by default, unless otherwise stated. The authors of new NiPreps packages may depart from this general rule when necessary and/or sufficiently justified (e.g., the source code is actually derived from a product licensed under a copyleft license).
Containerized Images bundling NiPreps components and their dependencies can be distributed under a free and open-source license without copyleft, such as the MIT License. In such a case, the attribution notice of the MIT license must be present in the header comment of the container image bootstrapping file (for instance, the so-called Dockerfile). This different licensing must be also indicated in the NOTICE file of the corresponding NiPreps components bundled within the image.
Docker-wrappers such as the fmriprep-docker package may be licensed under any free and open-source license without copyleft, such as the MIT License. This different licensing must be also indicated in the NOTICE file of the corresponding NiPreps components bundled within the image.
Data (distributed within the test data of packages or through the nipreps-data GitHub organization) will preferably be distributed under the Creative Commons Zero v1.0 Universal.
Under no circumstances will any NiPreps software or data be made publicly available unlicensed. If you find any component of NiPreps that is unlicensed, please make us aware at nipreps@gmail.com at your earliest convenience.
(This section is adapted from this blog post by D. Mar\u00edn)
The Apache License was created by the Apache Software Foundation (ASF) as the license for its Apache HTTP Server.
Like the MIT License, it\u2019s a very permissive non-copyleft license that allows using the software for any purpose, distributing it, modifying it, and distributing derived works of it without concern for royalties. Its main differences, compared to the MIT License, are:
Using the Apache License, the authors of the software grant patent licenses to any user or distributor of the code. These patent licenses apply to any patent that, being licensable by any of the software's authors, would be infringed by the piece of code they have created.
The Apache License requires that unmodified parts in derived works keep the License.
In every licensed file, any original copyright, patent, trademark or attribution notices must be preserved.
In every licensed file change, there must be a notification stating that changes have been made in the file.
If the Apache-licensed software includes a NOTICE file, this file and its contents must be preserved in all the derived works.
If anyone intentionally sends a contribution for an Apache-licensed software to its authors, this contribution can automatically be used under the Apache License.
This license is interesting because of the automatic patent license, and the clause about contribution submission.
It\u2019s compatible with version 3 of the GPL, so you can mix Apache-licensed code into GPLv3 software.
In the case of scientific software, we believe that clearly stating that a Derived Work introduces changes into the original Work is a fundamental measure of transparency. Other than that, we wanted a permissive, non-copyleft license.
"},{"location":"community/licensing/#what-is-our-expectation-for-derived-works","title":"What is our expectation for Derived Works?","text":"
At the bare minimum, you must meet the conditions of the license (simplified version) about preserving the license text and copyright/attribution notices as well as corresponding statements of changes.
How to state that a file has been changed in a Derived Work. We suggest the following steps, heavily influenced by P. Ombredanne's recommendations at StackExchange:
In each source file, add a note to the header comment stating that the file has been modified, with an approximate date, and a high-level description of the changes. The date and the description of the changes are not strictly required, but they are positive etiquette from a software engineering standpoint and substantially improve the transparency of the changes from a scientific point of view.
If the source file did not have a license notice in the header comment, please add it to avoid ambiguity.
Deleted files: please keep the file with just the header comment and state that the file is deleted. The change statement should follow the suggestion in 1), preferably stating whether the source has been deleted or moved over to other files. If preserving the filename as-is might become confusing to the user of the Derived Work, the filename can be modified to be marked as hidden with a dot . or underscore _ prefix, or modifying the extension.
Preferably, also include a link to the original file in our GitHub repository, making sure the link is done to a particular commit state.
What changes would we like to see annotated? The high-level description of the changes will preferably contain:
Correction of bugs
Substantial performance improvement decisions
Replacement of relevant methods and dependencies by alternatives
Changes to the license
"},{"location":"community/licensing/#example-of-our-expectations","title":"Example of our expectations","text":"
Let's say a Derived Work modifies the sdcflows.viz.utils code-base. The file may or may not have the attribution notice. At the time of writing, the header comment of this file is:
Header comment in the original Work
With attribution noticeWithout attribution notice
# emacs: -*- mode: python; py-indent-offset: 4; indent-tabs-mode: nil -*-\n# vi: set ft=python sts=4 ts=4 sw=4 et:\n#\n# Copyright 2021 The NiPreps Developers <nipreps@gmail.com>\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"Visualization tooling.\"\"\"\n
Either way (whether the attribution notice is present or not), we suggest updating this header comment to something along the lines of the following:
Suggested header comment in the Derived Work
RequiredRecommended (commit)Recommended (version)
# <shebang and editor settings can be preserved or removed freely>\n#\n# <your attribution notice, either maintaining the Apache-2.0 license or changing the license>\n#\n# STATEMENT OF CHANGES: This file is derived from sources licensed under the Apache-2.0 terms,\n# and this file has been changed.\n# The original file this work derives from is found at:\n# https://github.com/nipreps/sdcflows/blob/50393a8584dd0abf5f8e16e6ba66c43e1126f844/sdcflows/viz/utils.py\n#\n# [April 2021] CHANGES:\n# * BUGFIX: Outdated function call from the ``svgutils`` dependency that changed API as of version 0.3.2.\n# * ENH: Changed plotting dependency to the new `netplotbrain` package.\n# * DOC: Added docstrings to some functions that lacked them.\n#\n# ORIGINAL WORK'S ATTRIBUTION NOTICE:\n#\n# Copyright 2021 The NiPreps Developers <nipreps@gmail.com>\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"Visualization tooling.\"\"\"\n
The lines highlighted with yellow color are explicitly required by the Apache-2.0 conditions.
# <shebang and editor settings can be preserved or removed freely>\n#\n# <your attribution notice, either maintaining the Apache-2.0 license or changing the license>\n#\n# STATEMENT OF CHANGES: This file is derived from sources licensed under the Apache-2.0 terms,\n# and this file has been changed.\n# The original file this work derives from is found at:\n# https://github.com/nipreps/sdcflows/blob/50393a8584dd0abf5f8e16e6ba66c43e1126f844/sdcflows/viz/utils.py\n#\n# [April 2021] CHANGES:\n# * BUGFIX: Outdated function call from the ``svgutils`` dependency that changed API as of version 0.3.2.\n# * ENH: Changed plotting dependency to the new `netplotbrain` package.\n# * DOC: Added docstrings to some functions that lacked them.\n#\n# ORIGINAL WORK'S ATTRIBUTION NOTICE:\n#\n# Copyright 2021 The NiPreps Developers <nipreps@gmail.com>\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"Visualization tooling.\"\"\"\n
The lines highlighted with green color are recommended by the NiPreps Developers.
# <shebang and editor settings can be preserved or removed freely>\n#\n# <your attribution notice, either maintaining the Apache-2.0 license or changing the license>\n#\n# STATEMENT OF CHANGES: This file is derived from sources licensed under the Apache-2.0 terms,\n# and this file has been changed.\n# The original file this work derives from is found within\n# the version 2.0.2 distribution of the software.\n#\n# [April 2021] CHANGES:\n# * BUGFIX: Outdated function call from the ``svgutils`` dependency that changed API as of version 0.3.2.\n# * ENH: Changed plotting dependency to the new `netplotbrain` package.\n# * DOC: Added docstrings to some functions that lacked them.\n#\n# ORIGINAL WORK'S ATTRIBUTION NOTICE:\n#\n# Copyright 2021 The NiPreps Developers <nipreps@gmail.com>\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"Visualization tooling.\"\"\"\n
The lines highlighted with green color are recommended by the NiPreps Developers.
Although it is not mandated by the license letter, the spirit of the Apache-2.0 (and all other licenses stipulating the statement of changes, such as the CC-BY 4.0) suggests that a date of modification and an overview of outstanding changes are pertinent. We also suggest a link to the original code, including the commit-hash (that long string starting with 50393a in the URL above) for the location of the exact origin of the file. Alternatively, Derived Works may point to an exact release identifier where the original file is part of the code-base distribution. Please make sure to remove the comment tags <...> above or replace them with appropriate contents.
What if a Derived Work does not modify this particular file? You should retain the original attribution notice as is (or introduce it if missing), unless you are relicensing the file. In that case, proceed with the suggestions above, and note the license change in the STATEMENT OF CHANGES block of the header comment.
"},{"location":"community/licensing/#are-papers-using-apache-20-licensed-software-considered-as-derived-works","title":"Are papers using Apache-2.0 licensed software considered as Derived Works?","text":"
No, they are not, because they only reuse the software (in other words, they don't redistribute the software). The license stipulates that redistribution must retain the license and attribution notices as they are. In the scientific context, it is likely that a particular tool is modified (for example, to replace a method that you think is not appropriate for your data). Then, redistribution of the source would be desirable from the transparent reporting point of view, and therefore you should honor the License.
Generally, works using our NiPreps just need to follow the citation guidelines of the particular project and report the citation boilerplate, including all software versions and literature references, as close as possible to the text generated by the tool.
"},{"location":"community/licensing/#licensing-of-docker-and-singularity-images","title":"Licensing of Docker and Singularity images","text":"
Container images redistribute copies of NiPreps alongside their third-party dependencies, all of them bundled in the image. If the applicable license is Apache-2.0, then the text of a NOTICE file must be shown to the user. All NiPreps must insert a NOTICE file into their containerized distributions and print its contents out in the command line output, as well as in the visual reports. This NOTICE file for containers will be placed in the /.docker/NOTICE path of the repository, and this file must replace the /NOTICE file (if it exists) at image building time. Alternatively, and if the corresponding NiPreps Developers consider that the Apache-2.0 imposes too onerous requirements for the container image distribution, the source code of such images (e.g., Dockerfile) can be licensed under the MIT license.
Example NOTICE file for fMRIPrep
Python distribution /NOTICEContainer image distribution /.docker/NOTICE
fMRIPrep\nCopyright 2021 The NiPreps Developers.\n\nThis product includes software developed by\nthe NiPreps Community (https://nipreps.org/).\n\nPortions of this software were developed at the Department of\nPsychology at Stanford University, Stanford, CA, US.\n\nThis software contains code ultimately derived from the epidewarp.fsl\nscript (https://www.nmr.mgh.harvard.edu/~greve/fbirn/b0/epidewarp.fsl)\nby Doug Greve, Dave Tuch, Tom Liu, and Bryon Mueller with generous\nhelp from the FSL crew (www.fmrib.ox.ac.uk/fsl) and the Biomedical\nInformatics Research Network (www.nbirn.net).\n
fMRIPrep Container Image distribution\nCopyright 2021 The NiPreps Developers.\n\nThis product includes fMRIPrep and software developed by\nthe NiPreps Community (https://nipreps.org/).\n\nPortions of this software were developed at the Department of\nPsychology at Stanford University, Stanford, CA, US.\n\nThis product bundles AFNI <version-placeholder>, which is available under\nthe Gnu General Public License.\nMajor portions of AFNI were written at the Medical College of Wisconsin,\nwhich owns the copyright to that code. For fuller details, see\nhttp://afni.nimh.nih.gov/pub/dist/src/README.copyright.\n\nThis product bundles ANTs <version-placeholder>, which is available under\nthe BSD 3-clause license terms.\nCopyright 2009-2013 ConsortiumOfANTS.\n\nThis product bundles BIDS-Validator <version-placeholder>, which is available\nunder the MIT License.\nCopyright 2015 The Board of Trustees of the Leland Stanford Junior University.\n\nThis product bundles the Connectome Workbench <version-placeholder>, which\nis available under the GPL-v2\n(https://www.humanconnectome.org/software/connectome-workbench-license).\n\nThis product bundles FSL <version-placeholder>, which is available\nunder a custom license with commercial restrictions\n(https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/Licence).\nCopyright 2018, The University of Oxford.\n\nThis product bundles FreeSurfer <version-placeholder>, which is available\nunder a custom license and requires obtaining a license key\n(https://surfer.nmr.mgh.harvard.edu/fswiki/FreeSurferSoftwareLicense).\nCopyright 2011, The General Hospital Corporation, Boston MA, USA.\n\nThis product bundles code derived from ICA-AROMA, both (fork and original work)\nare available under the Apache-2.0 license.\n(https://github.com/oesteban/ICA-AROMA/blob/master/license.md)\nCopyright 2021, Maarten Mennes\n\nThis product bundles Miniconda <version-placeholder>, which is available\nunder a BSD 3-clause license.\n(c) 2017 Continuum Analytics, Inc. 
(dba Anaconda, Inc.).\nhttps://www.anaconda.com. All Rights Reserved\n\nThis product bundles NeuroDebian, which adheres to the\nDebian Free Software Guidelines (DFSG)\nhttps://www.debian.org/social_contract#guidelines\nand the terms of the Debian Social Contract version 1.1.\n\nThis product bundles tools by the NiPy community, such as NiBabel\n(MIT License, https://github.com/nipy/nibabel/blob/master/COPYING),\nand NiPype (Apache-2.0, https://github.com/nipy/nipype/blob/master/LICENSE).\n\nThis product bundles Pandoc <version-placeholder>, which is available\nunder the GPL version 2 or later.\nCopyright (C) 2006-2021 John MacFarlane <jgm at berkeley dot edu>\n\nThis product bundles SVGO <version-placeholder>, which is available\nunder the MIT License.\nCopyright (c) Kir Belevich\n\nThis product bundles tedana <version-placeholder>, which is available under\nthe GNU Lesser General Public License v2.1.\nCopyright 2018, tedana developers.\n\nTemplateFlow, a component of this bundle, contains neuroimaging template\nand atlas data under several permissive licenses.\nPlease refer to the metadata of the particular template used in your study to\ndetermine the exact terms of the license and how to acknowledge attribution\nof those works.\n\nsMRIPrep, a component of this bundle, contains code ultimately derived from\nANTs <version-placeholder>, which is available under\nthe BSD 3-clause license terms.\nCopyright 2009-2013 ConsortiumOfANTS.\n\nsMRIPrep, a component of this bundle, contains code ultimately derived from\nMindboggle <version-placeholder>, which is available under\nthe Apache License 2.0.\nCopyright 2016, Mindboggle team (http://mindboggle.info)\n\nfMRIPrep contains code ultimately derived from the epidewarp.fsl\nscript (https://www.nmr.mgh.harvard.edu/~greve/fbirn/b0/epidewarp.fsl)\nby Doug Greve, Dave Tuch, Tom Liu, and Bryon Mueller with generous\nhelp from the FSL crew (www.fmrib.ox.ac.uk/fsl) and the Biomedical\nInformatics Research Network 
(www.nbirn.net).\n
In general, NiPreps embrace a liberal contribution model of governance structure. However, because of the scientific domain of NiPreps, the community features some structure from meritocracy models to prescribe the order in the authors list of new papers about these tools.
Developers are members of a wonderful team driving the project. Names and contacts of all developers are included in the .maint/developers.json file of each project. Examples of steering activities that drive the project are: actively participating in the follow-up meetings, leading documentation sprints, helping in the design of the tool and definition of the roadmap, providing resources (in the broad sense, including funding), code-review, etc.
Contributors listed in the .maint/contributors.json file of each project actively help or have previously helped the project in a broad sense: writing code, writing documentation, benchmarking modules of the tool, proposing new features, helping improve the scientific rigor of implementations, giving out support on the different communication channels (mattermost, NeuroStars, GitHub, etc.). If you are new to the project, don't forget to add your name and affiliation to the list of contributors there! Our Welcome Bot will send an automated message reminding first-time contributors to do so. Before every release, unlisted contributors will be invited again to add their names to the file (just in case they missed the automated message from our Welcome Bot).
Former contributors, who contributed at some point to the project but were required or wished to disconnect from the project's updates and to drop out of publications and other dissemination activities, are listed in the .maint/former.json file.
This document explains how to prepare a new development environment and update an existing environment, as necessary, for the development of NiPreps' components. Some components may deviate from these guidelines; in such a case, please follow the guidelines provided in their documentation.
If you plan to contribute back to the community, making your code available via pull-request, please make sure to have read and understood the Community Documents and Contributor Guidelines. If you plan to distribute derived code, please follow our licensing guidelines.
Development in Docker is encouraged, for the sake of consistency and portability. By default, work should be built off of nipreps/fmriprep:unstable, which tracks the master branch, or nipreps/fmriprep:latest, which tracks the latest release version (see BIDS-Apps execution guide for the basic procedure for running).
It will be assumed the developer has a working repository in $HOME/projects/fmriprep, and examples are also given for niworkflows and NiPype.
"},{"location":"devs/devenv/#patching-a-working-copy-into-a-docker-container","title":"Patching a working copy into a Docker container","text":"
In order to test new code without rebuilding the Docker image, it is possible to mount working repositories as source directories within the container. The Docker wrapper script simplifies this for the most common repositories:
-f PATH, --patch-fmriprep PATH\n working fmriprep repository (default: None)\n -n PATH, --patch-niworkflows PATH\n working niworkflows repository (default: None)\n -p PATH, --patch-nipype PATH\n working nipype repository (default: None)\n
For instance, if your repositories are contained in $HOME/projects:
New dependencies to be inserted into the Docker image will either be Python or non-Python dependencies. Python dependencies may be added in three places, depending on whether the package is large or non-release versions are required. The image must be rebuilt after any dependency changes.
Python dependencies should generally be included in the appropriate dependency metadata of the setup.cfg file found at the root of each repository. If a dependency must be a particular version (or set thereof), it is possible to use version filters in this setup.cfg file.
For large Python dependencies where there will be a benefit to pre-compiled binaries, conda packages may also be added to the conda install line in the Dockerfile.
Non-Python dependencies must also be installed in the Dockerfile, via a RUN command. For example, installing an apt package may be done as follows:
RUN apt-get update && \\\n apt-get install -y <PACKAGE>\n
If it is necessary to (re)build the Docker image, a local image named fmriprep may be built from within the local repository. Let's assume it is located in ~/projects/fmriprep:
The VERSION build argument is necessary to ensure that help text can be reliably generated. The get_version.py tool constructs the version string from the current repository state.
To work in this image, replace nipreps/fmriprep:latest with just fmriprep in any of the above commands. This image may be accessed by the Docker wrapper via the -i flag, e.g.:
$ fmriprep-docker -i fmriprep --shell\n
"},{"location":"devs/devenv/#code-server-development-environment-experimental","title":"Code-Server Development Environment (Experimental)","text":"
To get the best of working with containers and having an interactive development environment, we have an experimental setup with code-server.
Important
We have a video walking through the process if you want a visual guide.
1. Build the Docker image. We will use the Dockerfile_devel file to build our development docker image:
$ cd $HOME/projects/fmriprep\n$ docker build -t fmriprep_devel -f Dockerfile_devel .\n
2. Run the Docker image. We can start a docker container using the image we built (fmriprep_devel):
$ docker run -it -p 127.0.0.1:8445:8080 -v ${PWD}:/src/fmriprep fmriprep_devel:latest\n
Windows Users
If you are using a Windows shell, ${PWD} may not be defined; use the absolute path to your repository instead.
Docker-Toolbox
If you are using Docker-Toolbox, you will need to change your virtualbox settings using these steps as a guide. For step 6, instead of Name = rstudio; Host Port = 8787; Guest Port = 8787, have Name = code-server; Host Port = 8443; Guest Port = 8080. Then in the docker command above, change 127.0.0.1:8445:8080 to 192.168.99.100:8445:8080.
If the container started correctly, you should see the following on your console:
INFO Server listening on http://localhost:8080\nINFO - No authentication\nINFO - Not serving HTTPS\n
Now you can switch to your favorite browser and go to: 127.0.0.1:8445 (or 192.168.99.100:8445 for Docker Toolbox).
3. Copy fmriprep.egg-info into your fmriprep/ project directory. fmriprep.egg-info makes the package executable inside the docker container. Open a terminal in vscode and type the following:
$ cp -R /src/fmriprep.egg-info /src/fmriprep/\n
"},{"location":"devs/devenv/#code-server-development-environment-features","title":"Code-Server Development Environment Features","text":"
The editor is vscode
There are several preconfigured debugging tests under the debugging icon in the activity bar
see vscode debugging python for details.
The gitlens and python extensions are preinstalled to improve the development experience in vscode.
As of January 2020, fMRIPrep has adopted a Calendar Versioning scheme, and with it we are attempting to apply more coherent semantic rules to our releases.
Note
This document is a draft for internal and external comment. Any commitments expressed here are proposals, and should not be relied upon at this time. This conversation started as a Google Doc.
The basic release form is YY.MINOR.PATCH, so the first minor release of 2020 is 20.0.0, and the first minor release of 2021 will be 21.0.0, whatever the final minor release of 2020 is. A series of releases shares a YY.MINOR. prefix, which we refer to as the YY.MINOR.x series. For example, the 20.0.x series contains versions 20.0.0, 20.0.1, and any other releases needed.
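The YY.MINOR.PATCH form and the notion of a release series can be illustrated with a minimal Python sketch. The names CalVer, parse_calver and same_series are hypothetical helpers introduced only for this illustration; they are not part of fMRIPrep, and the sketch assumes plain numeric versions without pre-release or development suffixes.

```python
from typing import NamedTuple


class CalVer(NamedTuple):
    """A release identifier of the form YY.MINOR.PATCH (illustrative only)."""

    year: int
    minor: int
    patch: int

    @property
    def series(self) -> str:
        """Name of the minor release series, e.g. '20.0.x'."""
        return f"{self.year}.{self.minor}.x"


def parse_calver(version: str) -> CalVer:
    """Split a 'YY.MINOR.PATCH' string into its three integer components."""
    parts = version.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected YY.MINOR.PATCH, got {version!r}")
    year, minor, patch = (int(part) for part in parts)
    return CalVer(year, minor, patch)


def same_series(first: str, second: str) -> bool:
    """Two releases belong to the same series if they share the YY.MINOR prefix."""
    return parse_calver(first).series == parse_calver(second).series
```

Under these assumptions, 20.0.0 and 20.0.1 fall in the same 20.0.x series, whereas 21.0.0 starts a new one.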
Minor releases are considered feature releases. Because there is no concept of a \"major\" release (just a calendar year rollover), most changes to the code base will result in a new feature release. Changes targeting a new feature release should target the master branch. Feature releases may be released as often as is deemed appropriate.
Patch releases are considered bug-fix releases. Each minor release triggers the creation of a new maint/<YY>.<MINOR>.x branch, and changes targeting a bug-fix release should target this branch. A \"minor release series\" is the initial feature release and the bug-fix releases that share the minor release prefix. Bug-fix releases may be released on minimal notice to other developers.
These releases must satisfy four conditions:
Resolving one or more bugs. These mostly include failures of fMRIPrep to complete or producing invalid derivatives (e.g., a NIfTI file of all zeroes).
Derivatives compatibility. If a subject may be successfully run on 20.0.n, then the imaging derivatives should be identical if rerun with 20.0.(n+1), modulo rounding errors and the effects of nondeterministic algorithms. The changes between successful runs of 20.0.n and 20.0.(n+1) should not be larger than the changes between two successful runs of 20.0.n. Cosmetic changes to reports are acceptable, while differing fields of view or data types in a NIfTI file would not be.
API compatibility. Workflow-generating functions, workflow inputnode and outputnode fields must not change. For an end-user application, this may seem overly strict, but the odds of introducing a bug are much higher in these cases.
User interface compatibility. Substantial changes to fMRIPrep command line must not happen (e.g., the addition of a new, relevant flag).
Note that not all bugs can be fixed in a way that satisfies all four of these criteria without significant effort. A developer may determine that the bug will be fixed in the next feature release.
Additional acceptable changes within a minor release series:
Improved tests. These often come along with bug fixes, but they can be free-standing improvements to the code base.
Improved documentation. Unless the documentation is of a feature that will not be present in a bug-fix release, this is always welcome.
Updates to the Dockerfile that improve operation for Docker and/or Singularity users, but do not risk behavior change. A good example is including more templates to reduce the need for network requests. An example of an update to the Dockerfile that forces a minor release increment is a change in the pinned version of any of the dependencies or the base container image.
Improvements to the lightweight wrappers. As long as a command-line invocation that worked for the previous version continues to work and produce the same Docker command, there's little chance of harm.
It is expected that maint/20.0.x will diverge from master, as new features will be merged into master, and bug-fixes into maint/20.0.x. At a minimum, each new bug-fix release should be merged into master. After a 20.0.1 release:
fMRIPrep has a number of dependencies that we control at this point:
sMRIPrep
SDCflows
NiWorkflows
These do not follow the same versioning scheme as above, but we need them to follow a compatible scheme. In particular, we need to be able to fix bugs that are situated within these dependencies in a bug-fix release without violating the criteria laid out above. At the time of an fMRIPrep feature release, all of the above tools need to also split out a maintenance branch (if they have not already) for the minor version series that fMRIPrep depends on. As an example, when 20.0.0 was released, fMRIPrep had the following dependencies in setup.cfg:
~= is the compatible release specifier described in PEP 440. ~= 1.1.7 is equivalent to >= 1.1.7, == 1.1.*. This means that the current version of fMRIPrep is expected to work with niworkflows 1.1.7+ but not 1.2+. Thus, niworkflows needs to have a maint/1.1.x branch, sdcflows a maint/1.2.x and smriprep maint/0.5.x. Any changes to these tools that might violate API or derivative compatibility, must go into master, and must not be released into the current minor series of these tools. Note that fMRIPrep 20.0.0 does not depend on niworkflows ~= 1.1.0. Multiple feature releases of fMRIPrep may depend on the same minor release series of a dependency. There is no requirement to hike the dependency. However, if a dependency has started a new minor release series, a feature release of fMRIPrep is a good opportunity to bump the dependency.
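The behavior of the compatible release specifier can be sketched in a few lines of Python. This is a simplified illustration that only handles plain numeric versions (no pre-releases, post-releases or epochs); the function names parse_release and is_compatible are hypothetical. For real dependency resolution, the specifier classes of the third-party packaging library implement the full PEP 440 rules.

```python
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a plain numeric version such as '1.1.7' into (1, 1, 7)."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(candidate: str, base: str) -> bool:
    """Simplified check of 'candidate ~= base' for plain numeric versions.

    '~= 1.1.7' is treated as '>= 1.1.7, == 1.1.*': the candidate must not be
    older than the base and must share every segment except the last one.
    """
    base_release = parse_release(base)
    if len(base_release) < 2:
        raise ValueError("'~=' needs at least two release segments")
    candidate_release = parse_release(candidate)
    prefix = base_release[:-1]
    return (
        candidate_release >= base_release
        and candidate_release[: len(prefix)] == prefix
    )
```

With this sketch, niworkflows 1.1.7 and 1.1.9 satisfy ~= 1.1.7, while 1.1.6 (too old) and 1.2.0 (new minor series) do not.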
We maintain a Versions Matrix to document and keep track of these dependencies.
A minor release series will continue to accept qualifying bug fixes at least until the next minor release. A minimum duration may be considered, or a fixed number of minor release series might be simultaneously supported.
An unmaintained series is a valid target for bug fixes after the support window, but the expected effort level of the contributor and maintainers will be higher and lower, respectively.
"},{"location":"devs/releases/#long-term-support-series","title":"Long-term support series","text":"
A long-term support (LTS) series is a minor release series that an LTS manager commits to maintaining for a specific duration, no less than one year. LTS series are under the same constraints as a minor release series in terms of what changes can be accepted.
The fMRIPrep developers commit to maintaining one LTS series at all times, at intervals of approximately one year. Community members may volunteer to assume maintainership after the initial period, or to maintain another minor release series as LTS.
Support windows of greater than a year have a much higher potential to run into issues with upstream dependencies going outside of their support windows. As much as possible, an fMRIPrep minor release should seek to move to the versions of upstream dependencies that will ensure the longest support before being considered for LTS.
Additional tasks required of an LTS manager:
Tracking possible breaking changes and broken URLs in upstream projects outside of the nipreps ecosystem.
Fixing bugs that are identified as existing within the LTS series and can be fixed without breaking API or derivative compatibility.
As many dependencies as possible should be pinned to specific versions relevant to the environment they are installed in. Packages (Debian .deb files, conda packages, Python wheels) should be archived in case of a loss of the external packages.
sMRIPrep requires niworkflows and generally must depend on one minor series of niworkflows for the duration of an sMRIPrep minor series. Each sMRIPrep series may also be depended on for an fMRIPrep series and/or a dMRIPrep series. Noting these dependencies here should make it easier to track when a new minor series needs to be created.
"},{"location":"intro/nipreps/","title":"Framework","text":""},{"location":"intro/nipreps/#building-on-fmripreps-success-story","title":"Building on fMRIPrep's success story","text":"
The current neuroimaging workflow has matured into a large chain of processing and analysis steps involving a large number of experts, across imaging modalities and applications. The development and fast adoption of fMRIPrep have revealed that neuroscientists need tools that simplify their research workflow, provide visual reports and checkpoints, and engender trust in the tool itself. The NiPreps framework extends fMRIPrep's approach and principles to new imaging modalities. The vision for NiPreps is to provide end-users (i.e., researchers) with applications that allow them to perform quality control smoothly and to prepare their data for modeling and statistical analysis.
NiPreps leverage the Brain Imaging Data Structure (BIDS) to understand all the particular features and available metadata (i.e., imaging parameters) of the input dataset. BIDS allows NiPreps to automatically stage the most adequate preprocessing workflow while minimizing manual intervention.
The NiPreps framework (Figure 1) encompasses a wide array of software projects organized into three layers of scientific software:
Software infrastructure: including quite mature projects such as NiPype and NiBabel; the standard specifications of the Brain Imaging Data Structure (BIDS, and BIDS-Derivatives); and some other tools such as NiTransforms or TemplateFlow, under development. These tools deliver low-level interfaces (e.g., data access to images and spatial transforms) and utilities (see Figure 1).
Middleware: these are utilities that generalize their functionalities across the end-user tools. These utilities cover foundational processing methodologies (e.g., NiWorkflows and SDCflows), the crowdsourcing of metadata (e.g., MRIQC Web-API), and the support for deep learning models (MRIQC-nets).
End-user tools such as fMRIPrep: Some existing end-user tools include sMRIPrep (Structural MRI Preprocessing), which lies in between an end-user tool and middleware, as it is involved in higher-level tools such as fMRIPrep. Finally, quality control tools (e.g., MRIQC) are meant to be executed before any preprocessing happens.
NiRodents (GitHub): middleware adaptations for small animals imaging.
NiBabies (GitHub): middleware adaptations for infant imaging.
"},{"location":"intro/transparency/","title":"Transparency of workflows","text":"
NiPreps adopt fMRIPrep's foundations, and particularly resonate with the transparency principles. As discussed in (Esteban et al., 2019 -- preprint):
The rapid increase in the volume and diversity of data, as well as the evolution of available techniques for processing and analysis, presents an opportunity for considerable advancement of research in neuroscience. The drawback resides in the need for progressively more complex analysis workflows that rely on decreasingly interpretable models of the data. Such context encourages \u2018black-box\u2019 solutions that efficiently perform a valuable service but do not provide insights into how the tool has transformed the data into the expected outputs. Black boxes obscure important steps in the inductive process mediating between experimental measurements and reported findings. This way of moving forward risks producing a future generation of cognitive neuroscientists who have become experts in sophisticated computational methods but have little to no working knowledge of how their data were transformed through processing. Transparency is often identified as a remedy for these problems. fMRIPrep ascribes to \u2018glass-box\u2019 principles, which are defined in opposition to the many different facets or levels at which black-box solutions are opaque. The visual reports that fMRIPrep generates are a crucial aspect of the glass-box approach. Their quality control checkpoints represent the logical flow of preprocessing, allowing scientists to critically inspect and better understand the underlying mechanisms of the workflow. A second transparency element is the citation boilerplate that formalizes all details of the workflow and provides the versions of all involved tools along with references to the corresponding scientific literature. A third asset for transparency is thorough documentation that delivers additional details on each of the building blocks represented in the visual reports and described in the boilerplate. 
Further, fMRIPrep has been open-source since its inception: users have access to all of the incremental additions to the tool through the history of the version-control system. The use of GitHub grants access to the discussions held during development, allowing one to see how and why the main design decisions were made. The modular design of fMRIPrep enhances its flexibility and improves transparency, as the main features of the software are more easily accessible to potential collaborators. In combination with some coding style and contribution guidelines, this modularity has enabled multiple contributions by peers and the creation of a rapidly growing community that would be difficult to nurture behind closed doors.
One foundational component of the NiPreps framework is the Visual Report System. End-user applications such as fMRIPrep or dMRIPrep generate individual reports after their preprocessing. Those visual reports have two fundamental purposes:
assessing the quality of the generated outputs, permitting the user to take quality control actions to eliminate biases originating from inadequate processing; and
understanding the workflow: by sequentially presenting the main steps of processing, the reports let the user see why the tool in particular took these steps and, more generally, why standard preprocessing involves them.
NiPreps leverage the wealth of existing neuroimaging software that is available to researchers. To give back for standing on the shoulders of giants, NiPreps aim at the most thorough reporting possible crediting all the pieces of the prior knowledge they leverage. With the execution of some particular NiPreps, the application runs some introspection code to formalize the computational graph the particular workflow executed and iterates over all the nodes to extract the relevant articles and communications that should be cited, as well as all software tools and their versions involved. Similarly, ancillary materials such as neuroimaging templates and atlases are reported and cited.
All these references and citations are finally collated in a natural language description of the workflow. This description is therefore generated automatically, and contains all the details that are necessary to replicate the processing, as well as the abovementioned references. The text is appended to the visual report, and provided in three formats (markdown, latex and html/plain-text) with an index of citations, so that the user is only required to \"copy-and-paste\" into the Methods section of their papers.
Note for reviewers and editors
The boilerplate text generated by some NiPreps is intended to allow for clear, consistent description of the preprocessing steps used, in order to improve the reproducibility of studies. We fully intend for it to be copied verbatim, and have released it under the CC0 license, dedicating it to the public domain in jurisdictions that recognize the concept, and assert that we will take no action to enforce copyright in jurisdictions where we cannot disclaim it.
We firmly believe that requiring authors to modify this passage will serve no legitimate scientific or literary purpose and can, in fact, serve only to reduce the replicability of the analysis being described by making the preprocessing steps less clear.
We recognize that there may be automated plagiarism detection software that will flag the boilerplate text. We would be happy to discuss potential solutions for annotating boilerplate sections of documents to indicate automatic generation, and can update our software to make this annotation simpler for authors.
"},{"location":"news/","title":"News and Announcements","text":""},{"location":"news/#register-for-the-nipreps-hackathon-with-the-ohbm23-brainhack","title":"Register for the NiPreps hackathon with the OHBM'23 Brainhack!","text":"
We are thrilled to announce that the NiPreps Hackathon's second edition will be part of the upcoming OHBM'23 Brainhack (July 19-21, Maison Notman House, Montreal, Canada).
Registration To join us for this incredible event and work on NiPreps-related projects, please fill in our registration form.
Please remember to also register on the official webpage of the OHBM Brainhack. You will find all the necessary information, event schedule, and location details on Brainhack's website.
Approach and projects We will advance (online) some projects as much as possible before the BrainHack. We are putting together a list of potential projects at https://github.com/orgs/nipreps/projects/8. Please feel free to let us know your ideas and voice your questions. Projects can start at any moment (even at the venue in Montreal) to have the flexibility to accommodate all ideas.
Those projects with preliminary work will have project leaders who will organize meetings, coordinate a roadmap and help carry out the necessary tasks.
See you in Montreal!
"},{"location":"news/#nipreps-roundups-feb-22-2023","title":"NiPreps Roundups Feb 22, 2023","text":"
We resumed the bi-monthly NiPreps Roundups with a first meeting on February 22, 2023.
Educational Talk at OHBM 2023 - Quality Control in fMRI studies with MRIQC and fMRIPrep
"},{"location":"users/talks/","title":"Talks and presentations","text":"
NiPreps @ BrainHack Seoul 2024
Standardizing neuroimaging workflows (Journal Club @ EPFL 2023)
Presentation about MRIQC for INCF 2022 (10 min)
NiPreps introduction, Educational Session at OHBM 2022
Building community workflows, BrainHack Donostia 2020
Building communities around reproducible workflows, Open Reproducible Neuroscience workshop 2020
Reproducible workflows, Think Open Rovereto Workshop 2020
"}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"NeuroImaging PREProcessing toolS (NiPreps)","text":"
NiPreps augment the scanner to produce data directly consumable by analyses.
We refer to data directly consumable by analyses as analysis-grade data by analogy with the concept of \"sushi-grade (or sashimi-grade) fish\" in that both are products that have been:
minimally preprocessed, but are
safe to consume directly.
"},{"location":"#building-on-the-success-story-of-fmriprep","title":"Building on the success story of fMRIPrep","text":"
NiPreps were conceived as a generalization of fMRIPrep across new modalities, populations, cohorts, and species. fMRIPrep is widely adopted, as our telemetry with Sentry (and now, in-house with migas) shows:
fMRIPrep is executed an average of 9,500 times every week, of which around 7,000 finish successfully (72.9% success rate). The average number of executions started includes debug and dry runs where researchers do not intend to actually process data. Therefore, the effective (that is, discarding test runs) success ratio of fMRIPrep is likely higher."},{"location":"apps/datalad/","title":"Git-Annex and DataLad within containers","text":"
Apps may be able to identify if the input dataset is handled with DataLad or Git-Annex, and pull down linked data that has not been fetched yet. One such application is MRIQC, and all the examples on this documentation page will refer to it.
Summary
Executing BIDS-Apps leveraging DataLad-controlled datasets within containers can be tricky. In particular, one of our general recommendations involves mounting or binding folders into the container in read-only mode, which will disallow DataLad from writing to the dataset tree. Similarly, and depending on the specific runtime settings of the container framework, DataLad may encounter issues with file ownership too. This section guides users through ensuring smooth execution of BIDS-Apps on DataLad/Git-annex-managed datasets.
"},{"location":"apps/datalad/#datalad-and-docker","title":"DataLad and Docker","text":"
When executing MRIQC within Docker on a DataLad dataset (for instance, installed from OpenNeuro), we will need to ensure the following settings are observed:
the user id (uid) who installed the DataLad dataset must match the uid who is executing MRIQC within the container runtime
the uid who is executing MRIQC within the container must have sufficient permissions to write in the tree.
If the uid is not correct, we will likely encounter the following error:
datalad.runner.exception.CommandError: CommandError: 'git -c diff.ignoreSubmodules=none -c core.quotepath=false -c annex.merge-annex-branches=false annex find --not --in . --json --json-error-messages -c annex.dotfiles=true -- sub-0001/func/sub-0001_task-restingstate_acq-mb3_bold.nii.gz sub-0002/func/sub-0002_task-emomatching_acq-seq_bold.nii.gz sub-0002/func/sub-0002_task-restingstate_acq-mb3_bold.nii.gz sub-0001/func/sub-0001_task-emomatching_acq-seq_bold.nii.gz sub-0001/func/sub-0001_task-faces_acq-mb3_bold.nii.gz sub-0001/dwi/sub-0001_dwi.nii.gz sub-0002/func/sub-0002_task-workingmemory_acq-seq_bold.nii.gz sub-0001/anat/sub-0001_T1w.nii.gz sub-0002/anat/sub-0002_T1w.nii.gz sub-0001/func/sub-0001_task-gstroop_acq-seq_bold.nii.gz sub-0002/func/sub-0002_task-faces_acq-mb3_bold.nii.gz sub-0002/func/sub-0002_task-anticipation_acq-seq_bold.nii.gz sub-0002/dwi/sub-0002_dwi.nii.gz sub-0001/func/sub-0001_task-anticipation_acq-seq_bold.nii.gz sub-0001/func/sub-0001_task-workingmemory_acq-seq_bold.nii.gz sub-0002/func/sub-0002_task-gstroop_acq-seq_bold.nii.gz' failed with exitcode 1 under /data [info keys: stdout_json] [err: 'git-annex: Git refuses to operate in this repository, probably because it is owned by someone else.\n\nTo add an exception for this directory, call:\ngit config --global --add safe.directory /data\n\ngit-annex: automatic initialization failed due to above problems']\n
Confusingly, following the suggestion from DataLad directly on the host (git config --global --add safe.directory /data) will not work in this case, because this line must be executed within the container.
Instead, we can override the default user executing within the container (which is root, or uid = 0). This can be achieved with Docker's -u/--user option:
We can combine this option with Bash's id command to ensure the current user's uid and group id (gid) are being set. Let's update the last example in the previous Docker execution section:
The above command line will ensure MRIQC is executed with the current uid and gid, which will match the filesystem's permissions if the dataset was installed by the same user.
Match uid and gid with those corresponding to the user who installed the dataset
When different users are to install the dataset and execute the application, Docker must be executed with the uid and gid corresponding to the user who installed the dataset. The uid corresponding to a given username (for instance janedoe) can be obtained as follows:
getent passwd \"janedoe\" | cut -f 3 -d \":\"\n
and her gid:
getent passwd \"janedoe\" | cut -f 4 -d \":\"\n
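The same lookup can be sketched in Python, which may be convenient when the Docker command line is assembled by a script. This is only an illustration: the function names are hypothetical, the passwd entry in the usage note (with uid 1001 and gid 1002) is made up, and the pwd module is only available on Unix-like systems.

```python
from typing import Tuple


def uid_gid_from_passwd_entry(entry: str) -> Tuple[int, int]:
    """Extract (uid, gid) from one passwd-style line, as printed by getent.

    Fields are colon-separated; the third is the uid and the fourth the gid,
    which is what the cut -f 3 and cut -f 4 calls above select.
    """
    fields = entry.strip().split(":")
    if len(fields) < 4:
        raise ValueError(f"not a passwd entry: {entry!r}")
    return int(fields[2]), int(fields[3])


def docker_user_option(username: str) -> str:
    """Build the 'uid:gid' value for Docker's -u/--user option.

    Queries the local user database directly instead of calling getent.
    """
    import pwd  # Unix-only standard library module

    record = pwd.getpwnam(username)
    return f"{record.pw_uid}:{record.pw_gid}"
```

For a made-up entry such as janedoe:x:1001:1002:Jane Doe:/home/janedoe:/bin/bash, the first function yields the pair (1001, 1002), which would then be passed to Docker as -u 1001:1002.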
"},{"location":"apps/datalad/#mounting-the-dataset-folder-without-read-only-permissions","title":"Mounting the dataset folder without read-only permissions","text":"
If the dataset is protected with read-only permissions, then MRIQC will hit the following error (see nipreps/mriqc#1363):
get(error): sub-0001/func/sub-0001_task-restingstate_acq-mb3_bold.nii.gz (file) [git-annex: .git/annex/tmp: createDirectory: permission denied (Read-only file system)]\naction summary:\n get (error: 1)\nTraceback (most recent call last):\n File \"/opt/conda/bin/mriqc\", line 8, in <module>\n sys.exit(main())\n ^^^^^^\n File \"/opt/conda/lib/python3.11/site-packages/mriqc/cli/run.py\", line 43, in main\n parse_args(argv)\n File \"/opt/conda/lib/python3.11/site-packages/mriqc/cli/parser.py\", line 658, in parse_args\n initialize_meta_and_data()\n File \"/opt/conda/lib/python3.11/site-packages/mriqc/utils/misc.py\", line 447, in initialize_meta_and_data\n _datalad_get(dataset)\n File \"/opt/conda/lib/python3.11/site-packages/mriqc/utils/misc.py\", line 282, in _datalad_get\n return get(\n ^^^^\n File \"/opt/conda/lib/python3.11/site-packages/datalad/interface/base.py\", line 773, in eval_func\n return return_func(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/opt/conda/lib/python3.11/site-packages/datalad/interface/base.py\", line 763, in return_func\n results = list(results)\n ^^^^^^^^^^^^^\n File \"/opt/conda/lib/python3.11/site-packages/datalad_next/patches/interface_utils.py\", line 287, in _execute_command_\n raise IncompleteResultsError(\ndatalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 1 failed:\n[{'action': 'get',\n 'annexkey': 'MD5E-s76037251--344f061a3165c71e36b98ad1649c3c8c.nii.gz',\n 'error_message': 'git-annex: .git/annex/tmp: createDirectory: permission '\n 'denied (Read-only file system)',\n 'path': '/data/sub-0001/func/sub-0001_task-restingstate_acq-mb3_bold.nii.gz',\n 'refds': '/data',\n 'status': 'error',\n 'type': 'file'}]\n
This error indicates that the container is already being executed with the appropriate uid and gid pair, but the dataset folder is mounted read-only. In this case, we need to ensure DataLad can write to the dataset installation when obtaining new data. This is easily achieved by removing the read-only modifier (:ro) from the mount option:
# The dataset is mounted WITHOUT :ro, and -u sets the execution uid:gid\n$ docker run -ti --rm \\\n -v $HOME/ds002785:/data \\\n -v $HOME/ds002785/derivatives:/out \\\n -v $HOME/tmp/ds002785-workdir:/work \\\n -u $(id -u):$(id -g) \\\n nipreps/mriqc:<latest-version> \\\n \\\n /data /out/mriqc-<latest-version> \\\n participant \\\n -w /work\n
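Before launching the container, it may help to confirm from the host that both conditions (matching uid and write access) hold. The following is only a sketch: the function name check_dataset_access is hypothetical, and it assumes a POSIX shell with ls, awk and id available.

```shell
# Hypothetical helper: verify that the current user owns the dataset
# folder and can write into it (both are needed for DataLad to fetch data).
check_dataset_access() {
    dataset=$1
    # The third column of `ls -nd` is the numeric uid of the folder's owner.
    owner_uid=$(ls -nd "$dataset" | awk '{ print $3 }')
    if [ "$owner_uid" != "$(id -u)" ]; then
        echo "uid mismatch: dataset owned by uid '$owner_uid', running as uid $(id -u)"
        return 1
    fi
    if [ ! -w "$dataset" ]; then
        echo "dataset is not writable by uid $(id -u)"
        return 2
    fi
    echo "ok"
}
```

For instance, check_dataset_access $HOME/ds002785 could be run right before the docker run call above, and the container launched only if it prints ok.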
"},{"location":"apps/datalad/#datalad-and-singularityapptainer","title":"DataLad and Singularity/Apptainer","text":"
In the case of Singularity and Apptainer, ensuring that the uid executing the container matches the uid that installed the dataset involves using user namespace mappings. Therefore, you will need to contact your system administrator to figure out a convenient solution to the problem.
Since most Singularity/Apptainer deployments automatically bind the user's $HOME directory, DataLad's suggested direction may work:
Allowing the container to write on the dataset's tree is straightforward and analogous to Docker: remove the :ro setting from the binding option (-B).
"},{"location":"apps/docker/","title":"Executing with Docker","text":"
Summary
Here, we describe how to run NiPreps with Docker containers. To illustrate the process, we will show the execution of fMRIPrep, but these guidelines extend to any other end-user NiPrep.
"},{"location":"apps/docker/#before-you-start-install-docker","title":"Before you start: install Docker","text":"
Docker is probably the most popular framework to execute containers. If you plan to run a NiPrep on your PC/laptop, this is the RECOMMENDED way of execution. Please make sure you follow the Docker installation instructions. You can check your Docker Runtime installation by running the hello-world image:
$ docker run --rm hello-world\n
If you have a functional installation, then you should obtain the following output:
Hello from Docker!\nThis message shows that your installation appears to be working correctly.\n\nTo generate this message, Docker took the following steps:\n 1. The Docker client contacted the Docker daemon.\n 2. The Docker daemon pulled the \"hello-world\" image from the Docker Hub.\n (amd64)\n 3. The Docker daemon created a new container from that image which runs the\n executable that produces the output you are currently reading.\n 4. The Docker daemon streamed that output to the Docker client, which sent it\n to your terminal.\n\nTo try something more ambitious, you can run an Ubuntu container with:\n $ docker run -it ubuntu bash\n\nShare images, automate workflows, and more with a free Docker ID:\n https://hub.docker.com/\n\nFor more examples and ideas, visit:\n https://docs.docker.com/get-started/\n
After checking your Docker Engine is capable of running Docker images, you are ready to pull your first NiPreps container image.
For every new version of the particular NiPrep app that is released, a corresponding Docker image is generated. The Docker image becomes a container when the execution engine loads the image and adds an extra layer that makes it runnable. In order to run NiPreps Docker images, the Docker Runtime must be installed.
Taking fMRIPrep to illustrate the usage, first you might want to make sure of the exact version of the tool to be used:
$ docker pull nipreps/fmriprep:<latest-version>\n
You can run NiPreps interacting directly with the Docker Engine via the docker run interface.
"},{"location":"apps/docker/#running-a-niprep-with-a-lightweight-wrapper","title":"Running a NiPrep with a lightweight wrapper","text":"
Some NiPreps include a lightweight wrapper script for convenience. That is the case of fMRIPrep and its fmriprep-docker wrapper. Before starting, make sure you have the wrapper installed. When you run fmriprep-docker, it will generate a Docker command line for you, print it out for reporting purposes, and then execute it without further action needed, e.g.:
fmriprep-docker implements the unified command-line interface of BIDS Apps, and automatically translates directories into Docker mount points for you.
We have published a step-by-step tutorial illustrating how to run fmriprep-docker. This tutorial also provides valuable troubleshooting insights and advice on what to do after fMRIPrep has run.
"},{"location":"apps/docker/#running-a-niprep-directly-interacting-with-the-docker-engine","title":"Running a NiPrep directly interacting with the Docker Engine","text":"
If you need finer control over the container execution, or you feel comfortable with the Docker Engine, avoiding the extra software layer of the wrapper might be a good decision.
Accessing filesystems in the host within the container: Containers are confined in a sandbox, so they can't access the host in any way unless you explicitly prescribe acceptable accesses to the host. The Docker Engine provides mounting filesystems into the container with the -v argument and the following syntax: -v some/path/in/host:/absolute/path/within/container:ro, where the trailing :ro specifies that the mount is read-only. The mount permissions modifiers can be omitted, which means the mount will have read-write permissions. In general, you'll want to at least provide two mount-points: one set in read-only mode for the input data and one read/write to store the outputs. Potentially, you'll want to provide one or two more mount-points: one for the working directory, in case you need to debug some issue or reuse pre-cached results; and a TemplateFlow folder to preempt the download of your favorite templates in every run.
Running containers as a user: By default, Docker will run the container as root. Some shared systems may limit this feature and only allow running containers as a user. When the container is run as root, files written out to filesystems mounted from the host will be owned by root (user id 0) by default. In other words, you'll need to be able to run as root in the host to change permissions or manage these files. Alternatively, running as a user allows preempting these permissions issues. It is possible to run as a user with the -u argument. In general, we will want to use the same user ID as the running user in the host to ensure the ownership of files written during the container execution. Therefore, you will generally run the container with -u $( id -u ).
Once the Docker Engine arguments are written, the remainder of the command line follows the usage. In other words, the first section of the command line is equivalent to calling the fmriprep executable in a bare-metal installation:
# Everything up to and including the image name is equivalent to a call to the\n# App's entry-point; the remaining lines are the particular BIDS App arguments.\n$ docker run -ti --rm \\\n -v $HOME/ds005:/data:ro \\\n -v $HOME/ds005/derivatives:/out \\\n -v $HOME/tmp/ds005-workdir:/work \\\n nipreps/fmriprep:<latest-version> \\\n \\\n /data /out/fmriprep-<latest-version> \\\n participant \\\n -w /work\n
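The mount and user recommendations of this section (a read-only input mount, read-write output and work mounts, and -u with the host user's id) can be combined in a single command line. The sketch below only prints the command so it can be reviewed before running it; the function name compose_docker_cmd is hypothetical, and the paths simply mirror the ds005 example.

```shell
# Sketch: print (rather than execute) a docker command line with one
# read-only input mount, read-write output and work mounts, and -u.
compose_docker_cmd() {
    bids_dir=$1
    out_dir=$2
    work_dir=$3
    echo docker run -ti --rm \
        -u "$(id -u)" \
        -v "$bids_dir:/data:ro" \
        -v "$out_dir:/out" \
        -v "$work_dir:/work" \
        'nipreps/fmriprep:<latest-version>' \
        /data '/out/fmriprep-<latest-version>' participant -w /work
}

# Paths taken from the example in this section; adjust them to your setup.
compose_docker_cmd "$HOME/ds005" "$HOME/ds005/derivatives" "$HOME/tmp/ds005-workdir"
```

Printing first and executing later is the same idea the fmriprep-docker wrapper follows when it reports the generated command line.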
"},{"location":"apps/framework/","title":"Introduction","text":""},{"location":"apps/framework/#what-is-bids","title":"What is BIDS?","text":"
The Brain Imaging Data Structure (BIDS) is a standard for organizing and describing brain datasets, including MRI. The common naming convention and folder structure allow researchers to easily reuse BIDS datasets, re-apply analysis protocols, and run standardized automatic data preprocessing pipelines (and particularly, BIDS Apps). The BIDS starter-kit contains a wide collection of educational resources. Validity of the structure can be assessed with the online BIDS-Validator. The tree of a typical, valid (BIDS-compliant) dataset is shown below:
"},{"location":"apps/framework/#what-is-a-bids-app","title":"What is a BIDS App?","text":"
(Taken from the BIDS Apps paper)
A BIDS App is a container image capturing a neuroimaging pipeline that takes a BIDS-formatted dataset as input. Since the input is a whole dataset, apps are able to combine multiple modalities, sessions, and/or subjects, but at the same time need to implement ways to query input datasets. Each BIDS App has the same core set of command-line arguments, making them easy to run and integrate into automated platforms. BIDS Apps are constructed in a way that does not depend on any software outside of the container image other than the container engine.
BIDS Apps rely upon two technologies for container computing:
Docker \u2014 for building, hosting as well as running containers on local hardware (running Windows, Mac OS X or Linux) or in the cloud.
Singularity \u2014 for running containers on HPCs (high-performance computing).
BIDS Apps are deposited in the Docker Hub repository, making them openly accessible. Each app is versioned and all of the historical versions are available to download. By reporting the BIDS App name and version in a manuscript, authors can provide others with the ability to exactly replicate their analysis workflow.
Docker is used for its excellent documentation, maturity, and the Docker Hub service for storage and distribution of the images. Docker containers are easily run on personal computers and cloud services. However, the Docker Runtime was originally designed to run different components of web services (HTTP servers, databases etc.) using cloud resources. Docker thus requires root or root-like permissions, as well as modern versions of Linux kernel (to perform user mapping and management of network resources); though this is not a problem in context of renting cloud resources (which are not shared with other users), it makes it difficult or impossible to use in a multi-tenant environment such as an HPC system, which is often the most cost-effective computational resource available to researchers.
Singularity, on the other hand, is a unique container technology designed from the ground up with the encapsulation of binary dependencies and HPC use in mind. Its main advantage over Docker is that it does not require root access for container execution and thus is safe to use on multi-tenant systems. In addition, it does not require recent Linux kernel functionalities (such as namespaces, cgroups and capabilities), making it easy to install on legacy systems.
BIDS Apps decouple the individual-level analysis (processing of independent subjects) from group-level analyses aggregating participants. For the analysis of individual subjects, Apps need to understand the BIDS structure of the input dataset, so that the required inputs for the designated subject are found. Apps are designed to easily process derivatives generated by the participant-level or other Apps. The overall workflow has an entry-point and an end-point responsible for setting up the map-reduce tasks and for the tear-down, including organizing the outputs for archiving, respectively. Each App may implement multiple map and reduce steps.
To improve user experience and ability to integrate BIDS Apps into various computational platforms, each App follows a set of core command-line arguments:
In this case, we have selected to run the participant level (to process individual subjects). fMRIPrep does not have a group level, but other BIDS Apps may have one. For instance, MRIQC generates group-level reports with the following command-line:
"},{"location":"apps/framework/#what-are-bids-derivatives","title":"What are BIDS Derivatives?","text":"
NiPreps generate derivatives of the original data, and they fulfill the BIDS specification for the results of Apps that are created for subsequent consumption by other BIDS-Apps. These derivatives must follow the BIDS Derivatives specification (draft). An example of BIDS Derivatives filesystem tree, generated with fMRIPrep 1.5:
"},{"location":"apps/singularity/","title":"Executing with Singularity","text":"
Summary
Here, we describe how to run NiPreps with Singularity containers. To illustrate the process, we will show the execution of fMRIPrep, but these guidelines extend to any other end-user NiPrep.
"},{"location":"apps/singularity/#preparing-a-singularity-image","title":"Preparing a Singularity image","text":"
Singularity version >= 2.5: If the version of Singularity installed on your HPC (High-Performance Computing) system is modern enough, you can create a Singularity image directly on the system. This is as simple as:
where <version> should be replaced with the desired version of fMRIPrep that you want to download.
Singularity version < 2.5: In this case, start with a machine (e.g., your personal computer) with Docker installed. Use docker2singularity to create a singularity image. You will need an active internet connection and some time:
Singularity by default exposes all environment variables from the host inside the container. Because of this, your host libraries (e.g., NiPype or a Python environment) could be accidentally used instead of the ones inside the container. To avoid such a situation, we strongly recommend using the --cleanenv argument in all scenarios. For example:
Alternatively, conflicts might be preempted and some problems mitigated by unsetting potentially problematic settings, such as the PYTHONPATH variable, before running:
It is possible to define environment variables scoped within the container by using the SINGULARITYENV_* magic, in combination with --cleanenv. For example, we can set the FreeSurfer license variable (see fMRIPrep's documentation on this) as follows:
As we can see, the export in the first line tells Singularity to set a corresponding environment variable of the same name after dropping the prefix SINGULARITYENV_.
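To double-check which variables will reach the container, the prefix-dropping rule can be reproduced on the host. This is a minimal sketch: the function name preview_container_env is hypothetical, and the license path is the one used in the SLURM example of this page.

```shell
# List the variables that would be defined inside the container, i.e.,
# every host variable named SINGULARITYENV_<NAME>, shown as <NAME>=<value>.
preview_container_env() {
    env | sed -n 's/^SINGULARITYENV_//p'
}

export SINGULARITYENV_FS_LICENSE=$HOME/.freesurfer.txt
preview_container_env
```

Here the listing should include an FS_LICENSE entry pointing at the license file, which is the name the variable would take inside the container.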
"},{"location":"apps/singularity/#accessing-the-hosts-filesystem","title":"Accessing the host's filesystem","text":"
Depending on how Singularity is configured on your cluster it might or might not automatically bind (mount or expose) host's folders to the container (e.g., /scratch, or $HOME). This is particularly relevant because, if you can't run Singularity in privileged mode (which is almost certainly true in all the scenarios), Singularity containers are read only. This is to say that you won't be able to write anything unless Singularity can access the host's filesystem in write mode.
By default, Singularity automatically binds (mounts) the user's home directory and a scratch directory. In addition, Singularity generally allows binding the necessary folders with the -B <host_folder>:<container_folder>[:<permissions>] Singularity argument. For example:
If your Singularity installation doesn't allow you to bind non-existent bind points, you'll get an error saying WARNING: Skipping user bind, non existent bind point (directory) in container. In this scenario, you can either try to bind things onto some other bind point you know exists in the image, or rebuild your Singularity image with docker2singularity as follows:
In the example above, the following bind points are created: /gpfs, /scratch, /work, /share, /opt/templateflow.
Important
One great feature of containers is their confinement or isolation from the host system. Binding mount points breaks this principle, as the container can now make changes in the host. Therefore, it is generally recommended to use binding sparingly, granting very limited access to the minimum necessary resources. In other words, it is preferable to bind just one subdirectory of $HOME rather than the full $HOME directory of the host (see nipreps/fmriprep#1778 (comment)).
Relevant aspects of the $HOME directory within the container: By default, Singularity will bind the user's $HOME directory in the host into /home/$USER (or equivalent) in the container. Most of the time, it will also redefine the $HOME environment variable and update it to point to the corresponding mount point in /home/$USER. However, these defaults can be overridden in your system. It is recommended to check your settings with your system's administrators. If your Singularity installation allows it, you can work around the $HOME specification by combining the bind mounts argument (-B) with the home overwrite argument (--home) as follows:
"},{"location":"apps/singularity/#templateflow-and-singularity","title":"TemplateFlow and Singularity","text":"
TemplateFlow is a helper tool that allows neuroimaging workflows to programmatically access a repository of standard neuroimaging templates. In other words, TemplateFlow allows NiPreps to dynamically change the templates that are used, e.g., in the atlas-based brain extraction step or spatial normalization.
Default settings in the Singularity image should get along with the Singularity installation of your system. However, deviations from the default configurations of your installation may break this compatibility. A particularly problematic case arises when the home directory is mounted in the container, but the $HOME environment variable is not correspondingly updated. Typically, you will experience errors like OSError: [Errno 30] Read-only file system or FileNotFoundError: [Errno 2] No such file or directory: '/home/fmriprep/.cache'.
If it is not explicitly forbidden in your installation, the first attempt to overcome this issue is manually setting the $HOME directory as follows:
$ singularity run --home $HOME --cleanenv fmriprep.simg <fmriprep arguments>\n
If the user's home directory is not automatically bound, then the second step would include manually binding it as in the section above:
Finally, if the --home argument cannot be used, you'll need to provide the container with writable filesystems where TemplateFlow's files can be downloaded. In addition, you will need to instruct fMRIPrep to update the default paths with the new mount points by setting the SINGULARITYENV_TEMPLATEFLOW_HOME variable:
# Tell the NiPrep where TemplateFlow will place downloads\n$ export SINGULARITYENV_TEMPLATEFLOW_HOME=/opt/templateflow\n$ singularity run -B <writable-path-on-host>:/opt/templateflow \\\n --cleanenv fmriprep.simg <fmriprep arguments>\n
"},{"location":"apps/singularity/#restricted-internet-access","title":"Restricted Internet access","text":"
We have identified several conditions in which running NiPreps might fail because of spotty or impossible access to the Internet.
If your compute node cannot access the Internet, then you'll need to pull down from TemplateFlow all the resources that will be necessary ahead of run-time.
If that is not the case (i.e., you should be able to hit HTTP/s endpoints), then you can try the following:
VerifiedHTTPSConnection ... Failed to establish a new connection: [Errno 110] Connection timed out. If you encounter an error like this, you'll probably need to set up an HTTP proxy by exporting SINGULARITYENV_https_proxy (see nipreps/fmriprep#1778 (comment)). For example:
$ export SINGULARITYENV_https_proxy=http://<ip or proxy name>:<port>\n
requests.exceptions.SSLError: HTTPSConnectionPool .... In this case, your container seems to be able to reach the Internet, but unable to use SSL encryption. There are two potential solutions to the issue. The recommended one is setting REQUESTS_CA_BUNDLE to the appropriate path, and/or binding the appropriate filesystem:
Setting up a functional execution framework with Singularity might be tricky in some HPC (high-performance computing) systems. Please make sure you have read the relevant documentation of Singularity, and checked all the defaults and configuration in your system. The next step is checking the environment and access to fMRIPrep resources, using singularity shell.
Check access to input data folder, and BIDS validity:
$ singularity shell -B path/to/data:/data fmriprep.simg\nSingularity fmriprep.simg:~> ls /data\nCHANGES README dataset_description.json participants.tsv sub-01 sub-02 sub-03 sub-04 sub-05 sub-06 sub-07 sub-08 sub-09 sub-10 sub-11 sub-12 sub-13 sub-14 sub-15 sub-16 task-balloonanalogrisktask_bold.json\nSingularity fmriprep.simg:~> bids-validator /data\n 1: [WARN] You should define 'SliceTiming' for this file. If you don't provide this information slice time correction will not be possible. (code: 13 - SLICE_TIMING_NOT_DEFINED)\n ./sub-01/func/sub-01_task-balloonanalogrisktask_run-01_bold.nii.gz\n ./sub-01/func/sub-01_task-balloonanalogrisktask_run-02_bold.nii.gz\n ./sub-01/func/sub-01_task-balloonanalogrisktask_run-03_bold.nii.gz\n ./sub-02/func/sub-02_task-balloonanalogrisktask_run-01_bold.nii.gz\n ./sub-02/func/sub-02_task-balloonanalogrisktask_run-02_bold.nii.gz\n ./sub-02/func/sub-02_task-balloonanalogrisktask_run-03_bold.nii.gz\n ./sub-03/func/sub-03_task-balloonanalogrisktask_run-01_bold.nii.gz\n ./sub-03/func/sub-03_task-balloonanalogrisktask_run-02_bold.nii.gz\n ./sub-03/func/sub-03_task-balloonanalogrisktask_run-03_bold.nii.gz\n ./sub-04/func/sub-04_task-balloonanalogrisktask_run-01_bold.nii.gz\n ... and 38 more files having this issue (Use --verbose to see them all).\n Please visit https://neurostars.org/search?q=SLICE_TIMING_NOT_DEFINED for existing conversations about this issue.\n
Check access to output data folder, and whether you have write permissions
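A minimal sketch of such a check, which could be run from within the singularity shell session or directly on the host. The function name can_write is hypothetical, and /out is only an assumed bind point for the output folder.

```shell
# Hypothetical helper: try to create and remove a scratch file in a folder
# to confirm that the current user has write access to it.
can_write() {
    probe=$1/.write-test-$$
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "writable: $1"
    else
        echo "NOT writable: $1"
        return 1
    fi
}
```

For example, can_write /out reports whether outputs can be stored in the folder bound at /out.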
"},{"location":"apps/singularity/#running-singularity-on-a-slurm-system","title":"Running Singularity on a SLURM system","text":"
An example of an sbatch script to run fMRIPrep on a SLURM system is given below. The submission script will generate one task per subject using a job array.
#!/bin/bash\n#\n#SBATCH -J fmriprep\n#SBATCH --time=48:00:00\n#SBATCH -n 1\n#SBATCH --cpus-per-task=16\n#SBATCH --mem-per-cpu=4G\n#SBATCH -p normal,mygroup # Queue names you can submit to\n# Outputs ----------------------------------\n#SBATCH -o log/%x-%A-%a.out\n#SBATCH -e log/%x-%A-%a.err\n#SBATCH --mail-user=%u@domain.tld\n#SBATCH --mail-type=ALL\n# ------------------------------------------\n\nBIDS_DIR=\"$STUDY/data\"\nDERIVS_DIR=\"derivatives/fmriprep-20.2.2\"\nLOCAL_FREESURFER_DIR=\"$STUDY/data/derivatives/freesurfer-6.0.1\"\n\n# Prepare some writeable bind-mount points.\nTEMPLATEFLOW_HOST_HOME=$HOME/.cache/templateflow\nFMRIPREP_HOST_CACHE=$HOME/.cache/fmriprep\nmkdir -p ${TEMPLATEFLOW_HOST_HOME}\nmkdir -p ${FMRIPREP_HOST_CACHE}\n\n# Prepare derivatives folder\nmkdir -p ${BIDS_DIR}/${DERIVS_DIR}\n\n# Make sure FS_LICENSE is defined in the container.\nexport SINGULARITYENV_FS_LICENSE=$HOME/.freesurfer.txt\n\n# Designate a templateflow bind-mount point\nexport SINGULARITYENV_TEMPLATEFLOW_HOME=\"/templateflow\"\nSINGULARITY_CMD=\"singularity run --cleanenv -B $BIDS_DIR:/data -B ${TEMPLATEFLOW_HOST_HOME}:${SINGULARITYENV_TEMPLATEFLOW_HOME} -B $L_SCRATCH:/work -B ${LOCAL_FREESURFER_DIR}:/fsdir $STUDY/images/fmriprep_20.2.2.simg\"\n\n# Parse the participants.tsv file and extract one subject ID from the line corresponding to this SLURM task.\nsubject=$( sed -n -E \"$((${SLURM_ARRAY_TASK_ID} + 1))s/sub-(\\S*)\\>.*/\\1/gp\" ${BIDS_DIR}/participants.tsv )\n\n# Remove IsRunning files from FreeSurfer\nfind ${LOCAL_FREESURFER_DIR}/sub-$subject/ -name \"*IsRunning*\" -type f -delete\n\n# Compose the command line\ncmd=\"${SINGULARITY_CMD} /data /data/${DERIVS_DIR} participant --participant-label $subject -w /work/ -vv --omp-nthreads 8 --nthreads 12 --mem_mb 30000 --output-spaces MNI152NLin2009cAsym:res-2 anat fsnative fsaverage5 --use-aroma --fs-subjects-dir /fsdir\"\n\n# Setup done, run the command\necho Running task ${SLURM_ARRAY_TASK_ID}\necho Commandline: $cmd\neval 
$cmd\nexitcode=$?\n\n# Output results to a table\necho \"sub-$subject ${SLURM_ARRAY_TASK_ID} $exitcode\" \\\n >> ${SLURM_JOB_NAME}.${SLURM_ARRAY_JOB_ID}.tsv\necho Finished tasks ${SLURM_ARRAY_TASK_ID} with exit code $exitcode\nexit $exitcode\n
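Because the script maps each array task to one row of participants.tsv (skipping the header row), the size of the job array should match the number of participants. The helpers below sketch that bookkeeping with awk; the function names are hypothetical, and they assume a tab- or space-separated participants.tsv whose first column holds sub-<label> identifiers.

```shell
# Count the participants listed in a BIDS participants.tsv
# (every non-empty line except the header row).
count_participants() {
    awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Print the subject label (without the sub- prefix) assigned to an array
# task: task N reads line N + 1 of the file, because line 1 is the header.
subject_for_task() {
    awk -v task="$2" 'NR == task + 1 { sub(/^sub-/, "", $1); print $1 }' "$1"
}
```

Assuming the script above were saved as fmriprep.slurm (a hypothetical filename), the submission could then look like sbatch --array=1-$(count_participants $STUDY/data/participants.tsv) fmriprep.slurm, where --array is the standard sbatch option for job arrays.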
"},{"location":"assets/ORN-Workshop/presentation/#building-communities-around-reproducible-workflows","title":"Building communities around reproducible workflows","text":""},{"location":"assets/ORN-Workshop/presentation/#o-esteban","title":"O. Esteban","text":""},{"location":"assets/ORN-Workshop/presentation/#chuv-lausanne-university-hospital","title":"CHUV | Lausanne University Hospital","text":""},{"location":"assets/ORN-Workshop/presentation/#wwwniprepsorg","title":"www.nipreps.org","text":"
]
???
"},{"location":"assets/ORN-Workshop/presentation/#im-going-to-talk-about-how-we-are-building-a-framework-of-preprocessing-pipelines-for-neuroimaging-called-nipreps-based-on-the-fmriprep-experience","title":"I'm going to talk about how we are building a framework of preprocessing pipelines for neuroimaging called NiPreps, based on the fMRIPrep experience.","text":"
although many neuroimaging areas are still in search of methodological breakthroughs,
challenges have moved on to the workflows:
workflows within traditional toolboxes - usually not flexible to adapt to new data
BIDS and BIDS-Apps.
???
researchers have a large portfolio of image processing components readily available
toolboxes with great support and active maintenance:
"},{"location":"assets/ORN-Workshop/presentation/#new-questions-changing-the-focus","title":"New questions changing the focus:","text":""},{"location":"assets/ORN-Workshop/presentation/#-validity-does-the-workflow-actually-work-out","title":"- validity (does the workflow actually work out?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-transparency-is-it-a-black-box-how-precise-is-reporting","title":"- transparency (is it a black-box? how precise is reporting?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-vibration-how-each-tool-choice-parameters-affect-overall","title":"- vibration (how each tool choice & parameters affect overall?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-throughput-how-much-datatime-can-it-possible-take","title":"- throughput (how much data/time can it possible take?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-robustness-can-i-use-it-on-diverse-studies","title":"- robustness (can I use it on diverse studies?)","text":""},{"location":"assets/ORN-Workshop/presentation/#-evaluation-what-is-it-unique-about-the-workflow-wrt-existing-alternatives","title":"- evaluation (what is it unique about the workflow, w.r.t. existing alternatives?)","text":""},{"location":"assets/ORN-Workshop/presentation/#the-garden-of-forking-paths","title":"The garden of forking paths","text":"
(Botvinik-Nezer et al., 2020)
Around 50% of teams used fMRIPrep'ed inputs.
"},{"location":"assets/ORN-Workshop/presentation/#the-fmriprep-story","title":"The fMRIPrep story","text":""},{"location":"assets/ORN-Workshop/presentation/#fmriprep-produces-analysis-ready-data-from-diverse-data","title":"fMRIPrep produces analysis-ready data from diverse data","text":"
minimal requirements (BIDS-compliant);
agnostic to downstream steps of the workflow
produces BIDS-Derivatives;
robust against inhomogeneity of data across studies
???
fMRIPrep takes in a task-based or resting-state functional MRI dataset in BIDS-format and returns preprocessed data ready for analysis.
Preprocessed data can be used for a broad range of analysis, and they are formatted following BIDS-Derivatives to maximize compatibility with: * major software packages (AFNI, FSL, SPM*, etc.) * further temporal filtering and denoising: fMRIDenoise * any BIDS-Derivatives compliant tool (e.g., FitLins).
--
"},{"location":"assets/ORN-Workshop/presentation/#fmriprep-is-a-bids-app-gorgolewski-et-al-2017","title":"fMRIPrep is a BIDS-App (Gorgolewski, et al. 2017)","text":"
adhered to modern software-engineering standards (CI/CD, containers)
compatible interface with other BIDS-Apps
optimized for automatic execution
???
fMRIPrep adopts the BIDS-App specifications. That means the software is tested with every change to the codebase, it also means that packaging, containerization, and deployment are also automated and require tests to be passing. BIDS-Apps are inter-operable (via BIDS-Derivatives), and optimized for execution in HPC, Cloud, etc.
--
"},{"location":"assets/ORN-Workshop/presentation/#minimizes-human-intervention","title":"Minimizes human intervention","text":"
avoid error-prone parameters settings (read them from BIDS)
adapts the workflow to the actual data available
while remaining flexible to some design choices (e.g., whether or not reconstructing surfaces or customizing target normalized standard spaces)
???
fMRIPrep minimizes human intervention because the user does not need to fiddle with any parameters - they are obtained from the BIDS structure. However, fMRIPrep does allow some flexibility to ensure the preprocessing meets the requirements of the intended analyses.
"},{"location":"assets/ORN-Workshop/presentation/#fmriprep-was-not-originally-envisioned-as-a-community-project","title":"fMRIPrep was not originally envisioned as a community project ...","text":"
(we just wanted a robust tool to automatically preprocess incoming data of OpenNeuro.org)
--
"},{"location":"assets/ORN-Workshop/presentation/#but-a-community-built-up-quickly-around-it","title":"... but a community built up quickly around it","text":"
Preprocessing of fMRI was in need of division of labor.
Obsession with transparency made early-adopters confident of the recipes they were applying.
Responsiveness to feedback. ]
.pull-right[
]
???
Preprocessing is a time-consuming effort, requires expertise converging imaging foundations & CS, typically addressed with legacy in-house pipelines.
On the right-hand side, you'll find the chart of unique visitors to fmriprep.org, which is the documentation website.
"},{"location":"assets/ORN-Workshop/presentation/#key-aspect-credit-all-direct-contributors","title":"Key aspect: credit all direct contributors","text":"
--
"},{"location":"assets/ORN-Workshop/presentation/#and-indirect-citation-boilerplate","title":".. and indirect: citation boilerplate.","text":""},{"location":"assets/ORN-Workshop/presentation/#researchers-want-to-spend-more-time-on-those-areas-most-relevant-to-them","title":"Researchers want to spend more time on those areas most relevant to them","text":"
(probably not preprocessing...)
???
With the development of fMRIPrep we understood that researchers don't want to waste their time on preprocessing (except for researchers developing new preprocessing techniques).
--
"},{"location":"assets/ORN-Workshop/presentation/#writing-fmriprep-required-a-team-of-several-experts-in-processing-methods-for-neuroimaging-with-a-solid-base-on-computer-science","title":"Writing fMRIPrep required a team of several experts in processing methods for neuroimaging, with a solid base on Computer Science.","text":"
(research programs just can't cover the neuroscience and the engineering of the whole workflow - we need to divide the labor)
???
The current neuroimaging workflow requires extensive knowledge in sometimes orthogonal fields such as neuroscience and computer science. Dividing the labor among labs, communities or individuals with the necessary expertise is fundamental to the advance of the whole field.
--
"},{"location":"assets/ORN-Workshop/presentation/#transparency-helps-against-the-risk-of-super-easy-tools","title":"Transparency helps against the risk of super-easy tools","text":"
(easy-to-use tools are risky because they might get a researcher very far with no idea whatsoever of what they've done)
???
There is an implicit risk in making things too easy to operate:
For instance, imagine someone who runs fMRIPrep on diffusion data by tricking the BIDS naming into an apparently functional MRI dataset. If fMRIPrep reached the end at all, the garbage at the output could be fed into further tools, in a sort of a snowballing problem.
When researchers have access to the guts of the software and are given an opportunity to understand what's going on, the risk of misuse dips.
--
"},{"location":"assets/ORN-Workshop/presentation/#established-toolboxes-do-not-have-incentives-for-compatibility","title":"Established toolboxes do not have incentives for compatibility","text":"
(and to some extent this is not necessarily bad, as long as they are kept well-tested and they embrace/help-develop some minimal standards)
???
AFNI, ANTs, FSL, FreeSurfer, SPM, etc. have comprehensive software validation tests, methodological validation tests, stress tests, etc. - which pushed up their quality and made them fundamental for the field.
Therefore, it is better to keep things that way (although some minimal efforts towards convergence in compatibility are of course welcome)
"},{"location":"assets/ORN-Workshop/presentation/#image-processing-possible-guidelines-for-the-standardization-clinical-applications-j-veraart","title":"Image Processing: Possible Guidelines for the Standardization & Clinical Applications (J. Veraart)","text":"
The enormous success of fMRIPrep led us to propose its generalization to other MRI and non-MRI modalities, as well as nonhuman species (for instance, rodents), and particular populations currently unsupported by fMRIPrep such as infants.
"},{"location":"assets/ORN-Workshop/presentation/#augmenting-scanners-to-produce-analysis-grade-data","title":"Augmenting scanners to produce \"analysis-grade\" data","text":""},{"location":"assets/ORN-Workshop/presentation/#data-directly-consumable-by-analyses","title":"(data directly consumable by analyses)","text":"
.pull-left[
Analysis-grade data is an analogy to the concept of \"sushi-grade (or sashimi-grade) fish\" in that both are:
.large[minimally preprocessed,]
and
.large[safe to consume directly.] ]
.pull-right[ ]
???
The goal of NiPreps, therefore, is to extend scanners so that, in a way, they produce data ready for analysis.
We liken these analysis-grade data to sushi-grade fish, because in both cases the product is minimally preprocessed and at the same time safe to consume as is.
For the last two years we've been decomposing the architecture of fMRIPrep, spinning off its constituent parts that are valuable in other applications.
This process of decoupling (to use a proper CS term) has been greatly facilitated by the modular nature of the code since its inception.
???
The processing elements extracted from fMRIPrep can be mapped to three regimes of responsibility:
Software infrastructure, composed of tools that enable collaboration and provide the most basic tooling.
Middleware utilities, which build more advanced tooling based on the foundational infrastructure
And at the top of the stack end-user applications - namely fMRIPrep, dMRIPrep, sMRIPrep and MRIQC.
As we can see, the boundaries of these three architectural layers are soft and tools such as TemplateFlow may stand in between.
Only projects enclosed in the brain shape pertain to the NiPreps community. NiPype, NiBabel and BIDS are so deeply embedded as dependencies that NiPreps can't be understood without them.
BIDS provides a standard, guaranteeing I/O agreements:
Allows workflows to self-adapt to the inputs
Ensures the shareability of the results
PyBIDS: a Python tool to query BIDS datasets (Yarkoni et al., 2019):
>>> from bids import BIDSLayout\n\n# Point PyBIDS to the dataset's path\n>>> layout = BIDSLayout(\"/data/coolproject\")\n\n# List the participant IDs of present subjects\n>>> layout.get_subjects()\n['01', '02', '03', '04', '05']\n\n# List session identifiers, if present\n>>> layout.get_sessions()\n['01', '02']\n\n# List functional MRI tasks\n>>> layout.get_tasks()\n['rest', 'nback']\n
???
BIDS is one of the keys to success for fMRIPrep and consequently, a strategic element of NiPreps.
Because the tools so far are written in Python, PyBIDS is a powerful tool to index and query inputs and outputs.
The code snippet illustrates how easy it is to find the subject identifiers, sessions, and tasks available in the dataset.
All NiPreps must write out BIDS-Derivatives. As illustrated in the example, the outputs of fMRIPrep are very similar to the BIDS standard for acquired data.
All end-user applications in NiPreps must conform to the BIDS-Apps specifications.
The BIDS-Apps paper identified a common pattern in neuroimaging studies, where individual participants (and runs) are processed first individually, and then based on the outcomes, further levels of data aggregation are executed.
For this reason, BIDS-Apps define two major levels of execution: participant and group level.
Finally, the paper also stresses the importance of containerizing applications to ensure long-term preservation of run-to-run repeatability and proposes a common command line interface as described at the bottom:
first the name of the BIDS-App (fmriprep, in this case)
followed by input and output directories (respectively),
to finally indicate the analysis level (always participant, for the case of fmriprep)
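The command-line pattern described above can be sketched in a few lines of Python. This is only an illustration: the helper name bids_app_command is hypothetical, the paths are the placeholder ones used elsewhere in these slides, and the set of accepted analysis levels is reduced to the two major levels mentioned in the text.

```python
# Hypothetical helper: assemble the positional arguments shared by BIDS-Apps.
ANALYSIS_LEVELS = ('participant', 'group')


def bids_app_command(app, bids_dir, output_dir, analysis_level, extra_args=()):
    # Reject anything other than the two major levels named in the text.
    if analysis_level not in ANALYSIS_LEVELS:
        raise ValueError('unknown analysis level: ' + str(analysis_level))
    # Order matters: app name, input directory, output directory, level.
    return [app, str(bids_dir), str(output_dir), analysis_level, *extra_args]


cmd = bids_app_command('fmriprep', '/data/coolproject', '/out', 'participant')
print(' '.join(cmd))
```

The resulting list can be handed to a process launcher or to a container runtime wrapper; keeping it as a list (rather than a single string) avoids quoting problems with paths that contain spaces.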
.pull-left[
from nipype.interfaces.fsl import BET\nbrain_extract = BET(\n in_file=\"/data/coolproject/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii\",\n out_file=\"/out/sub-01/ses-01/anat/sub-01_ses-01_desc-brain_T1w.nii\"\n)\nbrain_extract.run()\n
Nipype is the gateway to mix-and-match from AFNI, ANTs, Dipy, FreeSurfer, FSL, MRTrix, SPM, etc. ]
.pull-right[
]
???
Nipype is the glue stitching together all the underlying neuroimaging toolboxes and provides the execution framework.
The snippet shows how the widely known BET tool from FSL can be executed using NiPype. This is a particular example instance of interfaces - which provide uniform access to the tooling with Python.
Finally, combining these interfaces we generate processing workflows to fulfill higher level processing tasks.
???
For instance, we may have a look into fMRIPrep's functional processing block.
Nipype helps understand the workflow (and opens windows into the black box) by generating these graph representations.
\"\"\"Fix the affine of a rodent dataset, imposing 0.2x0.2x0.2 [mm].\"\"\"\nimport numpy as np\nimport nibabel as nb\n\n# Open the file\nimg = nb.load(\"sub-25_MGE_MouseBrain_3D_MGE_150.nii.gz\")\n\n# New (correct) affine\naff = np.diag((-0.2, -0.2, 0.2, 1.0))\n\n# Use nibabel to reorient to canonical\ncard = nb.as_closest_canonical(nb.Nifti1Image(\n img.dataobj,\n np.diag((-0.2, -0.2, 0.2, 1.0)),\n None\n))\n\n# Save to disk\ncard.to_filename(\"sub-25_T2star.nii.gz\")\n
???
NiBabel allows Python to easily access neuroimaging data formats such as NIfTI, GIFTI and CIFTI2.
Although this might be a trivial task, the proliferation of neuroimaging software has led to some sort of Wild West of formats, and sometimes interoperation is not ensured.
"},{"location":"assets/ORN-Workshop/presentation/#in-the-snippet-we-can-see-how-we-can-manipulate-the-orientation-headers-of-a-nifti-volume-in-particular-a-rodent-image-with-incorrect-affine-information","title":"In the snippet, we can see how we can manipulate the orientation headers of a NIfTI volume, in particular a rodent image with incorrect affine information.","text":"
.pull-left[
Transforms typically are the outcome of image registration methodologies
The proliferation of software implementations of image registration methodologies has resulted in a spread of data structures and file formats used to preserve and communicate transforms.
(Esteban et al., 2020) ]
.pull-right[
]
???
NiTransforms is a super-interesting toy project where we are exercising our finest coding skills. It complements NiBabel in the effort of making spatial transforms calculated by neuroimaging software tools interoperable.
When it goes beyond the alpha state, it is expected to be merged into NiBabel.
At the moment, NiTransforms is already integrated in fMRIPrep 20.1 and later to concatenate LTA (linear transform array) transforms obtained with FreeSurfer, ITK transforms obtained with ANTs, and motion parameters estimated with FSL.
Compatibility across formats is hard due to the many arbitrary decisions in establishing the mathematical framework of the transform and the intrinsic confusion of applying a transform.
While intuitively we understand applying a transform as \"transforming the moving image so that I can represent it overlaid or fused with the reference image and both should look aligned\", in reality, we only transform coordinates from the reference image into the moving image's space (step 1 on the right).
Once we know where the center of every voxel of the reference image falls in the moving image coordinate system, we read in the information (in other words, a value) from the moving image. Because the location will probably be off-grid, we interpolate such a value from the neighboring voxels (step 2).
Finally (step 3) we generate a new image object with the structure of the reference image and the data interpolated from the moving information. This new image object is the moving image \"moved\" on to the reference image space and thus, both look aligned.
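The three steps above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the NiTransforms implementation: the function name resample_nearest and its arguments are hypothetical, both affines are 4x4 voxel-to-physical matrices, xfm maps physical coordinates of the reference into physical coordinates of the moving image, and nearest-neighbor lookup stands in for the smoother interpolators used in practice.

```python
import numpy as np


def resample_nearest(moving, mov_affine, ref_shape, ref_affine, xfm):
    # Step 1: map the center of every reference voxel into the moving image.
    ijk = np.indices(ref_shape).reshape(3, -1)
    ijk1 = np.vstack([ijk, np.ones((1, ijk.shape[1]))])
    xyz_mov = xfm @ (ref_affine @ ijk1)
    ijk_mov = np.linalg.inv(mov_affine) @ xyz_mov

    # Step 2: read a value at each (possibly off-grid) location; here the
    # nearest voxel is used, and locations outside the moving image get 0.
    idx = np.rint(ijk_mov[:3]).astype(int)
    upper = np.array(moving.shape).reshape(3, 1)
    inside = np.all((idx >= 0) & (idx < upper), axis=0)
    values = np.zeros(ijk.shape[1], dtype=moving.dtype)
    values[inside] = moving[tuple(idx[:, inside])]

    # Step 3: arrange the sampled values on the grid of the reference image.
    return values.reshape(ref_shape)


moving = np.arange(27, dtype=float).reshape(3, 3, 3)
identity = np.eye(4)
shift = np.eye(4)
shift[0, 3] = 1.0  # reference voxel i reads from moving voxel i + 1

same = resample_nearest(moving, identity, moving.shape, identity, identity)
moved = resample_nearest(moving, identity, moving.shape, identity, shift)
```

Note that the transform is never applied to the moving data directly: only reference coordinates are transformed, which is why the shift above makes the output look displaced in the opposite direction.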
.pull-left[
The Archive (right) is a repository of templates and atlases
The Python Client (bottom) provides easy access (with lazy-loading) to the Archive
>>> from templateflow import api as tflow\n>>> tflow.get(\n... 'MNI152NLin6Asym',\n... desc=None,\n... resolution=1,\n... suffix='T1w',\n... extension='nii.gz'\n... )\nPosixPath('/templateflow_home/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_T1w.nii.gz')\n
.large[www.templateflow.org] ]
.pull-right[
]
???
One of the oldest feature requests received from fMRIPrep early adopters was improving the flexibility of spatial normalization to standard templates other than fMRIPrep's default.
For instance, infant templates.
TemplateFlow offers an Archive of templates where they are stored, maintained and re-distributed;
and a Python client that helps access them.
On the right-hand side, a screenshot of the TemplateFlow browser shows some of the templates currently available in the repository. The browser can be reached at www.templateflow.org.
The tool is based on PyBIDS, and the snippet will surely remind you of it. In this case the example shows how to obtain the T1w template corresponding to FSL's MNI space, at the highest resolution.
If the files requested are not in TemplateFlow's cache, they will be pulled down and kept for further utilization.
The Archive allows a rich range of data and metadata to be stored with the template.
Datatypes in the repository cover:
images containing population-average templates,
masks (for instance brain masks),
atlases (including parcellations and segmentations)
transform files between templates
Metadata can be stored with the usual BIDS options.
Finally, templates allow having multiple cohorts, in a similar encoding to that of multi-session BIDS datasets.
Multiple cohorts are useful, for instance, in infant templates with averages at several gestational ages.
NiWorkflows is a miscellaneous mixture of tooling used by downstream NiPreps:
???
NiWorkflows is, historically, the first component detached from fMRIPrep.
For that reason, its scope and vision have very fuzzy boundaries as compared to the other tools.
The most relevant utilities incorporated within NiWorkflows are:
--
The reportlet aggregation and individual report generation system
???
First, the individual report system which aggregates the visual elements or the reports (which we call \"reportlets\") and generates the final HTML document.
Also, most of the engineering behind the generation of these reportlets and their integration within NiPype are part of NiWorkflows
--
Custom extensions to NiPype interfaces
???
Beyond the extension of NiPype to generate a reportlet from any given interface, NiWorkflows is the test bed for many utilities that are then upstreamed to nipype.
Also, special interfaces with a limited scope that should not be included in nipype are maintained here.
--
Workflows useful across applications
???
Finally, NiWorkflows indeed offers workflows that can be used by end-user NiPreps. For instance atlas-based brain extraction of anatomical images, based on ANTs.
???
Echo-planar imaging (EPI) data are typically affected by distortions along the phase encoding axis, caused by the perturbation of the magnetic field at tissue interfaces.
Looking at the reportlet, we can see how in the \"before\" panel, the image is warped.
The distortion is most obvious in the coronal view (middle row) because this image has posterior-anterior phase encoding.
Focusing on the changes between \"before\" and \"after\" correction in this coronal view, we can see how the blue contours delineating the corpus callosum fit the dark shade in the data better after correction.
"},{"location":"assets/ORN-Workshop/presentation/#upcoming-new-utilities","title":"Upcoming new utilities","text":""},{"location":"assets/ORN-Workshop/presentation/#nibabies-fmriprep-babies","title":"NiBabies | fMRIPrep-babies","text":"
NiBabies is some sort of NiWorkflows equivalent for the preprocessing of infant imaging. At the moment, only atlas-based brain extraction using ANTs (and adapted from NiWorkflows) is in active development.
Next steps include brain tissue segmentation.
Similarly, NiRodents is the NiWorkflows parallel for the preprocessing of rodent preclinical imaging. Again, only atlas-based brain extraction adapted from NiWorkflows is being developed.
"},{"location":"assets/ORN-Workshop/presentation/#nipreps-is-a-framework-for-the-development-of-preprocessing-workflows","title":"NiPreps is a framework for the development of preprocessing workflows","text":"
Principled design, with BIDS as a strategic component
Leveraging existing, widely used software
Using NiPype as a foundation
???
To wrap up, I've presented NiPreps, a framework for developing preprocessing workflows inspired by fMRIPrep.
The framework is heavily principled and relies on BIDS as a foundational component.
NiPreps should not reinvent any wheel, trying to reuse as much as possible of the widely used and tested existing software.
Nipype serves as the glue component to orchestrate workflows.
We propose to consider preprocessing as part of the image acquisition and reconstruction
When setting the boundaries that way, it seems sensible to pursue some standardization in the preprocessing:
Less experimental degrees of freedom for the researcher
Researchers can focus on the analysis
More homogeneous data at the output (e.g., for machine learning)
How:
Transparency is key to success: individual reports and documentation (open source is implicit).
Best engineering practices (e.g., containers and CI/CD)
???
But why just preprocessing, with a very strict scope?
We propose to think about preprocessing as part of the image acquisition and reconstruction process (in other words, scanning), rather than part of the analysis workflow.
This decoupling from analysis comes with several upshots:
First, there are fewer moving parts to play with for researchers in the attempt to fit their methods to the data (instead of fitting data with their methods).
Second, such division of labor allows the researcher to use their time in the analysis.
Finally, two preprocessed datasets from two different studies and scanning sites should be more homogeneous when processed with the same instruments, in comparison to processing them with idiosyncratic, lab-managed, preprocessing workflows.
However, for NiPreps to work we need to make sure the tools are transparent.
Transparency comes not just from the individual reports and thorough documentation, but also from community-driven development. For instance, the peer-review process that goes around large incremental changes is fundamental to ensure the quality of the tool.
In addition, best engineering practices suggested in the BIDS-Apps paper, along with those we have been including with fMRIPrep, are necessary to ensure the quality of the final product.
As an open problem, validating the results of the tool remains extremely challenging for lack of gold-standard datasets that can tell us the best possible outcome.
NMiND = NeverMIND, this Neuroimaging Method Is Not Duplicated
"},{"location":"assets/ORN-Workshop/presentation/#pis-worried-about-methodological-duplicity","title":"PIs worried about methodological duplicity","text":"
M. Milham, D. Fair, T. Satterthwaite, S. Ghosh, R. Poldrack, etc.
"},{"location":"assets/bhd2020/presentation/#nipreps-neuroimaging-preprocessing-tools","title":"NiPreps | NeuroImaging PREProcessing toolS","text":""},{"location":"assets/bhd2020/presentation/#o-esteban","title":"O. Esteban","text":""},{"location":"assets/bhd2020/presentation/#chuv-lausanne-university-hospital","title":"CHUV | Lausanne University Hospital","text":""},{"location":"assets/bhd2020/presentation/#wwwniprepsorgassetsbhd2020","title":"www.nipreps.org/assets/bhd2020","text":"
]
layout: false count: false
.middle.center[
"},{"location":"assets/bhd2020/presentation/#nipreps-neuroimaging-preprocessing-tools_1","title":"NiPreps | NeuroImaging PREProcessing toolS","text":""},{"location":"assets/bhd2020/presentation/#o-esteban_1","title":"O. Esteban","text":""},{"location":"assets/bhd2020/presentation/#chuv-lausanne-university-hospital_1","title":"CHUV | Lausanne University Hospital","text":""},{"location":"assets/bhd2020/presentation/#wwwniprepsorgassetsbhd2020_1","title":"www.nipreps.org/assets/bhd2020","text":"
]
???
"},{"location":"assets/bhd2020/presentation/#im-going-to-talk-about-how-we-are-building-a-framework-of-preprocessing-pipelines-for-neuroimaging-called-nipreps-based-on-the-fmriprep-experience","title":"I'm going to talk about how we are building a framework of preprocessing pipelines for neuroimaging called NiPreps, based on the fMRIPrep experience.","text":"
"},{"location":"assets/bhd2020/presentation/#outlook","title":"Outlook","text":""},{"location":"assets/bhd2020/presentation/#1-understand-what-preprocessing-is-from-fmri","title":"1. Understand what preprocessing is - from fMRI","text":""},{"location":"assets/bhd2020/presentation/#2-the-fmriprep-experience","title":"2. The fMRIPrep experience","text":""},{"location":"assets/bhd2020/presentation/#3-the-dmriprep-experience","title":"3. The dMRIPrep experience","text":""},{"location":"assets/bhd2020/presentation/#4-importance-of-the-visual-reports","title":"4. Importance of the visual reports","text":""},{"location":"assets/bhd2020/presentation/#5-introducing-nipreps","title":"5. Introducing NiPreps","text":""},{"location":"assets/bhd2020/presentation/#6-open-forum-first-steps-and-contributing","title":"6. Open forum: first steps and contributing","text":""},{"location":"assets/bhd2020/presentation/#the-research-workflow-of-functional-mri-nowadays","title":"The research workflow of functional MRI (nowadays)","text":"
(source: next slide)
"},{"location":"assets/bhd2020/presentation/#the-research-workflow-of-functional-mri-2006","title":"The research workflow of functional MRI (2006)","text":"
(Strother, 2006; 10.1109/MEMB.2006.1607667)
"},{"location":"assets/bhd2020/presentation/#the-research-workflow-of-functional-mri-ab","title":"The research workflow of functional MRI (a.B.*)","text":"
Adapted (Strother, 2006)
*a.B. = after BIDS (Brain Imaging Data Structure; Gorgolewski et al. (2016))
"},{"location":"assets/bhd2020/presentation/#neuroimaging-is-now-mature","title":"Neuroimaging is now mature","text":"
many excellent tools available (from specialized to foundational)
large toolboxes (AFNI, ANTs/ITK, FreeSurfer, FSL, Nilearn, SPM, etc.)
"},{"location":"assets/bhd2020/presentation/#bids-a-thrust-of-technology-driven-development","title":"BIDS - A thrust of technology-driven development","text":"
A uniform and complete interface to data:
Uniform: enables the workflow to adapt to the data
Complete: enables validation and minimizes human-intervention
Extensible reproducibility:
BIDS-Derivatives
BIDS-Apps (Gorgolewski et al., 2017)
???
researchers have a large portfolio of image processing components readily available
toolboxes with great support and active maintenance:
"},{"location":"assets/bhd2020/presentation/#new-questions-changing-the-focus","title":"New questions changing the focus:","text":""},{"location":"assets/bhd2020/presentation/#-validity-does-the-workflow-actually-work-out","title":"- validity (does the workflow actually work out?)","text":""},{"location":"assets/bhd2020/presentation/#-transparency-is-it-a-black-box-how-precise-is-reporting","title":"- transparency (is it a black-box? how precise is reporting?)","text":""},{"location":"assets/bhd2020/presentation/#-vibration-how-each-tool-choice-parameters-affect-overall","title":"- vibration (how each tool choice & parameters affect overall?)","text":""},{"location":"assets/bhd2020/presentation/#-throughput-how-much-datatime-can-it-possible-take","title":"- throughput (how much data/time can it possible take?)","text":""},{"location":"assets/bhd2020/presentation/#-robustness-can-i-use-it-on-diverse-studies","title":"- robustness (can I use it on diverse studies?)","text":""},{"location":"assets/bhd2020/presentation/#-evaluation-what-is-it-unique-about-the-workflow-wrt-existing-alternatives","title":"- evaluation (what is it unique about the workflow, w.r.t. existing alternatives?)","text":""},{"location":"assets/bhd2020/presentation/#the-garden-of-forking-paths","title":"The garden of forking paths","text":"
(Botvinik-Nezer et al., 2020)
Around 50% of teams used fMRIPrep'ed inputs.
"},{"location":"assets/bhd2020/presentation/#the-fmriprep-story","title":"The fMRIPrep story","text":""},{"location":"assets/bhd2020/presentation/#fmriprep-produces-analysis-ready-data-from-diverse-data","title":"fMRIPrep produces analysis-ready data from diverse data","text":"
minimal requirements (BIDS-compliant);
agnostic to downstream steps of the workflow
produces BIDS-Derivatives;
robust against inhomogeneity of data across studies
???
fMRIPrep takes in a task-based or resting-state functional MRI dataset in BIDS-format and returns preprocessed data ready for analysis.
Preprocessed data can be used for a broad range of analyses, and they are formatted following BIDS-Derivatives to maximize compatibility with major software packages (AFNI, FSL, SPM, etc.), with tools for further temporal filtering and denoising (fMRIDenoise), and with any BIDS-Derivatives compliant tool (e.g., FitLins).
--
"},{"location":"assets/bhd2020/presentation/#fmriprep-is-a-bids-app-gorgolewski-et-al-2017","title":"fMRIPrep is a BIDS-App (Gorgolewski, et al. 2017)","text":"
adhered to modern software-engineering standards (CI/CD, containers)
compatible interface with other BIDS-Apps
optimized for automatic execution
???
fMRIPrep adopts the BIDS-App specifications. That means the software is tested with every change to the codebase, and that packaging, containerization, and deployment are automated and require tests to be passing. BIDS-Apps are interoperable (via BIDS-Derivatives), and optimized for execution in HPC, Cloud, etc.
--
"},{"location":"assets/bhd2020/presentation/#minimizes-human-intervention","title":"Minimizes human intervention","text":"
avoids error-prone parameter settings (reads them from BIDS)
adapts the workflow to the actual data available
while remaining flexible to some design choices (e.g., whether or not reconstructing surfaces or customizing target normalized standard spaces)
???
fMRIPrep minimizes human intervention because the user does not need to fiddle with any parameters - they are obtained from the BIDS structure. However, fMRIPrep does allow some flexibility to ensure the preprocessing meets the requirements of the intended analyses.
"},{"location":"assets/bhd2020/presentation/#fmriprep-was-not-originally-envisioned-as-a-community-project","title":"fMRIPrep was not originally envisioned as a community project ...","text":"
(we just wanted a robust tool to automatically preprocess incoming data of OpenNeuro.org)
--
"},{"location":"assets/bhd2020/presentation/#but-a-community-built-up-quickly-around-it","title":"... but a community built up quickly around it","text":"
Preprocessing of fMRI was in need of division of labor.
Obsession with transparency made early-adopters confident of the recipes they were applying.
Responsiveness to feedback. ]
.pull-right[
]
???
Preprocessing is a time-consuming effort that requires expertise spanning imaging foundations and CS, and it is typically addressed with legacy in-house pipelines.
On the right-hand side, you'll find the chart of unique visitors to fmriprep.org, which is the documentation website.
"},{"location":"assets/bhd2020/presentation/#key-aspect-credit-all-direct-contributors","title":"Key aspect: credit all direct contributors","text":"
--
"},{"location":"assets/bhd2020/presentation/#and-indirect-citation-boilerplate","title":".. and indirect: citation boilerplate.","text":""},{"location":"assets/bhd2020/presentation/#researchers-want-to-spend-more-time-on-those-areas-most-relevant-to-them","title":"Researchers want to spend more time on those areas most relevant to them","text":"
(probably not preprocessing...)
???
With the development of fMRIPrep we understood that researchers don't want to waste their time on preprocessing (except for researchers developing new preprocessing techniques).
--
"},{"location":"assets/bhd2020/presentation/#writing-fmriprep-required-a-team-of-several-experts-in-processing-methods-for-neuroimaging-with-a-solid-base-on-computer-science","title":"Writing fMRIPrep required a team of several experts in processing methods for neuroimaging, with a solid base on Computer Science.","text":"
(research programs just can't cover the neuroscience and the engineering of the whole workflow - we need to divide the labor)
???
The current neuroimaging workflow requires extensive knowledge in sometimes orthogonal fields such as neuroscience and computer science. Dividing the labor among labs, communities or individuals with the necessary expertise is fundamental to the advance of the whole field.
--
"},{"location":"assets/bhd2020/presentation/#transparency-helps-against-the-risk-of-super-easy-tools","title":"Transparency helps against the risk of super-easy tools","text":"
(easy-to-use tools are risky because they might get a researcher very far with no idea whatsoever of what they've done)
???
There is an implicit risk in making things too easy to operate:
For instance, imagine someone who runs fMRIPrep on diffusion data by tricking the BIDS naming into an apparently functional MRI dataset. If fMRIPrep reached the end at all, the garbage at the output could be fed into further tools, in a sort of a snowballing problem.
When researchers have access to the guts of the software and are given an opportunity to understand what's going on, the risk of misuse dips.
--
"},{"location":"assets/bhd2020/presentation/#established-toolboxes-do-not-have-incentives-for-compatibility","title":"Established toolboxes do not have incentives for compatibility","text":"
(and to some extent this is not necessarily bad, as long as they are kept well-tested and they embrace/help-develop some minimal standards)
???
AFNI, ANTs, FSL, FreeSurfer, SPM, etc. have comprehensive software validation tests, methodological validation tests, stress tests, etc. - which pushed up their quality and made them fundamental for the field.
Therefore, it is better to keep things that way (although some minimal efforts towards convergence in compatibility are of course welcome)
Joseph, M.; Pisner, D.; Richie-Halford, A.; Lerma-Usabiaga, G.; Keshavan, A.; Kent, JD.; Veraart, J.; Cieslak, M.; Poldrack, RA.; Rokem, A.; Esteban, O.
template: newsection layout: false
.middle.center[
"},{"location":"assets/bhd2020/presentation/#understanding-what-preprocessing-is-with-visual-reports","title":"Understanding what preprocessing is with visual reports","text":"
Let's walk through one example report. Reports have several sections, starting with a summary indicating the particularities of this dataset and workflow choices made based on the input data.
The anatomical section follows with several visualizations to assess the anatomical processing steps mentioned before, spatial normalization to template spaces (the flickering panel helps assess alignment) and finally surface reconstruction.
Then, all functional runs are concatenated, and all show the same structure. After an initial summary of this particular run, the alignment to the same subject's anatomical image is presented, with contours of the white and pial surfaces as cues. Next panel shows the brain mask and ROIs utilized by the CompCor denoising. For each run we then find some visualizations to assess the generated confounding signals.
After all functional runs are presented, the About section keeps information to aid reproducibility of results, such as the software's version, or the exact command line run.
The boilerplate is found next, with a text version shown by default and tabs to convert to Markdown and LaTeX.
Reports conclude with a list of encountered errors (if any).
"},{"location":"assets/bhd2020/presentation/#reports-are-a-crucial-element-to-ensure-transparency","title":"Reports are a crucial element to ensure transparency","text":"
.pull-left[
]
.pull-right[
.distribute[ fMRIPrep generates one participant-wide report after execution.
Reports describe the data as found, and the steps applied (providing .blue[visual support to look inside the box]):
show researchers their data;
show how fMRIPrep interpreted the data (describing the actual preprocessing steps);
quality control of results, facilitating early error detection. ] ]
???
Therefore, reports have become a fundamental feature of fMRIPrep because they not only allow assessing the quality of the processing, but also provide an insight about the logic supporting such processing.
In other words, reports help answer what was done and why it was done, in addition to how well it was done.
The enormous success of fMRIPrep led us to propose its generalization to other MRI and non-MRI modalities, as well as nonhuman species (for instance, rodents), and particular populations currently unsupported by fMRIPrep such as infants.
"},{"location":"assets/bhd2020/presentation/#augmenting-scanners-to-produce-analysis-grade-data","title":"Augmenting scanners to produce \"analysis-grade\" data","text":""},{"location":"assets/bhd2020/presentation/#data-directly-consumable-by-analyses","title":"(data directly consumable by analyses)","text":"
.pull-left[
Analysis-grade data is an analogy to the concept of \"sushi-grade (or sashimi-grade) fish\" in that both are:
.large[minimally preprocessed,]
and
.large[safe to consume directly.] ]
.pull-right[ ]
???
The goal of NiPreps, therefore, is to extend scanners so that, in a way, they produce data ready for analysis.
We liken these analysis-grade data to sushi-grade fish, because in both cases the product is minimally preprocessed and at the same time safe to consume as is.
For the last two years we've been decomposing the architecture of fMRIPrep, spinning off its constituent parts that are valuable in other applications.
This process of decoupling (to use a proper CS term) has been greatly facilitated by the modular nature of the code since its inception.
???
The processing elements extracted from fMRIPrep can be mapped to three regimes of responsibility:
Software infrastructure, composed of tools that enable collaboration and provide the most basic tooling.
Middleware utilities, which build more advanced tooling based on the foundational infrastructure
And at the top of the stack end-user applications - namely fMRIPrep, dMRIPrep, sMRIPrep and MRIQC.
As we can see, the boundaries of these three architectural layers are soft and tools such as TemplateFlow may stand in between.
Only projects enclosed in the brain shape pertain to the NiPreps community. NiPype, NiBabel and BIDS are so deeply embedded as dependencies that NiPreps can't be understood without them.
BIDS provides a standard, guaranteeing I/O agreements:
Allows workflows to self-adapt to the inputs
Ensures the shareability of the results
PyBIDS: a Python tool to query BIDS datasets (Yarkoni et al., 2019):
>>> from bids import BIDSLayout\n\n# Point PyBIDS to the dataset's path\n>>> layout = BIDSLayout(\"/data/coolproject\")\n\n# List the participant IDs of present subjects\n>>> layout.get_subjects()\n['01', '02', '03', '04', '05']\n\n# List session identifiers, if present\n>>> layout.get_sessions()\n['01', '02']\n\n# List functional MRI tasks\n>>> layout.get_tasks()\n['rest', 'nback']\n
???
BIDS is one of the keys to success for fMRIPrep and consequently, a strategic element of NiPreps.
Because the tools so far are written in Python, PyBIDS is a powerful tool to index and query inputs and outputs.
The code snippet illustrates how easy it is to list the subject identifiers, sessions, and tasks available in the dataset.
All NiPreps must write out BIDS-Derivatives. As illustrated in the example, the outputs of fMRIPrep are very similar to the BIDS standard for acquired data.
All end-user applications in NiPreps must conform to the BIDS-Apps specifications.
The BIDS-Apps paper identified a common pattern in neuroimaging studies, where individual participants (and runs) are processed first individually, and then based on the outcomes, further levels of data aggregation are executed.
For this reason, BIDS-Apps define two major levels of execution: participant and group level.
Finally, the paper also stresses the importance of containerizing applications to ensure long-term preservation of run-to-run repeatability and proposes a common command line interface as described at the bottom:
first the name of the BIDS-App (fmriprep, in this case)
followed by input and output directories (respectively),
to finally indicate the analysis level (always participant, for the case of fmriprep)
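The command-line pattern described above can be sketched with Python's standard argparse module. This is only a minimal illustration of the three positional arguments, not fMRIPrep's actual parser; the program name my-bids-app and the function build_parser are hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical parser following the BIDS-Apps positional pattern:
    #   <app> <bids_dir> <output_dir> <analysis_level>
    parser = argparse.ArgumentParser(prog='my-bids-app')
    parser.add_argument('bids_dir', help='root folder of the input BIDS dataset')
    parser.add_argument('output_dir', help='folder where results are written')
    parser.add_argument(
        'analysis_level',
        choices=['participant', 'group'],
        help='level of the analysis to run',
    )
    return parser


# Parse an explicit argument list instead of sys.argv, for illustration
args = build_parser().parse_args(['/data/coolproject', '/out', 'participant'])
print(args.bids_dir, args.output_dir, args.analysis_level)
```

Restricting the analysis level to a fixed set of choices lets the parser reject unsupported levels before any processing starts.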
.pull-left[
from nipype.interfaces.fsl import BET\nbrain_extract = BET(\n in_file=\"/data/coolproject/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii\",\n out_file=\"/out/sub-01/ses-01/anat/sub-01_ses-01_desc-brain_T1w.nii\"\n)\nbrain_extract.run()\n
Nipype is the gateway to mix-and-match from AFNI, ANTs, Dipy, FreeSurfer, FSL, MRTrix, SPM, etc. ]
.pull-right[
]
???
Nipype is the glue stitching together all the underlying neuroimaging toolboxes and provides the execution framework.
The snippet shows how the widely known BET tool from FSL can be executed using NiPype. This is one example of NiPype's interfaces, which provide uniform access to the tooling from Python.
Finally, combining these interfaces we generate processing workflows to fulfill higher level processing tasks.
???
For instance, we may have a look into fMRIPrep's functional processing block.
Nipype helps understand the workflow (and opens windows in the black box) by generating these graph representations of it.
\"\"\"Fix the affine of a rodent dataset, imposing 0.2x0.2x0.2 [mm].\"\"\"\nimport numpy as np\nimport nibabel as nb\n\n# Open the file\nimg = nb.load(\"sub-25_MGE_MouseBrain_3D_MGE_150.nii.gz\")\n\n# New (correct) affine\naff = np.diag((-0.2, -0.2, 0.2, 1.0))\n\n# Use nibabel to reorient to canonical\ncard = nb.as_closest_canonical(nb.Nifti1Image(\n img.dataobj,\n np.diag((-0.2, -0.2, 0.2, 1.0)),\n None\n))\n\n# Save to disk\ncard.to_filename(\"sub-25_T2star.nii.gz\")\n
???
NiBabel allows Python to easily access neuroimaging data formats such as NIfTI, GIFTI and CIFTI2.
Although this might be a trivial task, the proliferation of neuroimaging software has led to some sort of Wild West of formats, and sometimes interoperation is not ensured.
"},{"location":"assets/bhd2020/presentation/#in-the-snippet-we-can-see-how-we-can-manipulate-the-orientation-headers-of-a-nifti-volume-in-particular-a-rodent-image-with-incorrect-affine-information","title":"In the snippet, we can see how we can manipulate the orientation headers of a NIfTI volume, in particular a rodent image with incorrect affine information.","text":"
.pull-left[
Transforms typically are the outcome of image registration methodologies
The proliferation of software implementations of image registration methodologies has resulted in a spread of data structures and file formats used to preserve and communicate transforms.
(Esteban et al., 2020) ]
.pull-right[
]
???
NiTransforms is a super-interesting toy project where we are exercising our finest coding skills. It complements NiBabel in the effort of making spatial transforms calculated by neuroimaging software tools interoperable.
When it goes beyond the alpha state, it is expected to be merged into NiBabel.
At the moment, NiTransforms is already integrated in fMRIPrep (20.1 and later) to concatenate LTA (linear affine) transforms obtained with FreeSurfer, ITK transforms obtained with ANTs, and motion parameters estimated with FSL.
Compatibility across formats is hard due to the many arbitrary decisions in establishing the mathematical framework of the transform and the intrinsic confusion of applying a transform.
While intuitively we understand applying a transform as \"transforming the moving image so that I can represent it overlaid or fused with the reference image and both should look aligned\", in reality, we only transform coordinates from the reference image into the moving image's space (step 1 on the right).
Once we know where the center of every voxel of the reference image falls in the moving image coordinate system, we read in the information (in other words, a value) from the moving image. Because the location will probably be off-grid, we interpolate such a value from the neighboring voxels (step 2).
Finally (step 3) we generate a new image object with the structure of the reference image and the data interpolated from the moving information. This new image object is the moving image \"moved\" on to the reference image space and thus, both look aligned.
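The three steps above can be sketched on a toy 2D example in plain Python. This is not how NiTransforms is implemented; the function names, the bilinear interpolation, and the half-pixel shift are assumptions chosen only to make the logic visible.

```python
import math


def apply_affine(matrix, i, j):
    # Map a reference-grid index (i, j) into the moving image's index space
    x = matrix[0][0] * i + matrix[0][1] * j + matrix[0][2]
    y = matrix[1][0] * i + matrix[1][1] * j + matrix[1][2]
    return x, y


def bilinear(image, x, y, fill=0.0):
    # Interpolate an off-grid value from the four neighboring pixels
    rows, cols = len(image), len(image[0])
    if x < 0 or y < 0 or x > rows - 1 or y > cols - 1:
        return fill  # the location falls outside the moving image
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, rows - 1), min(y0 + 1, cols - 1)
    dx, dy = x - x0, y - y0
    near_row = image[x0][y0] * (1 - dy) + image[x0][y1] * dy
    far_row = image[x1][y0] * (1 - dy) + image[x1][y1] * dy
    return near_row * (1 - dx) + far_row * dx


def resample(moving, reference_shape, matrix):
    # Step 1: map every reference index into the moving image
    # Step 2: interpolate the moving image at that location
    # Step 3: store the value on a new grid shaped like the reference
    rows, cols = reference_shape
    return [
        [bilinear(moving, *apply_affine(matrix, i, j)) for j in range(cols)]
        for i in range(rows)
    ]


moving = [
    [0.0, 1.0, 2.0],
    [3.0, 4.0, 5.0],
    [6.0, 7.0, 8.0],
]
# Reference-to-moving mapping: shift by half a pixel along each axis
shift = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]]
moved = resample(moving, (2, 2), shift)
print(moved)  # [[2.0, 3.0], [5.0, 6.0]]
```

Note that the matrix maps reference indices into the moving image, and the output takes the shape of the reference grid, not that of the moving image.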
.pull-left[
The Archive (right) is a repository of templates and atlases
The Python Client (bottom) provides easy access (with lazy-loading) to the Archive
>>> from templateflow import api as tflow\n>>> tflow.get(\n... 'MNI152NLin6Asym',\n... desc=None,\n... resolution=1,\n... suffix='T1w',\n... extension='nii.gz'\n... )\nPosixPath('/templateflow_home/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_T1w.nii.gz')\n
.large[www.templateflow.org] ]
.pull-right[
]
???
One of the oldest feature requests received from fMRIPrep early adopters was improving the flexibility of spatial normalization to standard templates other than fMRIPrep's default.
For instance, infant templates.
TemplateFlow offers an Archive of templates where they are stored, maintained and re-distributed;
and a Python client that helps access them.
On the right-hand side, a screenshot of the TemplateFlow browser shows some of the templates currently available in the repository. The browser can be reached at www.templateflow.org.
The tool is based on PyBIDS, and the snippet will surely remind you of it. In this case the example shows how to obtain the T1w template corresponding to FSL's MNI space, at the highest resolution.
If the files requested are not in TemplateFlow's cache, they will be pulled down and kept for further utilization.
The Archive allows a rich range of data and metadata to be stored with the template.
Datatypes in the repository cover:
images containing population-average templates,
masks (for instance brain masks),
atlases (including parcellations and segmentations)
transform files between templates
Metadata can be stored with the usual BIDS options.
Finally, templates allow having multiple cohorts, in a similar encoding to that of multi-session BIDS datasets.
Multiple cohorts are useful, for instance, in infant templates with averages at several gestational ages.
NiWorkflows is a miscellaneous mixture of tooling used by downstream NiPreps:
???
NiWorkflows is, historically, the first component detached from fMRIPrep.
For that reason, its scope and vision have very fuzzy boundaries as compared to the other tools.
The most relevant utilities incorporated within NiWorkflows are:
--
The reportlet aggregation and individual report generation system
???
First, the individual report system which aggregates the visual elements or the reports (which we call \"reportlets\") and generates the final HTML document.
Also, most of the engineering behind the generation of these reportlets and their integration within NiPype is part of NiWorkflows.
--
Custom extensions to NiPype interfaces
???
Beyond the extension of NiPype to generate a reportlet from any given interface, NiWorkflows is the test bed for many utilities that are then upstreamed to nipype.
Also, special interfaces with a limited scope that should not be included in nipype are maintained here.
--
Workflows useful across applications
???
Finally, NiWorkflows indeed offers workflows that can be used by end-user NiPreps. For instance atlas-based brain extraction of anatomical images, based on ANTs.
???
Echo-planar imaging (EPI) data are typically affected by distortions along the phase encoding axis, caused by the perturbation of the magnetic field at tissue interfaces.
Looking at the reportlet, we can see how in the \"before\" panel, the image is warped.
The distortion is most obvious in the coronal view (middle row) because this image has posterior-anterior phase encoding.
Focusing on the changes between \"before\" and \"after\" correction in this coronal view, we can see how the blue contours delineating the corpus callosum better fit the dark shade in the data after correction.
"},{"location":"assets/bhd2020/presentation/#upcoming-new-utilities","title":"Upcoming new utilities","text":""},{"location":"assets/bhd2020/presentation/#nibabies-fmriprep-babies","title":"NiBabies | fMRIPrep-babies","text":"
NiBabies is some sort of NiWorkflows equivalent for the preprocessing of infant imaging. At the moment, only atlas-based brain extraction using ANTs (adapted from NiWorkflows) is in active development.
Next steps include brain tissue segmentation.
Similarly, NiRodents is the NiWorkflows parallel for the preprocessing of rodent preclinical imaging. Again, only atlas-based brain extraction adapted from NiWorkflows is being developed.
"},{"location":"assets/bhd2020/presentation/#nipreps-is-a-framework-for-the-development-of-preprocessing-workflows","title":"NiPreps is a framework for the development of preprocessing workflows","text":"
Principled design, with BIDS as a strategic component
Leveraging existing, widely used software
Using NiPype as a foundation
???
To wrap up, I've presented NiPreps, a framework for developing preprocessing workflows inspired by fMRIPrep.
The framework is heavily principled and relies on BIDS as a foundational component.
NiPreps should not reinvent any wheel, and should instead reuse as much as possible of the widely used and tested existing software.
Nipype serves as the glue component to orchestrate workflows.
We propose to consider preprocessing as part of the image acquisition and reconstruction
When setting the boundaries that way, it seems sensible to pursue some standardization in the preprocessing:
Less experimental degrees of freedom for the researcher
Researchers can focus on the analysis
More homogeneous data at the output (e.g., for machine learning)
How:
Transparency is key to success: individual reports and documentation (open source is implicit).
Best engineering practices (e.g., containers and CI/CD)
???
But why just preprocessing, with a very strict scope?
We propose to think about preprocessing as part of the image acquisition and reconstruction process (in other words, scanning), rather than part of the analysis workflow.
This decoupling from analysis comes with several upshots:
First, there are fewer moving parts to play with for researchers in the attempt to fit their methods to the data (instead of fitting data with their methods).
Second, such division of labor allows the researcher to use their time in the analysis.
Finally, two preprocessed datasets from two different studies and scanning sites should be more homogeneous when processed with the same instruments, in comparison to processing them with idiosyncratic, lab-managed, preprocessing workflows.
However, for NiPreps to work we need to make sure the tools are transparent.
Not just through the individual reports and thorough documentation, but also through community-driven development. For instance, the peer-review process that goes around large incremental changes is fundamental to ensure the quality of the tool.
In addition, best engineering practices suggested in the BIDS-Apps paper, along with those we have been including with fMRIPrep, are necessary to ensure the quality of the final product.
As an open problem, validating the results of the tool remains extremely challenging because of the lack of gold-standard datasets that can tell us the best possible outcome.
.middle.center[
"},{"location":"assets/bhd2020/presentation/#where-to-start","title":"Where to start?","text":""},{"location":"assets/bhd2020/presentation/#wwwniprepsorg_1","title":"www.nipreps.org","text":""},{"location":"assets/bhd2020/presentation/#githubcomnipreps","title":"github.com/nipreps","text":"
"},{"location":"assets/torw2020/presentation/#im-going-to-talk-about-how-we-are-building-a-framework-of-preprocessing-pipelines-for-neuroimaging-called-nipreps-based-on-the-fmriprep-experience","title":"I'm going to talk about how we are building a framework of preprocessing pipelines for neuroimaging called NiPreps, based on the fMRIPrep experience.","text":"
"},{"location":"assets/torw2020/presentation/#fmriprep-produces-analysis-ready-data-from-acquired-fmri-data","title":"fMRIPrep produces analysis-ready data from acquired (fMRI) data","text":"
minimal requirements (BIDS-compliant);
agnostic to downstream steps of the workflow
produces BIDS-Derivatives;
???
fMRIPrep takes in a task-based or resting-state functional MRI dataset in BIDS-format and returns preprocessed data ready for analysis.
Preprocessed data can be used for a broad range of analyses, and they are formatted following BIDS-Derivatives to maximize compatibility with:
major software packages (AFNI, FSL, SPM, etc.)
further temporal filtering and denoising: fMRIDenoise
any BIDS-Derivatives compliant tool (e.g., FitLins).
--
"},{"location":"assets/torw2020/presentation/#fmriprep-is-a-bids-app-gorgolewski-et-al-2017","title":"fMRIPrep is a BIDS-App (Gorgolewski, et al. 2017)","text":"
adheres to modern software-engineering standards (CI/CD, containers)
compatible interface with other BIDS-Apps
optimized for automatic execution
???
fMRIPrep adopts the BIDS-App specifications. That means the software is tested with every change to the codebase; it also means that packaging, containerization, and deployment are automated and require tests to pass. BIDS-Apps are interoperable (via BIDS-Derivatives), and optimized for execution in HPC, Cloud, etc.
--
"},{"location":"assets/torw2020/presentation/#minimizes-human-intervention","title":"Minimizes human intervention","text":"
avoids error-prone parameter settings (reads them from BIDS)
adapts the workflow to the actual data available
while remaining flexible to some design choices (e.g., whether or not to reconstruct surfaces, or which target standard spaces to normalize to)
???
fMRIPrep minimizes human intervention because the user does not need to fiddle with any parameters - they are obtained from the BIDS structure. However, fMRIPrep does allow some flexibility to ensure the preprocessing meets the requirements of the intended analyses.
--
"},{"location":"assets/torw2020/presentation/#fmriprep-bundles-many-tools-afni-fsl-freesurfer-nilearn-etc","title":"fMRIPrep bundles many tools (AFNI, FSL, FreeSurfer, Nilearn, etc.)","text":"
(do not reinvent the wheel)
???
Finally, fMRIPrep stands on the shoulders of giants: AFNI, FSL, FreeSurfer, Nilearn, etc. all implement well-established methods and are thoroughly tested on their own.
"},{"location":"assets/torw2020/presentation/#we-started-fmriprep-in-february-2016","title":"We started fMRIPrep in February 2016","text":""},{"location":"assets/torw2020/presentation/#objectives","title":"Objectives:","text":"
Develop an fMRI preprocessing tool enforcing BIDS for the inputs
Automatically executable within OpenNeuro
"},{"location":"assets/torw2020/presentation/#initially-inspired-by-hcp-pipelines","title":"Initially inspired by HCP Pipelines","text":"
Problem: robustness vs. the wide variability of inputs
???
We began working on fMRIPrep back in 2016 with much humbler expectations: we needed to develop an fMRI preprocessing tool leveraging BIDS, smart enough to adapt the workflow to the input dataset, and executable in OpenNeuro without human intervention.
Please note that at the time, the BIDS-Apps specification didn't exist yet.
We started out with an eye on HCP Pipelines, and soon identified that datasets in OpenNeuro varied extremely in terms of acquisition protocols and imaging parameters, which is definitely not a problem for HCP Pipelines, which has very specific requirements for the inputs.
"},{"location":"assets/torw2020/presentation/#fmriprep-adoption-and-popularization-brought-new-challenges","title":"fMRIPrep adoption and popularization brought new challenges","text":"
.pull-right[
]
???
With the fast adoption and popularization of fMRIPrep, new challenges surfaced.
On the right-hand side, you'll find the chart of unique visitors to fmriprep.org, which is the documentation website.
--
.pull-left[
"},{"location":"assets/torw2020/presentation/#transparency-was-addressed-with","title":"Transparency was addressed with:","text":"
the individual reports;
the thorough documentation; and
the citation boilerplate. ]
???
We realized that transparency is indeed a very hard problem. The first leg of our solution was the creation of a solid report system. fMRIPrep generates one individual report per participant, containing information not just to quality control the results, but also to understand the processing flow.
We also strove for comprehensive, thorough documentation.
Finally, the so-called citation boilerplate appended to the individual reports describes the actual workflow that has been run, noting all the software that was applied, including versions and references.
--
.pull-left[
"},{"location":"assets/torw2020/presentation/#run-to-run-repeatability-is-an-open-issue","title":"Run-to-run repeatability is an open issue:","text":"
Reproducibility, in terms of run-to-run repeatability of results, has become a more apparent problem, and we are always trying to minimize the vibration caused by computational factors, software versions, etc.
massive amounts of bug reports, questioning the robustness
organic emergence of fMRIPrep enthusiasts (thanks to E. DuPre, JD. Kent) ]
???
We always paid close attention to all the feedback channels. At some point we were flooded with bug reports that we needed to address. We also started to doubt the robustness against the variability of inputs, and set up a thorough stress-test plan using data from OpenNeuro (reported in our Nat Meth paper). Amid this flood of feedback, some external friends started to emerge and lent a hand in answering questions, fixing bugs, etc.
In particular, I want to thank Elizabeth DuPre (McGill) and James Kent (Univ. of Iowa) for being the earliest adopters and contributors.
"},{"location":"assets/torw2020/presentation/#fmriprep-is-stable-today-although-unfinished","title":"fMRIPrep is stable today, although unfinished","text":"
(Esteban et al., 2019)
???
These developments resulted in the following default processing workflow.
At the highest level, anatomical preprocessing (left-hand block) and functional preprocessing (right-hand block) can be clearly identified as the largest workflow units.
fMRIPrep combines all the anatomical images at the input in one anatomical reference, removes the intensity non-uniformity, delineates brain tissues, reconstructs surfaces, and spatially normalizes the anatomical reference to one or more standard spaces.
On the functional pathway, a reference is calculated for further processes, then head-motion parameters are estimated (please note head-motion is accounted for in the last resampling step, in combination with other transforms), and slice-timing correction is applied if requested.
Then, susceptibility distortion is estimated, if sufficient information (in terms of acquisition and metadata) is found in the BIDS structure.
Finally, data are mapped to the same individual's anatomical reference and outputs in the several output spaces requested are generated, along with a file gathering time-series of nuisance signals.
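The idea of accounting for head motion in combination with other transforms can be illustrated with a toy sketch: several affine transforms are composed into a single matrix, so that the data need to be interpolated only once. The 2D matrices, their values, and the function names below are hypothetical, and this is not fMRIPrep's code.

```python
def matmul(a, b):
    # Multiply two square matrices stored as nested lists
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def apply(matrix, point):
    # Apply a 3x3 homogeneous transform to a 2D point
    x, y = point
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


# Hypothetical transforms (values chosen only for illustration)
head_motion = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [0.0, 0.0, 1.0]]  # translation
to_anatomical = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # 90-degree rotation

# Compose once: head motion is applied first, then the mapping to the anatomical
combined = matmul(to_anatomical, head_motion)

point = (3.0, 4.0)
step_by_step = apply(to_anatomical, apply(head_motion, point))
single_step = apply(combined, point)
print(step_by_step, single_step)
```

Both routes land on the same coordinates, but the composed matrix requires a single resampling of the data, which avoids accumulating interpolation blur.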
Let's walk through one example report. Reports have several sections, starting with a summary indicating the particularities of this dataset and workflow choices made based on the input data.
The anatomical section follows with several visualizations to assess the anatomical processing steps mentioned before, spatial normalization to template spaces (the flickering panel helps assess alignment) and finally surface reconstruction.
Then, the sections for all functional runs are concatenated, and all show the same structure. After an initial summary of this particular run, the alignment to the same subject's anatomical image is presented, with contours of the white and pial surfaces as cues. The next panel shows the brain mask and ROIs utilized by the CompCor denoising. For each run we then find some visualizations to assess the generated confounding signals.
After all functional runs are presented, the About section keeps information to aid reproducibility of results, such as the software's version, or the exact command line run.
The boilerplate is found next, with a text version shown by default and tabs to convert to Markdown and LaTeX.
Reports conclude with a list of encountered errors (if any).
"},{"location":"assets/torw2020/presentation/#reports-are-a-crucial-element-to-ensure-transparency","title":"Reports are a crucial element to ensure transparency","text":"
.pull-left[
]
.pull-right[
.distribute[ fMRIPrep generates one participant-wide report after execution.
Reports describe the data as found, and the steps applied (providing .blue[visual support to look inside the box]):
show researchers their data;
show how fMRIPrep interpreted the data (describing the actual preprocessing steps);
quality control of results, facilitating early error detection. ] ]
???
Therefore, reports have become a fundamental feature of fMRIPrep because they not only allow assessing the quality of the processing, but also provide an insight about the logic supporting such processing.
In other words, reports help answer what was done and why it was done, in addition to how well it was done.
"},{"location":"assets/torw2020/presentation/#documentation-as-a-second-leg-of-transparency-fmripreporg","title":"Documentation as a second leg of transparency (fmriprep.org)","text":"
Hackathons & docu-sprints
the CompCor documentation example
.large[fmriprep.org]
???
We promptly identified the need for very comprehensive documentation. The website at fmriprep.org covers a substantial area of how the tool works under the hood and how to best operate it.
The documentation turned out to be a great ice breaker for contributors, who have pushed forward fundamental sections of it.
Most of the largest increments in documentation are the result of discussions in hackathons, docusprints, NeuroStars, GitHub, etc. A hallmark example was pull request 1877 by Karolina Finc, who gathered together a massive amount of knowledge from many contributors. Now this is up and open on our documentation website.
"},{"location":"assets/torw2020/presentation/#fmriprep-is-more-of-a-community-driven-project-every-day","title":"fMRIPrep is more of a community-driven project every day","text":"
Bug-fixes: we ensured that open feedback channels were attended (GitHub, NeuroStars, mailing list, etc.);
users began also proposing new features (some including code!);
with NiPreps we are working towards handing the project over to the community.
???
To ensure the future sustainability of the project (what some developers call Bus factor), we are transitioning the tool to NiPreps, transferring the large community nurtured over the past four years with it.
--
"},{"location":"assets/torw2020/presentation/#how-does-fmriprep-compensate-its-contributors","title":"How does fMRIPrep compensate its contributors?","text":"
Contributors are invited to coauthor relevant publications about fMRIPrep.
Anyone who helps with documentation, code or relevant discussions is a contributor.
.pull-left[
]
.pull-right[
]
???
In return, beyond the rewards of being part of an open source project, fMRIPrep gives some scientific credit back in the form of publications.
All contributors are invited to coauthor these publications.
Anything that helps the project is considered a sufficient contribution.
"},{"location":"assets/torw2020/presentation/#lessons-learned","title":"Lessons learned","text":""},{"location":"assets/torw2020/presentation/#researchers-want-to-spend-more-time-on-those-areas-most-relevant-to-them","title":"Researchers want to spend more time on those areas most relevant to them","text":"
(probably not preprocessing...)
???
With the development of fMRIPrep we understood that researchers don't want to waste their time on preprocessing (except for researchers developing new preprocessing techniques).
--
"},{"location":"assets/torw2020/presentation/#writing-fmriprep-required-a-team-of-several-experts-in-processing-methods-for-neuroimaging-with-a-solid-base-on-computer-science","title":"Writing fMRIPrep required a team of several experts in processing methods for neuroimaging, with a solid base on Computer Science.","text":"
(research programs just can't cover the neuroscience and the engineering of the whole workflow - we need to divide the labor)
???
The current neuroimaging workflow requires extensive knowledge in sometimes orthogonal fields such as neuroscience and computer science. Dividing the labor among labs, communities or individuals with the necessary expertise is fundamental for the advancement of the whole field.
--
"},{"location":"assets/torw2020/presentation/#transparency-helps-against-the-risk-of-super-easy-tools","title":"Transparency helps against the risk of super-easy tools","text":"
(easy-to-use tools are risky because they might get a researcher very far with no idea whatsoever of what they've done)
???
There is an implicit risk in making things too easy to operate:
For instance, imagine someone who runs fMRIPrep on diffusion data by tricking the BIDS naming into an apparently functional MRI dataset. If fMRIPrep reached the end at all, the garbage at the output could be fed into further tools, in a sort of a snowballing problem.
When researchers have access to the guts of the software and are given an opportunity to understand what's going on, the risk of misuse dips.
--
"},{"location":"assets/torw2020/presentation/#established-toolboxes-do-not-have-incentives-for-compatibility","title":"Established toolboxes do not have incentives for compatibility","text":"
(and to some extent this is not necessarily bad, as long as they are kept well-tested and they embrace/help-develop some minimal standards)
???
AFNI, ANTs, FSL, FreeSurfer, SPM, etc. have comprehensive software validation tests, methodological validation tests, stress tests, etc. - which pushed up their quality and made them fundamental for the field.
Therefore, it is better to keep things that way (although some minimal efforts towards convergence in compatibility are of course welcome)
The enormous success of fMRIPrep led us to propose its generalization to other MRI and non-MRI modalities, as well as nonhuman species (for instance, rodents), and particular populations currently unsupported by fMRIPrep such as infants.
"},{"location":"assets/torw2020/presentation/#augmenting-scanners-to-produce-analysis-grade-data","title":"Augmenting scanners to produce \"analysis-grade\" data","text":""},{"location":"assets/torw2020/presentation/#data-directly-consumable-by-analyses","title":"(data directly consumable by analyses)","text":"
.pull-left[
Analysis-grade data is an analogy to the concept of \"sushi-grade (or sashimi-grade) fish\" in that both are:
.large[minimally preprocessed,]
and
.large[safe to consume directly.] ]
.pull-right[ ]
???
The goal, therefore, of NiPreps is to extend scanners so that, in a way, they produce data ready for analysis.
We liken these analysis-grade data to sushi-grade fish, because in both cases the product is minimally preprocessed and at the same time safe to consume as is.
For the last two years we've been decomposing the architecture of fMRIPrep, spinning off its constituent parts that are valuable in other applications.
This process of decoupling (to use a proper CS term) has been greatly facilitated by the modular nature of the code since its inception.
???
The processing elements extracted from fMRIPrep can be mapped to three regimes of responsibility:
Software infrastructure, composed of tools that enable collaboration and provide the most basic tooling.
Middleware utilities, which build more advanced tooling based on the foundational infrastructure.
And at the top of the stack end-user applications - namely fMRIPrep, dMRIPrep, sMRIPrep and MRIQC.
As we can see, the boundaries of these three architectural layers are soft and tools such as TemplateFlow may stand in between.
Only projects enclosed in the brain shape pertain to the NiPreps community. NiPype, NiBabel and BIDS are so deeply embedded as dependencies that NiPreps can't be understood without them.
BIDS provides a standard, guaranteeing I/O agreements:
Allows workflows to self-adapt to the inputs
Ensures the shareability of the results
PyBIDS: a Python tool to query BIDS datasets (Yarkoni et al., 2019):
>>> from bids import BIDSLayout\n\n# Point PyBIDS to the dataset's path\n>>> layout = BIDSLayout(\"/data/coolproject\")\n\n# List the participant IDs of present subjects\n>>> layout.get_subjects()\n['01', '02', '03', '04', '05']\n\n# List session identifiers, if present\n>>> layout.get_sessions()\n['01', '02']\n\n# List functional MRI tasks\n>>> layout.get_tasks()\n['rest', 'nback']\n
???
BIDS is one of the keys to success for fMRIPrep and consequently, a strategic element of NiPreps.
Because the tools so far are written in Python, PyBIDS is a powerful tool to index and query inputs and outputs.
The code snippet illustrates how easy it is to list the subject identifiers, sessions, and tasks available in the dataset.
All NiPreps must write out BIDS-Derivatives. As illustrated in the example, the outputs of fMRIPrep are very similar to the BIDS standard for acquired data.
All end-user applications in NiPreps must conform to the BIDS-Apps specifications.
The BIDS-Apps paper identified a common pattern in neuroimaging studies, where individual participants (and runs) are processed first individually, and then based on the outcomes, further levels of data aggregation are executed.
For this reason, BIDS-Apps define two major levels of execution: participant and group level.
Finally, the paper also stresses the importance of containerizing applications to ensure long-term preservation of run-to-run repeatability and proposes a common command line interface as described at the bottom:
first the name of the BIDS-App (fmriprep, in this case)
followed by input and output directories (respectively),
to finally indicate the analysis level (always participant, for the case of fmriprep)
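The two levels of execution mentioned above (participant first, then group) can be sketched as follows. The functions run_participant and run_group, the score field, and the input values are hypothetical placeholders and do not belong to any BIDS-App.

```python
from statistics import mean


def run_participant(participant_id, values):
    # Participant level: each subject is processed independently
    return {'participant': participant_id, 'score': mean(values)}


def run_group(participant_results):
    # Group level: aggregate the outcomes of the participant level
    return mean(result['score'] for result in participant_results)


# Hypothetical per-participant measurements
dataset = {'01': [1.0, 2.0, 3.0], '02': [4.0, 6.0]}

participant_results = [run_participant(pid, vals) for pid, vals in dataset.items()]
group_result = run_group(participant_results)
print(participant_results, group_result)
```

Because the participant level has no dependencies across subjects, those calls can be distributed freely (for instance, one job per participant), while the group level only starts once their outcomes exist.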
.pull-left[
from nipype.interfaces.fsl import BET\nbrain_extract = BET(\n in_file=\"/data/coolproject/sub-01/ses-01/anat/sub-01_ses-01_T1w.nii\",\n out_file=\"/out/sub-01/ses-01/anat/sub-01_ses-01_desc-brain_T1w.nii\"\n)\nbrain_extract.run()\n
Nipype is the gateway to mix-and-match from AFNI, ANTs, Dipy, FreeSurfer, FSL, MRTrix, SPM, etc. ]
.pull-right[
]
???
Nipype is the glue stitching together all the underlying neuroimaging toolboxes and provides the execution framework.
The snippet shows how the widely known BET tool from FSL can be executed using Nipype. BET is one example of an interface; interfaces provide uniform access to the underlying tools from Python.
Finally, by combining these interfaces, we generate processing workflows to fulfill higher-level processing tasks.
???
For instance, we may have a look into fMRIPrep's functional processing block.
Nipype helps understand the workflow (and opens windows in the black box) by generating these graph representations.
\"\"\"Fix the affine of a rodent dataset, imposing 0.2x0.2x0.2 [mm].\"\"\"\nimport numpy as np\nimport nibabel as nb\n\n# Open the file\nimg = nb.load(\"sub-25_MGE_MouseBrain_3D_MGE_150.nii.gz\")\n\n# New (correct) affine\naff = np.diag((-0.2, -0.2, 0.2, 1.0))\n\n# Use nibabel to reorient to canonical\ncard = nb.as_closest_canonical(nb.Nifti1Image(\n img.dataobj,\n np.diag((-0.2, -0.2, 0.2, 1.0)),\n None\n))\n\n# Save to disk\ncard.to_filename(\"sub-25_T2star.nii.gz\")\n
???
NiBabel allows Python to easily access neuroimaging data formats such as NIfTI, GIFTI and CIFTI2.
Although this might seem a trivial task, the proliferation of neuroimaging software has led to some sort of Wild West of formats, and sometimes interoperation is not ensured.
"},{"location":"assets/torw2020/presentation/#in-the-snippet-we-can-see-how-we-can-manipulate-the-orientation-headers-of-a-nifti-volume-in-particular-a-rodent-image-with-incorrect-affine-information","title":"In the snippet, we can see how we can manipulate the orientation headers of a NIfTI volume, in particular a rodent image with incorrect affine information.","text":"
.pull-left[
Transforms typically are the outcome of image registration methodologies
The proliferation of software implementations of image registration methodologies has resulted in a spread of data structures and file formats used to preserve and communicate transforms.
(Esteban et al., 2020) ]
.pull-right[
]
???
NiTransforms is a super-interesting toy project where we are exercising our finest coding skills. It complements NiBabel in the effort of making spatial transforms calculated by neuroimaging software tools interoperable.
When it goes beyond the alpha state, it is expected to be merged into NiBabel.
At the moment, NiTransforms is already integrated in fMRIPrep (version 20.1 and later) to concatenate LTA (linear affine) transforms obtained with FreeSurfer, ITK transforms obtained with ANTs, and motion parameters estimated with FSL.
Compatibility across formats is hard due to the many arbitrary decisions in establishing the mathematical framework of the transform and the intrinsic confusion of applying a transform.
While intuitively we understand applying a transform as \"transforming the moving image so that I can represent it overlaid or fused with the reference image and both should look aligned\", in reality, we only transform coordinates from the reference image into the moving image's space (step 1 on the right).
Once we know where the center of every voxel of the reference image falls in the moving image coordinate system, we read in the information (in other words, a value) from the moving image. Because the location will probably be off-grid, we interpolate such a value from the neighboring voxels (step 2).
Finally (step 3) we generate a new image object with the structure of the reference image and the data interpolated from the moving information. This new image object is the moving image \"moved\" on to the reference image space and thus, both look aligned.
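The three steps above can be sketched in a few lines of dependency-free Python. This is a toy 2D illustration under simplifying assumptions: it works directly on voxel indices (ignoring the images' own affines), uses bilinear interpolation, and fills locations outside the moving image with zeros. The function names are hypothetical and this is not how NiTransforms is implemented.

```python
import math


def apply_affine(matrix, point):
    """Map a 2D point with a 2x3 affine matrix (linear part plus offset)."""
    (a, b, tx), (c, d, ty) = matrix
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


def bilinear(image, x, y):
    """Interpolate image (nested lists indexed [x][y]) at an off-grid location."""
    nx, ny = len(image), len(image[0])
    if not (0 <= x <= nx - 1 and 0 <= y <= ny - 1):
        return 0.0  # outside the moving image's field of view
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, nx - 1), min(y0 + 1, ny - 1)
    wx, wy = x - x0, y - y0
    return (
        image[x0][y0] * (1 - wx) * (1 - wy)
        + image[x1][y0] * wx * (1 - wy)
        + image[x0][y1] * (1 - wx) * wy
        + image[x1][y1] * wx * wy
    )


def resample(moving, reference_shape, ref_to_moving):
    """Resample the moving image onto a grid with the reference's shape."""
    resampled = []
    for i in range(reference_shape[0]):
        row = []
        for j in range(reference_shape[1]):
            # Step 1: map the reference voxel into the moving image's space
            x, y = apply_affine(ref_to_moving, (i, j))
            # Step 2: interpolate a value from the neighboring voxels
            row.append(bilinear(moving, x, y))
        # Step 3: collect the values on the reference grid
        resampled.append(row)
    return resampled


moving = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
# Reference index (i, j) falls at (i + 0.5, j) in the moving image
shift = ((1.0, 0.0, 0.5), (0.0, 1.0, 0.0))
moved = resample(moving, (2, 3), shift)
print(moved)
```

Note that the matrix maps reference coordinates into the moving image, which is why the loop runs over the reference grid and never "pushes" moving voxels forward.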
.pull-left[
The Archive (right) is a repository of templates and atlases
The Python Client (bottom) provides easy access (with lazy-loading) to the Archive
>>> from templateflow import api as tflow\n>>> tflow.get(\n... 'MNI152NLin6Asym',\n... desc=None,\n... resolution=1,\n... suffix='T1w',\n... extension='nii.gz'\n... )\nPosixPath('/templateflow_home/tpl-MNI152NLin6Asym/tpl-MNI152NLin6Asym_res-01_T1w.nii.gz')\n
.large[www.templateflow.org] ]
.pull-right[
]
???
One of the oldest feature requests received from fMRIPrep early adopters was improving the flexibility of spatial normalization to standard templates other than fMRIPrep's default.
For instance, infant templates.
TemplateFlow offers an Archive of templates where they are stored, maintained and re-distributed;
and a Python client that helps access them.
On the right-hand side, a screenshot of the TemplateFlow browser shows some of the templates currently available in the repository. The browser can be reached at www.templateflow.org.
The tool is based on PyBIDS, and the snippet will surely remind you of it. In this case the example shows how to obtain the T1w template corresponding to FSL's MNI space, at the highest resolution.
If the files requested are not in TemplateFlow's cache, they will be pulled down and kept for further utilization.
The Archive allows a rich range of data and metadata to be stored with the template.
Datatypes in the repository cover:
images containing population-average templates,
masks (for instance brain masks),
atlases (including parcellations and segmentations)
transform files between templates
Metadata can be stored with the usual BIDS options.
Finally, templates allow having multiple cohorts, in a similar encoding to that of multi-session BIDS datasets.
Multiple cohorts are useful, for instance, in infant templates with averages at several gestational ages.
NiWorkflows is a miscellaneous mixture of tooling used by downstream NiPreps:
???
NiWorkflows is, historically, the first component detached from fMRIPrep.
For that reason, its scope and vision have very fuzzy boundaries as compared to the other tools.
The most relevant utilities incorporated within NiWorkflows are:
--
The reportlet aggregation and individual report generation system
???
First, the individual report system, which aggregates the visual elements of the reports (which we call \"reportlets\") and generates the final HTML document.
Also, most of the engineering behind the generation of these reportlets and their integration within Nipype is part of NiWorkflows.
--
Custom extensions to NiPype interfaces
???
Beyond the extension of NiPype to generate a reportlet from any given interface, NiWorkflows is the test bed for many utilities that are then upstreamed to nipype.
Also, special interfaces with a limited scope that should not be included in nipype are maintained here.
--
Workflows useful across applications
???
Finally, NiWorkflows indeed offers workflows that can be used by end-user NiPreps. For instance atlas-based brain extraction of anatomical images, based on ANTs.
???
Echo-planar imaging (EPI) data are typically affected by distortions along the phase-encoding axis, caused by the perturbation of the magnetic field at tissue interfaces.
Looking at the reportlet, we can see how in the \"before\" panel, the image is warped.
The distortion is most obvious in the coronal view (middle row) because this image has posterior-anterior phase encoding.
Focusing on the changes between \"before\" and \"after\" correction in this coronal view, we can see how the blue contours delineating the corpus callosum better fit the dark shade in the data after correction.
"},{"location":"assets/torw2020/presentation/#sdcflows-as-integrated-in-fmriprep","title":"SDCFlows, as integrated in fMRIPrep","text":"
REQUIRES (opts. 1 or 2): setting the IntendedFor metadata field of fieldmaps. ]
???
With SDCFlows, fMRIPrep implements a rather sophisticated pipeline for the estimation of susceptibility distortions.
SDCFlows establishes a hierarchy of corrections depending on what the input dataset offers: EPI images with opposed phase-encoding polarities (the so-called PE-Polar correction), fieldmaps (as Gradient Recalled Echo sequences),
or, when requested, fieldmap-less estimation.
After correction, we want to verify that low-frequency distortions have been accounted for and that high-frequency distortions (with extreme regions suffering severe drop-outs) are not excessively present.
.pull-left[
] .pull-right[
]
???
sMRIPrep is the anatomical preprocessing workflow originally proposed with fMRIPrep, split off into its own tool.
With the support of TemplateFlow, the tool now supports spatial normalization to one or more templates found in the TemplateFlow Archive.
It also supports the use of custom templates, provided they are correctly installed in TemplateFlow's cache folder.
???
dMRIPrep and fMRIPrep are, of course, the tip of the iceberg.
dMRIPrep is still in an alpha state, steadily progressing through the path fMRIPrep has delineated for NiPreps.
Hopefully, at this point of the talk fMRIPrep doesn't need further description.
template: newsection layout: false
.middle.center[
"},{"location":"assets/torw2020/presentation/#other-components-of-nipreps","title":"Other components of NiPreps","text":"
]
???
Some additional components of NiPreps were never part of fMRIPrep's codebase, or they have been started recently.
???
Such is the case of the quality control tools.
MRIQC produces visual reports for the efficient screening of acquired (meaning, unprocessed) data - in particular anatomical and functional MRI of the human brain.
CrowdMRI is an internet service where anonymized quality control metrics are uploaded automatically as they are computed by MRIQC.
The end goal is to gather enough data to describe the normative distribution of these metrics across image parameters and scanning devices and sites.
Finally, MRIQCnets encloses several machine learning projects regarding the quality of acquired images.
"},{"location":"assets/torw2020/presentation/#upcoming-new-utilities","title":"Upcoming new utilities","text":""},{"location":"assets/torw2020/presentation/#nibabies","title":"NiBabies","text":"
Recently started, covering infant MRI brain-extraction for now (Mathias Goncalves)
Recently started, covering rodent MRI brain-extraction for now (Eilidh MacNicol)
???
So, what's coming up next?
NiBabies is some sort of NiWorkflows equivalent for the preprocessing of infant imaging. At the moment, only atlas-based brain extraction using ANTs (adapted from NiWorkflows) is in active development.
Next steps include brain tissue segmentation.
Similarly, NiRodents is the NiWorkflows parallel for the preprocessing of rodent preclinical imaging. Again, only atlas-based brain extraction adapted from NiWorkflows is being developed.
In the medium term, both NiBabies and NiRodents should allow the extension of fMRIPrep to these two new, idiosyncratic data families.
In addition, plans for a molecular imaging or PET preprocessing NiPrep are being designed.
"},{"location":"assets/torw2020/presentation/#conclusion","title":"Conclusion","text":""},{"location":"assets/torw2020/presentation/#nipreps-is-a-framework-for-the-development-of-preprocessing-workflows","title":"NiPreps is a framework for the development of preprocessing workflows","text":"
Principled design, with BIDS as a strategic component
Leveraging existing, widely used software
Using NiPype as a foundation
???
To wrap up, I've presented NiPreps, a framework for developing preprocessing workflows inspired by fMRIPrep.
The framework is heavily principled and adopts BIDS as a foundational component.
NiPreps should not reinvent any wheel, trying to reuse as much as possible of the widely used and tested existing software.
Nipype serves as the glue component to orchestrate workflows.
We propose to consider preprocessing as part of the image acquisition and reconstruction
When setting the boundaries that way, it seems sensible to pursue some standardization in the preprocessing:
Less experimental degrees of freedom for the researcher
Researchers can focus on the analysis
More homogeneous data at the output (e.g., for machine learning)
How:
Transparency is key to success: individual reports and documentation (open source is implicit).
Best engineering practices (e.g., containers and CI/CD)
???
But why just preprocessing, with a very strict scope?
We propose to think about preprocessing as part of the image acquisition and reconstruction process (in other words, scanning), rather than part of the analysis workflow.
This decoupling from analysis comes with several upshots:
First, there are fewer moving parts to play with for researchers in the attempt to fit their methods to the data (instead of fitting data with their methods).
Second, such division of labor allows the researcher to use their time in the analysis.
Finally, two preprocessed datasets from two different studies and scanning sites should be more homogeneous when processed with the same instruments, in comparison to processing them with idiosyncratic, lab-managed, preprocessing workflows.
However, for NiPreps to work we need to make sure the tools are transparent.
Transparency comes not just from the individual reports and thorough documentation, but also from community-driven development. For instance, the peer-review process that accompanies large incremental changes is fundamental to ensure the quality of the tool.
In addition, best engineering practices suggested in the BIDS-Apps paper, along with those we have been including with fMRIPrep, are necessary to ensure the quality of the final product.
As an open problem, validating the results of the tool remains extremely challenging because of the lack of gold-standard datasets that can tell us the best possible outcome.
"},{"location":"community/","title":"Join the NiPreps Community","text":"
One of the pillars of fMRIPrep, the seed project for NiPreps, has been nurturing an open-source community. Building Welcoming Communities is crucial for open-source software for several reasons:
Engaging users and contributors (in a very liberal sense, not just with code) helps establish a development road-map:
In the case of fMRIPrep, many users have reported bugs via our issue tracker and Neurostars.org. Even though testing is one of the primary focuses for fMRIPrep, without these bug-report contributions the tool would have never reached the dependability level it requires to serve its purpose.
Users identify and propose new features, often illuminating shady areas the most involved developers did not find time or the right context to explore.
The community gives the software exposure and extends its reach beyond the core developers. The neuroimaging discussion supported by Neurostars.org has been a key factor for the adoption of fMRIPrep.
Users always give back, and it is not uncommon to see elaborate responses to bug-reports and questions about fMRIPrep on Neurostars.org by users who had similar questions previously.
Because of the scientific purpose of NiPreps, there is one more fundamental reason to grow a (scientific) community around the tools: rigor/scrutiny. As one reviews a few of the most discussed pull-requests to fMRIPrep, very soon they realize that we don't just need to get the code right. We strive for integrating high-quality code, but even more importantly, that code must get the scientific method it implements right. This is particularly difficult because in most of the cases there aren't test oracles (in software engineering terms) or gold-standards (in scientific terms) to efficiently evaluate the validity of new features (even to exercise a minuscule area of the domain of inputs). The redundancy of expert eyes looking at our code has only helped make it better.
"},{"location":"community/#current-members-of-the-github-organization","title":"Current members of the GitHub organization","text":"
A total of 100 neuroimagers have already joined us. Becoming a member will give you access to additional forums for discussion, notifications about events and meetings, etc. You can request to be added to the organization by creating a new issue here.
"},{"location":"community/CODE_OF_CONDUCT/","title":"NiPreps Code of Conduct","text":""},{"location":"community/CODE_OF_CONDUCT/#our-pledge","title":"Our Pledge","text":"
In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socioeconomic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting Oscar Esteban at oesteban@stanford.edu or Chris Markiewicz at markiewicz@stanford.edu, two members of the project team. All complaints will be reviewed and investigated and will result in a response that is deemed necessary and appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership.
This Code of Conduct is adapted from the Contributor Covenant, version 1.4, available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
For answers to common questions about this code of conduct, see https://www.contributor-covenant.org/faq
Welcome to the NiPreps project! We're excited you're here and want to contribute.
Imposter's syndrome disclaimer [1]: We want your help. No, really.
There may be a little voice inside your head that is telling you that you're not ready to be an open-source contributor; that your skills aren't nearly good enough to contribute. What could you possibly offer a project like this one?
We assure you - the little voice in your head is wrong. If you can write code at all, you can contribute code to open-source. Contributing to open-source projects is a fantastic way to advance one's coding skills. Writing perfect code isn't the measure of a good developer (that would disqualify all of us!); it's trying to create something, making mistakes, and learning from those mistakes. That's how we all improve, and we are happy to help others learn.
Being an open-source contributor doesn't just mean writing code, either. You can help out by writing documentation, tests, or even giving feedback about the project (and yes - that includes giving feedback about the contribution process). Some of these contributions may be the most valuable to the project as a whole, because you're coming to the project with fresh eyes, so you can see the errors and assumptions that seasoned contributors have glossed over.
NiPreps are built around three overarching principles:
Robustness - The pipeline adapts the preprocessing steps depending on the input dataset and should provide results as good as possible independently of scanner make, scanning parameters or presence of additional correction scans (such as fieldmaps).
Ease of use - Thanks to dependence on the BIDS standard, manual parameter input is reduced to a minimum, allowing the pipeline to run in an automatic fashion.
\"Glass box\" philosophy - Automation should not mean that one should not visually inspect the results or understand the methods. Thus, NiPreps provides visual reports for each subject, detailing the accuracy of the most important processing steps. This, combined with the documentation, can help researchers to understand the process and decide which subjects should be kept for the group level analysis.
These principles distill some design and organizational foundations:
NiPreps only and fully support BIDS and BIDS-Derivatives for the input and output data.
NiPreps are packaged as fully-compliant BIDS-Apps, not just in their user interfaces, but also in their continuous integration, testing, and delivery.
The scope of NiPreps is strictly limited to preprocessing tasks.
NiPreps are agnostic to subsequent analysis, i.e., any software supporting BIDS-Derivatives for its inputs should be amenable to analyze data preprocessed with them.
NiPreps are thoroughly and transparently documented (including the generation of individual, visual reports with a consistent format that serve as scaffolds for understanding the underpinnings and design decisions).
NiPreps are community-driven, and contributors (in any sense) always get credited with authorship within relevant publications.
NiPreps are modular, reliant on widely-used tools such as AFNI, ANTs, FreeSurfer, FSL, NiLearn, or DIPY [7-12] and extensible via plug-ins.
"},{"location":"community/CONTRIBUTING/#practical-guide-to-submitting-your-contribution","title":"Practical guide to submitting your contribution","text":"
These guidelines are designed to make it as easy as possible to get involved. If you have any questions that aren't discussed below, please let us know by opening an issue!
Before you start, you'll need to set up a free GitHub account and sign in. Here are some instructions.
Already know what you're looking for in this guide? Jump to the following sections:
Joining the conversation
Contributing through Github
Understanding issues
Making a change
Structuring contributions
Licensing
Recognizing contributors
"},{"location":"community/CONTRIBUTING/#joining-the-conversation","title":"Joining the conversation","text":"
NiPreps is maintained by a growing group of enthusiastic developers\u2014 and we're excited to have you join! Most of our discussions will take place on open issues.
We also encourage users to report any difficulties they encounter on NeuroStars, a community platform for discussing neuroimaging.
We actively monitor both spaces and look forward to hearing from you in either venue!
"},{"location":"community/CONTRIBUTING/#contributing-through-github","title":"Contributing through GitHub","text":"
git is a really useful tool for version control. GitHub sits on top of git and supports collaborative and distributed working.
If you're not yet familiar with git, there are lots of great resources to help you git started! Some of our favorites include the git Handbook and the Software Carpentry introduction to git.
On GitHub, you'll use Markdown to chat in issues and pull requests. You can think of Markdown as a few little symbols around your text that will allow GitHub to render the text with a little bit of formatting. For example, you could write words as bold (**bold**), or in italics (*italics*), or as a link ([link](https://youtu.be/dQw4w9WgXcQ)) to another webpage.
GitHub has a really helpful page for getting started with writing and formatting Markdown on GitHub.
Every project on GitHub uses issues slightly differently.
The following outlines how the NiPreps developers think about these tools.
Issues are individual pieces of work that need to be completed to move the project forward. A general guideline: if you find yourself tempted to write a great big issue that is difficult to describe as one unit of work, please consider splitting it into two or more issues.
Issues are assigned labels which explain how they relate to the overall project's goals and immediate next steps.
The current list of issue labels is here and includes:
These issues contain a task that is amenable to new contributors because it doesn't entail a steep learning curve.
If you feel that you can contribute to one of these issues, we especially encourage you to do so!
These issues point to problems in the project.
If you find a new bug, please give as much detail as possible in your issue, including steps to recreate the error. If you experience the same bug as one already listed, please add any additional information that you have as a comment.
These issues are asking for new features and improvements to be considered by the project.
Please try to make sure that your requested feature is distinct from any others that have already been requested or implemented. If you find one that's similar but there are subtle differences, please reference the other request in your issue.
In order to define priorities and directions in the development roadmap, we have two sets of special labels:
Label Description Estimation of the downstream impact the proposed feature/bugfix will have. Estimation of effort required to implement the requested feature or fix the reported bug.
One way to understand these labels is to consider how they would apply to an imaginary issue. For example, if -- after a release -- a bug is identified that re-introduces a previously solved issue (i.e., it regresses the code outputs to some undesired behavior), we might assign it the following labels: . Its development priority would then be \"high\", since it is a low-effort, high-impact change.
Long-term goals may be labelled as a combination of: and or since they will have a high-impact on the code-base, but require a medium or high amount of effort. Of note, issues with the labels: or are less likely to be addressed because they are less likely to impact the code-base, or because they will require a very high activation energy to do so.
"},{"location":"community/CONTRIBUTING/#making-a-change","title":"Making a change","text":"
We appreciate all contributions to NiPreps, but those accepted fastest will follow a workflow similar to the following:
Comment on an existing issue or open a new issue referencing your addition. This allows other members of the NiPreps development team to confirm that you aren't overlapping with work that's currently underway and that everyone is on the same page with the goal of the work you're going to carry out. This blog is a nice explanation of why putting this work in up front is so useful to everyone involved.
Fork the particular NiPrep repository (e.g., fMRIPrep) with your GitHub user. This is now your own unique copy of that particular NiPreps component. Changes here won't affect anyone else's work, so it's a safe space to explore edits to the code!
Clone your forked NiPreps repository to your computer. While you can edit files directly on GitHub, sometimes the changes you want to make will be complex and you will want to use a text editor installed on your computer (one great text editor is VS Code). In order to work on the code locally, you must clone your forked repository. To keep up with changes in the NiPreps repository, add the \"upstream\" NiPreps repository as a remote to your locally cloned repository.
Create a new branch to develop and maintain the proposed code changes. For example:
git fetch upstream # Always start with an updated upstream\ngit checkout -b fix/bug-1222 upstream/master\n
Please consider using appropriate branch names, such as those listed below, and mind that some of them are special (e.g., doc/ and docs/):
fix/<some-identifier>: for bugfixes
enh/<feature-name>: for new features
doc/<some-identifier>: for documentation improvements. You should name all your documentation branches with the prefix doc/ or docs/ as that will preempt triggering the full battery of continuous integration tests.
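As a rough sketch of the branch-naming convention above, the following helper classifies a branch by its prefix and flags documentation branches. The function names are hypothetical, and how the continuous integration service actually detects the prefix is not described here, so the check is an assumption.

```python
# Prefixes taken from the list above; doc/ and docs/ are treated alike
BRANCH_PREFIXES = {
    "fix": "bugfix",
    "enh": "new feature",
    "doc": "documentation",
    "docs": "documentation",
}


def classify_branch(name):
    """Return the kind of change a branch name announces, or None."""
    prefix, separator, identifier = name.partition("/")
    if not separator or not identifier:
        return None  # no "<prefix>/<identifier>" structure
    return BRANCH_PREFIXES.get(prefix)


def skips_full_ci(name):
    """Documentation branches avoid the full battery of tests."""
    return classify_branch(name) == "documentation"


print(classify_branch("fix/bug-1222"))
print(skips_full_ci("docs/contributing-typos"))
```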
Make the changes you've discussed, following the NiPreps coding style guide. Try to keep the changes focused: it is generally easier to review changes that address one feature or bug at a time. It can also be helpful to test your changes locally, using a NiPreps development environment. Once you are satisfied with your local changes, add/commit/push them to the branch on your forked repository.
Submit a pull request. A member of the development team will review your changes to confirm that they can be merged into the main code base. Pull request titles should begin with a descriptive prefix (for example, ENH: Support for SB-reference in multi-band datasets):
ENH: enhancements or new features (example)
FIX: bug fixes (example)
TST: new or updated tests (example)
DOC: new or updated documentation (example)
STY: style changes (example)
REF: refactoring existing code (example)
CI: updates to continuous integration infrastructure (example)
MAINT: general maintenance (example)
For works-in-progress, add the WIP tag in addition to the descriptive prefix. Pull-requests tagged with WIP: will not be merged until the tag is removed.
Have your PR reviewed by the development team, and update your changes accordingly in your branch. The reviewers will take special care in helping you address their comments, as well as dealing with conflicts and other tricky situations that could emerge from distributed development.
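The pull-request title prefixes listed in the workflow above lend themselves to a simple check. The sketch below assumes that the WIP tag, when present, precedes the descriptive prefix; the guidelines do not state its exact position, and the function name is hypothetical.

```python
import re

PR_PREFIXES = ("ENH", "FIX", "TST", "DOC", "STY", "REF", "CI", "MAINT")

# Optional leading WIP tag, then "<PREFIX>: <description>"
TITLE_PATTERN = re.compile(
    r"^(?P<wip>WIP:?\s+)?(?P<prefix>{}):\s+(?P<description>\S.*)$".format(
        "|".join(PR_PREFIXES)
    )
)


def parse_pr_title(title):
    """Split a pull-request title into its parts, or return None if malformed."""
    match = TITLE_PATTERN.match(title)
    if match is None:
        return None
    return {
        "prefix": match.group("prefix"),
        "description": match.group("description"),
        # Work-in-progress pull requests are not merged until the tag is removed
        "work_in_progress": match.group("wip") is not None,
    }


print(parse_pr_title("ENH: Support for SB-reference in multi-band datasets"))
```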
Whenever possible, instances of Nipype Nodes and Workflows should use the same names as the variables they are assigned to. This makes it easier to relate the content of the working directory to the code that generated it when debugging.
Workflow variables should end in _wf to indicate that they refer to Workflows and not Nodes. For instance, a workflow whose basename is myworkflow might be defined as follows:
from nipype.pipeline import engine as pe\n\nmyworkflow_wf = pe.Workflow(name='myworkflow_wf')\n
If a workflow is generated by a function, the name of the function should take the form init_<basename>_wf:
If multiple instances of the same workflow might be instantiated in the same namespace, the workflow names and variables should include either a numeric identifier or a one-word description, such as:
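The naming conventions above could look as follows. To keep the sketch self-contained, a minimal `Workflow` stand-in (hypothetical) replaces Nipype's `pe.Workflow` and stores only the name; the workflow basename `myworkflow` and the `lh`/`rh` descriptions are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    """Stand-in for Nipype's pe.Workflow that keeps only the name (hypothetical)."""

    name: str


def init_myworkflow_wf(name="myworkflow_wf"):
    """Workflow-generating function, named init_<basename>_wf."""
    return Workflow(name=name)


# Single instance: the variable and the workflow share the same name
myworkflow_wf = init_myworkflow_wf()

# Several instances in one namespace, told apart by a numeric identifier...
myworkflow0_wf = init_myworkflow_wf(name="myworkflow0_wf")
myworkflow1_wf = init_myworkflow_wf(name="myworkflow1_wf")

# ...or by a one-word description
myworkflow_lh_wf = init_myworkflow_wf(name="myworkflow_lh_wf")
myworkflow_rh_wf = init_myworkflow_wf(name="myworkflow_rh_wf")
```

Keeping the variable name and the `name` argument identical is what lets a folder in the working directory be traced back to the line of code that created it.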
We welcome and recognize all contributions regardless of their size, content or scope: from documentation to testing and code development. You can see a list of current developers and contributors in our zenodo file. Before every release, a new zenodo file will be generated. The update script will also sort creators and contributors by the relative size of their contributions, as provided by the git-line-summary utility distributed with the git-extras package. Last positions in both the creators and contributors list will be reserved to the project leaders. These special positions can be revised to add names upon specific request, and are reviewed for removals and reordering on a scheduled basis every two years. All the authors listed as creators participate in the review of these modifications.
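The ordering rule described above (largest contributions first, project leaders last) can be sketched as follows. This is not the project's update script: the function name, the input format (a mapping from names to line counts), and the example names are all hypothetical.

```python
def order_authors(line_counts, leaders):
    """Sort names by contribution size (largest first), keeping leaders last."""
    regular = [name for name in line_counts if name not in leaders]
    # sorted() is stable, so ties keep their original relative order
    regular = sorted(regular, key=lambda name: line_counts[name], reverse=True)
    return regular + list(leaders)


# Hypothetical line counts, for illustration only
counts = {"Contributor A": 120, "Contributor B": 950, "Project Leader": 40}
print(order_authors(counts, leaders=["Project Leader"]))
```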
Anyone listed as a developer or a contributor can start the submission process of a manuscript as first author (please see Membership, where these concepts are described). To compose the author list, all the creators MUST be included (except for those people who opt to drop out) and all the contributors MUST be invited to participate. First authorship(s) is (are) reserved for the authors who originated and sustained the initiative of submission and wrote the manuscript. To generate the ordering of your paper, please run python .maint/paper_author_list.py from the root of the repository, on the up-to-date upstream/master branch. Then, please modify this list and place your name first. All developers and contributors are merged into a single list, and last authorships are then assigned. NiPreps and its community adhere to open science principles, such that a pre-print should be posted on an adequate archive service (e.g., arXiv or bioRxiv) prior to publication.
NiPreps is licensed under the Apache 2.0 license. By contributing to NiPreps, you acknowledge that any contributions will be licensed under the same terms.
Based on contributing guidelines from the STEMMRoleModels project.
The imposter syndrome disclaimer was originally written by Adrienne Lowe for a PyCon talk, and was adapted based on its use in the README file for the MetPy project.
The one bit that worries me is that fMRIPrep may become a Swiss army knife. I think instead it should just be a paring knife (small, efficient, and works for many things).
-- Satra (source)
When projects grow large, many forking paths created by newly implemented features start to open up. To account for this, the NiPreps community was created with the vision of building tools like fMRIPrep and MRIQC covering new imaging modalities, while keeping existing NiPreps tightly within scope. Defining such a scope also aids the implementation of the ease-of-use principle:
The same way the scanner does not offer an immense space of knobs to turn in the acquisition, NiPreps should not add many additional knobs to those for them to be considered a viable augmentation or extension of the scanner hw/sw.
-- Oscar (source)
"},{"location":"community/features/#the-problem-of-feature-creep","title":"The problem of feature creep","text":"
To avert feature creep and to serve each individual NiPrep, we developed the following guidelines, with the hopes of keeping these tools in a healthy state.
I'm worried fMRIPrep is catching a case of featuritis
-- Mathias (source)
These guidelines should also help the community transparently drive the process of including proposals in the road-map, set the ground for healthy conversation, and establish some patterns for accepting new-feature contributions. Before proposing new features, please be mindful that a road-map may not exist for a particular NiPrep. Even when a development road-map exists, please understand that it is not always possible to follow it rigorously:
I think something like this is what we tried to start sketching out with the development roadmap. The concern, as I remember it, was that we couldn't guarantee (or rule out) specific features when working with a small development team.
-- Elizabeth (source).
"},{"location":"community/features/#proposing-a-new-feature","title":"Proposing a new feature","text":""},{"location":"community/features/#why-the-new-feature-is-requested","title":"Why the new feature is requested?","text":"
Before going ahead and proposing a new feature, please take some time to learn whether the topic has been covered in the past and what decisions were made and why. This should be reasonably easy to do with the search tool of GitHub on the particular NiPrep repository.
If no previous discussion about the new idea is found, the next step is ensuring the new feature aligns with the vision and the scope of the target tool, as Elizabeth points out. Taking a look at the Development Road-map of the particular project (if it exists) may help find an answer.
If the new feature still seems pertinent after this preliminary work or you are unsure about whether it falls within the scope, then go ahead and post an issue requesting feedback on your proposal. Please make sure to clearly state why the new feature should be considered.
"},{"location":"community/features/#some-questions-will-always-be-asked-about-a-new-feature","title":"Some questions will always be asked about a new feature","text":"
These questions by James will certainly help build up the discourse in support of the new feature, as the NiPreps maintainers will consider them:
Is the user interface affected? Because NiPreps generally expose a command-line interface (CLI) for the interaction with the user, new features involving changes to the CLI must be considered with caution as they may harm the ease-of-use:
It also seems that some new features add more confusion than others. Especially when the CLI is affected, and yet another option is added, that makes the tool more complex to use.
-- Alejandro (source).
Does the new feature substantially increase the internal complexity? Maintainers and developers will attempt to consolidate tools and lower the internal complexity whenever possible. This effort usually competes with the addition of new features as they typically will address particular use-cases rather than general improvements. However, that doesn't need to be the case, as some sections of the code might be objectively improvable and the integration of a new feature revising those might also lower complexity. Lowering the internal complexity will always be considered a great incentive for a new feature to be accepted.
Is there a standard procedure for the proposed feature in the literature?
if so, could we just use that procedure/value?
Is the feature dependent on some attribute of the input data? (e.g., TR, duration, etc.)
if so, can the procedure/value be determined algorithmically?
Does the feature interact with other settings? For instance, fmriprep#1962 interacts with the a/tCompCor implementation.
What is the difficulty of implementing the procedure outside of a NiPrep? In other words, does the NiPrep provide all the necessary outputs for a user to perform the non-standard analysis?
"},{"location":"community/features/#how-the-integration-of-the-new-feature-willcan-be-validated","title":"How the integration of the new feature will/can be validated?","text":"
Please propose ways to validate the new feature in the context of the workflow. Meaning, the objective here is to validate that the new feature works well within the pipeline, rather than validating a specific algorithm. To ensure the sustainability of NiPreps, the onus of this validation should be on the person/group requesting the feature.
"},{"location":"community/licensing/","title":"Licensing and Derived Works","text":"
The NiPreps community believes that software is an integral component of scientific practice, and that any scientific claim must be verifiable by following the chain of reasoning from observation to conclusion. To achieve this, software must be free to use, inspect, and critique. We also believe that you should be free to modify our software to improve it or adapt it to new use cases.
As software development is a dynamic process, code modifications can quickly become confusing as the original and modified versions depart from each other. For the sake of transparency and verification, when you modify our code, we ask that you document both the version of the software that you started with and the changes you make.
We believe these freedoms are best promoted by distributing our software under free/open source software licenses, and the license we feel best promotes these goals is the Apache License, Version 2.0.
This page outlines our commitment to transparent development and our expectations for developers who adapt NiPreps code to use in other projects.
"},{"location":"community/licensing/#licensing-of-nipreps-projects","title":"Licensing of NiPreps projects","text":"
All software packages and tools under the NiPreps umbrella must be licensed under the Apache License 2.0 by default, unless otherwise stated. The authors of new NiPreps packages may depart from this general rule when necessary and/or sufficiently justified (e.g., the source code is actually derived from a product licensed under a copyleft license).
Containerized Images bundling NiPreps components and their dependencies can be distributed under a free and open-source license without copyleft, such as the MIT License. In such a case, the attribution notice of the MIT license must be present in the header comment of the container image bootstrapping file (for instance, the so-called Dockerfile). This different licensing must also be indicated in the NOTICE file of the corresponding NiPreps components bundled within the image.
Docker-wrappers such as the fmriprep-docker package may be licensed under any free and open-source license without copyleft, such as the MIT License. This different licensing must also be indicated in the NOTICE file of the corresponding NiPreps components bundled within the image.
Data (distributed within the test data of packages or through the nipreps-data GitHub organization) will preferably be distributed under the Creative Commons Zero v1.0 Universal.
Under no circumstances will any NiPreps software or data be made publicly available unlicensed. If you find any component of NiPreps that is unlicensed, please make us aware at nipreps@gmail.com at your earliest convenience.
(This section is adapted from this blog post by D. Marín)
The Apache License was created by the Apache Software Foundation (ASF) as the license for its Apache HTTP Server.
Like the MIT License, it's a very permissive non-copyleft license that allows using the software for any purpose, distributing it, modifying it, and distributing derived works of it without concern for royalties. Its main differences, compared to the MIT License, are:
Using the Apache License, the authors of the software grant patent licenses to any user or distributor of the code. These patent licenses apply to any patent that, being licensable by any of the software's authors, would be infringed by the piece of code they have created.
The Apache License requires that unmodified parts in derived works keep the License.
In every licensed file, any original copyright, patent, trademark or attribution notices must be preserved.
In every licensed file change, there must be a notification stating that changes have been made in the file.
If the Apache-licensed software includes a NOTICE file, this file and its contents must be preserved in all the derived works.
If anyone intentionally sends a contribution for an Apache-licensed software to its authors, this contribution can automatically be used under the Apache License.
This license is interesting because of the automatic patent license, and the clause about contribution submission.
It's compatible with version 3 of the GPL, so you can mix Apache-licensed code into GPLv3 software.
In the case of scientific software, we believe that clearly stating that a Derived Work introduces changes into the original Work is a fundamental measure of transparency. Other than that, we wanted a permissive, non-copyleft license.
"},{"location":"community/licensing/#what-is-our-expectation-for-derived-works","title":"What is our expectation for Derived Works?","text":"
At the bare minimum, you must meet the conditions of the license (simplified version) about preserving the license text and copyright/attribution notices as well as corresponding statements of changes.
How to state that a file has been changed in a Derived Work. We suggest the following steps, heavily influenced by P. Ombredanne's recommendations at StackExchange:
In each source file, add a note to the header comment stating that the file has been modified, with an approximate date, and a high-level description of the changes. The date and the description of the changes are not strictly required, but they are positive etiquette from a software engineering standpoint and substantially improve the transparency of the changes from a scientific point of view.
If the source file did not have a license notice in the header comment, please add it to avoid ambiguity.
Deleted files: please keep the file with just the header comment and state that the file is deleted. The change statement should follow the suggestion in 1), preferably stating whether the source has been deleted or moved over to other files. If preserving the filename as-is might become confusing to the user of the Derived Work, the filename can be modified to be marked as hidden with a dot . or underscore _ prefix, or modifying the extension.
Preferably, also include a link to the original file in our GitHub repository, making sure the link is done to a particular commit state.
What changes would we like to see annotated? The high-level description of the changes will preferably contain:
Correction of bugs
Substantial performance improvement decisions
Replacement of relevant methods and dependencies by alternatives
Changes to the license
"},{"location":"community/licensing/#example-of-our-expectations","title":"Example of our expectations","text":"
Let's say a Derived Work modifies the sdcflows.viz.utils code-base. The file may or may not have the attribution notice. At the time of writing, the header comment of this file is:
Header comment in the original Work
With attribution notice | Without attribution notice
# emacs: -*- mode: python; py-indent-offset: 4; indent-tabs-mode: nil -*-\n# vi: set ft=python sts=4 ts=4 sw=4 et:\n#\n# Copyright 2021 The NiPreps Developers <nipreps@gmail.com>\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"Visualization tooling.\"\"\"\n
Either way (whether the attribution notice is present or not), we suggest updating this header comment to something along the lines of the following:
Suggested header comment in the Derived Work
Required | Recommended (commit) | Recommended (version)
# <shebang and editor settings can be preserved or removed freely>\n#\n# <your attribution notice, either maintaining the Apache-2.0 license or changing the license>\n#\n# STATEMENT OF CHANGES: This file is derived from sources licensed under the Apache-2.0 terms,\n# and this file has been changed.\n# The original file this work derives from is found at:\n# https://github.com/nipreps/sdcflows/blob/50393a8584dd0abf5f8e16e6ba66c43e1126f844/sdcflows/viz/utils.py\n#\n# [April 2021] CHANGES:\n# * BUGFIX: Outdated function call from the ``svgutils`` dependency that changed API as of version 0.3.2.\n# * ENH: Changed plotting dependency to the new `netplotbrain` package.\n# * DOC: Added docstrings to some functions that lacked them.\n#\n# ORIGINAL WORK'S ATTRIBUTION NOTICE:\n#\n# Copyright 2021 The NiPreps Developers <nipreps@gmail.com>\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"Visualization tooling.\"\"\"\n
The lines highlighted with yellow color are explicitly required by the Apache-2.0 conditions.
# <shebang and editor settings can be preserved or removed freely>\n#\n# <your attribution notice, either maintaining the Apache-2.0 license or changing the license>\n#\n# STATEMENT OF CHANGES: This file is derived from sources licensed under the Apache-2.0 terms,\n# and this file has been changed.\n# The original file this work derives from is found at:\n# https://github.com/nipreps/sdcflows/blob/50393a8584dd0abf5f8e16e6ba66c43e1126f844/sdcflows/viz/utils.py\n#\n# [April 2021] CHANGES:\n# * BUGFIX: Outdated function call from the ``svgutils`` dependency that changed API as of version 0.3.2.\n# * ENH: Changed plotting dependency to the new `netplotbrain` package.\n# * DOC: Added docstrings to some functions that lacked them.\n#\n# ORIGINAL WORK'S ATTRIBUTION NOTICE:\n#\n# Copyright 2021 The NiPreps Developers <nipreps@gmail.com>\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"Visualization tooling.\"\"\"\n
The lines highlighted with green color are recommended by the NiPreps Developers.
# <shebang and editor settings can be preserved or removed freely>\n#\n# <your attribution notice, either maintaining the Apache-2.0 license or changing the license>\n#\n# STATEMENT OF CHANGES: This file is derived from sources licensed under the Apache-2.0 terms,\n# and this file has been changed.\n# The original file this work derives from is found within\n# the version 2.0.2 distribution of the software.\n#\n# [April 2021] CHANGES:\n# * BUGFIX: Outdated function call from the ``svgutils`` dependency that changed API as of version 0.3.2.\n# * ENH: Changed plotting dependency to the new `netplotbrain` package.\n# * DOC: Added docstrings to some functions that lacked them.\n#\n# ORIGINAL WORK'S ATTRIBUTION NOTICE:\n#\n# Copyright 2021 The NiPreps Developers <nipreps@gmail.com>\n#\n# Licensed under the Apache License, Version 2.0 (the \"License\");\n# you may not use this file except in compliance with the License.\n# You may obtain a copy of the License at\n#\n# http://www.apache.org/licenses/LICENSE-2.0\n#\n# Unless required by applicable law or agreed to in writing, software\n# distributed under the License is distributed on an \"AS IS\" BASIS,\n# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n# See the License for the specific language governing permissions and\n# limitations under the License.\n\"\"\"Visualization tooling.\"\"\"\n
The lines highlighted with green color are recommended by the NiPreps Developers.
Although it is not mandated by the letter of the license, the spirit of the Apache-2.0 (and all other licenses stipulating the statement of changes, such as the CC-BY 4.0) suggests that a date of modification and an overview of outstanding changes are pertinent. We also suggest a link to the original code, including the commit-hash (that long string starting with 50393a in the URL above) for the location of the exact origin of the file. Alternatively, Derived Works may point to an exact release identifier where the original file is part of the code-base distribution. Please make sure to remove the comment tags <...> above or replace them with appropriate contents.
What if a Derived Work does not modify this particular file? You should retain the original attribution notice as is (or introduce it if missing), unless you are relicensing the file. In that case, proceed with the suggestions above, and note the license change in the STATEMENT OF CHANGES block of the header comment.
"},{"location":"community/licensing/#are-papers-using-apache-20-licensed-software-considered-as-derived-works","title":"Are papers using Apache-2.0 licensed software considered as Derived Works?","text":"
No, they are not, because they only use the software (in other words, they don't redistribute it). The license stipulates that redistribution must retain the license and attribution notices as they are. In the scientific context, it is likely that a particular tool is modified (for example, to replace a method that you think is not appropriate for your data). Then, redistributing the modified source would be desirable from the transparent reporting point of view, and in doing so you should honor the License.
Generally, works using NiPreps just need to follow the citation guidelines of the particular project and report the citation boilerplate, including all software versions and literature references, keeping as close as possible to the text generated by the tool.
"},{"location":"community/licensing/#licensing-of-docker-and-singularity-images","title":"Licensing of Docker and Singularity images","text":"
Container images redistribute copies of NiPreps alongside their third-party dependencies, all of them bundled in the image. If the applicable license is Apache-2.0, then the text of a NOTICE file must be shown to the user. All NiPreps must insert a NOTICE file into their containerized distributions and print its contents out in the command line output, as well as in the visual reports. This NOTICE file for containers will be placed in the /.docker/NOTICE path of the repository, and this file must replace the /NOTICE file (if it exists) at image building time. Alternatively, and if the corresponding NiPreps Developers consider that the Apache-2.0 imposes too onerous requirements for the container image distribution, the source code of such images (e.g., Dockerfile) can be licensed under the MIT license.
Example NOTICE file for fMRIPrep
Python distribution /NOTICE | Container image distribution /.docker/NOTICE
fMRIPrep\nCopyright 2021 The NiPreps Developers.\n\nThis product includes software developed by\nthe NiPreps Community (https://nipreps.org/).\n\nPortions of this software were developed at the Department of\nPsychology at Stanford University, Stanford, CA, US.\n\nThis software contains code ultimately derived from the epidewarp.fsl\nscript (https://www.nmr.mgh.harvard.edu/~greve/fbirn/b0/epidewarp.fsl)\nby Doug Greve, Dave Tuch, Tom Liu, and Bryon Mueller with generous\nhelp from the FSL crew (www.fmrib.ox.ac.uk/fsl) and the Biomedical\nInformatics Research Network (www.nbirn.net).\n
fMRIPrep Container Image distribution\nCopyright 2021 The NiPreps Developers.\n\nThis product includes fMRIPrep and software developed by\nthe NiPreps Community (https://nipreps.org/).\n\nPortions of this software were developed at the Department of\nPsychology at Stanford University, Stanford, CA, US.\n\nThis product bundles AFNI <version-placeholder>, which is available under\nthe Gnu General Public License.\nMajor portions of AFNI were written at the Medical College of Wisconsin,\nwhich owns the copyright to that code. For fuller details, see\nhttp://afni.nimh.nih.gov/pub/dist/src/README.copyright.\n\nThis product bundles ANTs <version-placeholder>, which is available under\nthe BSD 3-clause license terms.\nCopyright 2009-2013 ConsortiumOfANTS.\n\nThis product bundles BIDS-Validator <version-placeholder>, which is available\nunder the MIT License.\nCopyright 2015 The Board of Trustees of the Leland Stanford Junior University.\n\nThis product bundles the Connectome Workbench <version-placeholder>, which\nis available under the GPL-v2\n(https://www.humanconnectome.org/software/connectome-workbench-license).\n\nThis product bundles FSL <version-placeholder>, which is available\nunder a custom license with commercial restrictions\n(https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/Licence).\nCopyright 2018, The University of Oxford.\n\nThis product bundles FreeSurfer <version-placeholder>, which is available\nunder a custom license and requires obtaining a license key\n(https://surfer.nmr.mgh.harvard.edu/fswiki/FreeSurferSoftwareLicense).\nCopyright 2011, The General Hospital Corporation, Boston MA, USA.\n\nThis product bundles code derived from ICA-AROMA, both (fork and original work)\nare available under the Apache-2.0 license.\n(https://github.com/oesteban/ICA-AROMA/blob/master/license.md)\nCopyright 2021, Maarten Mennes\n\nThis product bundles Miniconda <version-placeholder>, which is available\nunder a BSD 3-clause license.\n(c) 2017 Continuum Analytics, Inc. 
(dba Anaconda, Inc.).\nhttps://www.anaconda.com. All Rights Reserved\n\nThis product bundles NeuroDebian, which adheres to the\nDebian Free Software Guidelines (DFSG)\nhttps://www.debian.org/social_contract#guidelines\nand the terms of the Debian Social Contract version 1.1.\n\nThis product bundles tools by the NiPy community, such as NiBabel\n(MIT License, https://github.com/nipy/nibabel/blob/master/COPYING),\nand NiPype (Apache-2.0, https://github.com/nipy/nipype/blob/master/LICENSE).\n\nThis product bundles Pandoc <version-placeholder>, which is available\nunder the GPL version 2 or later.\nCopyright (C) 2006-2021 John MacFarlane <jgm at berkeley dot edu>\n\nThis product bundles SVGO <version-placeholder>, which is available\nunder the MIT License.\nCopyright (c) Kir Belevich\n\nThis product bundles tedana <version-placeholder>, which is available under\nthe GNU Lesser General Public License v2.1.\nCopyright 2018, tedana developers.\n\nTemplateFlow, a component of this bundle, contains neuroimaging template\nand atlas data under several permissive licenses.\nPlease refer to the metadata of the particular template used in your study to\ndetermine the exact terms of the license and how to acknowledge attribution\nof those works.\n\nsMRIPrep, a component of this bundle, contains code ultimately derived from\nANTs <version-placeholder>, which is available under\nthe BSD 3-clause license terms.\nCopyright 2009-2013 ConsortiumOfANTS.\n\nsMRIPrep, a component of this bundle, contains code ultimately derived from\nMindboggle <version-placeholder>, which is available under\nthe Apache License 2.0.\nCopyright 2016, Mindboggle team (http://mindboggle.info)\n\nfMRIPrep contains code ultimately derived from the epidewarp.fsl\nscript (https://www.nmr.mgh.harvard.edu/~greve/fbirn/b0/epidewarp.fsl)\nby Doug Greve, Dave Tuch, Tom Liu, and Bryon Mueller with generous\nhelp from the FSL crew (www.fmrib.ox.ac.uk/fsl) and the Biomedical\nInformatics Research Network 
(www.nbirn.net).\n
In general, NiPreps embrace a liberal contribution model of governance structure. However, because of the scientific domain of NiPreps, the community features some structure from meritocracy models to prescribe the order in the authors list of new papers about these tools.
Developers are members of a wonderful team driving the project. Names and contacts of all developers are included in the .maint/developers.json file of each project. Examples of steering activities that drive the project are: actively participating in the follow-up meetings, leading documentation sprints, helping in the design of the tool and definition of the roadmap, providing resources (in the broad sense, including funding), code-review, etc.
Contributors listed in the .maint/contributors.json file of each project actively help or have previously helped the project in a broad sense: writing code, writing documentation, benchmarking modules of the tool, proposing new features, helping improve the scientific rigor of implementations, giving out support on the different communication channels (mattermost, NeuroStars, GitHub, etc.). If you are new to the project, don't forget to add your name and affiliation to the list of contributors there! Our Welcome Bot will send an automated message reminding first-time contributors to do so. Before every release, unlisted contributors will be invited again to add their names to the file (just in case they missed the automated message from our Welcome Bot).
Former contributors who were required, or who wished, to disconnect from the project's updates and to drop out from publications and other dissemination activities are listed in the .maint/former.json file.
This document explains how to prepare a new development environment and update an existing environment, as necessary, for the development of NiPreps' components. Some components may deviate from these guidelines; in such a case, please follow the guidelines provided in their documentation.
If you plan to contribute back to the community, making your code available via pull-request, please make sure to have read and understood the Community Documents and Contributor Guidelines. If you plan to distribute derived code, please follow our licensing guidelines.
Development in Docker is encouraged, for the sake of consistency and portability. By default, work should be built off of nipreps/fmriprep:unstable, which tracks the master branch, or nipreps/fmriprep:latest, which tracks the latest release version (see BIDS-Apps execution guide for the basic procedure for running).
It will be assumed the developer has a working repository in $HOME/projects/fmriprep, and examples are also given for niworkflows and NiPype.
"},{"location":"devs/devenv/#patching-a-working-copy-into-a-docker-container","title":"Patching a working copy into a Docker container","text":"
In order to test new code without rebuilding the Docker image, it is possible to mount working repositories as source directories within the container. The Docker wrapper script simplifies this for the most common repositories:
-f PATH, --patch-fmriprep PATH\n working fmriprep repository (default: None)\n -n PATH, --patch-niworkflows PATH\n working niworkflows repository (default: None)\n -p PATH, --patch-nipype PATH\n working nipype repository (default: None)\n
For instance, if your repositories are contained in $HOME/projects:
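As a rough sketch of such a call, the snippet below assembles a wrapper command line from the `-f`, `-n`, `-p` and `-i` flags described on this page. Several details are assumptions made for illustration only: the helper name `patch_command`, the example paths, the choice of pointing each flag at the package directory inside the corresponding repository, and the trailing `participant` analysis level. Please check them against the help text of the wrapper before use.

```python
import shlex
from pathlib import PurePosixPath


def patch_command(projects, bids_dir, out_dir, image="nipreps/fmriprep:latest"):
    """Assemble an ``fmriprep-docker`` call that patches three working copies."""
    projects = PurePosixPath(projects)
    return [
        "fmriprep-docker",
        # Assumption: each flag points at the package directory in the repository.
        "-f", str(projects / "fmriprep" / "fmriprep"),
        "-n", str(projects / "niworkflows" / "niworkflows"),
        "-p", str(projects / "nipype" / "nipype"),
        "-i", image,
        # Hypothetical input dataset, output folder, and analysis level.
        str(bids_dir), str(out_dir), "participant",
    ]


cmd = patch_command("/home/user/projects", "/home/user/data/bids", "/home/user/out")
print(shlex.join(cmd))
```

The printed line can be pasted into a shell; building the argument list first makes it easy to drop any of the three patch flags when only one repository is being tested.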
New dependencies to be inserted into the Docker image will either be Python or non-Python dependencies. Python dependencies may be added in three places, depending on whether the package is large or non-release versions are required. The image must be rebuilt after any dependency changes.
Python dependencies should generally be included in the appropriate dependency metadata of the setup.cfg file found at the root of each repository. If a dependency must be restricted to a particular version (or set thereof), it is possible to use version filters in this setup.cfg file.
For large Python dependencies where there will be a benefit to pre-compiled binaries, conda packages may also be added to the conda install line in the Dockerfile.
Non-Python dependencies must also be installed in the Dockerfile, via a RUN command. For example, installing an apt package may be done as follows:
RUN apt-get update && \\\n apt-get install -y <PACKAGE>\n
If it is necessary to (re)build the Docker image, a local image named fmriprep may be built from within the local repository. Let's assume it is located in ~/projects/fmriprep:
The VERSION build argument is necessary to ensure that help text can be reliably generated. The get_version.py tool constructs the version string from the current repository state.
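The build step might be assembled as in the sketch below, which only composes the `docker build` argument list with the `VERSION` build argument. The helper name `build_command` and the example version string are hypothetical; in practice the version would be the string produced by the get_version.py tool, run from the root of the repository.

```python
import shlex


def build_command(version, tag="fmriprep", context="."):
    """Compose a ``docker build`` call that passes VERSION as a build argument."""
    return [
        "docker", "build",
        "-t", tag,  # local image name, as used in this section
        "--build-arg", f"VERSION={version}",
        context,  # build context: the root of the local repository
    ]


# Placeholder version string (assumption); replace with the get_version.py output.
print(shlex.join(build_command("20.0.1.dev0")))
```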
To work in this image, replace nipreps/fmriprep:latest with just fmriprep in any of the above commands. This image may be accessed by the Docker wrapper via the -i flag, e.g.:
$ fmriprep-docker -i fmriprep --shell\n
"},{"location":"devs/devenv/#code-server-development-environment-experimental","title":"Code-Server Development Environment (Experimental)","text":"
To get the best of working with containers and having an interactive development environment, we have an experimental setup with code-server.
Important
We have a video walking through the process if you want a visual guide.
1. Build the Docker image. We will use the Dockerfile_devel file to build our development docker image:
$ cd $HOME/projects/fmriprep\n$ docker build -t fmriprep_devel -f Dockerfile_devel .\n
2. Run the Docker image. We can start a docker container using the image we built (fmriprep_devel):
$ docker run -it -p 127.0.0.1:8445:8080 -v ${PWD}:/src/fmriprep fmriprep_devel:latest\n
Windows Users
If you are using a Windows shell, ${PWD} may not be defined; use the absolute path to your repository instead.
Docker-Toolbox
If you are using Docker-Toolbox, you will need to change your virtualbox settings using these steps as a guide. For step 6, instead of Name = rstudio; Host Port = 8787; Guest Port = 8787, have Name = code-server; Host Port = 8443; Guest Port = 8080. Then in the docker command above, change 127.0.0.1:8445:8080 to 192.168.99.100:8445:8080.
If the container started correctly, you should see the following on your console:
INFO Server listening on http://localhost:8080\nINFO - No authentication\nINFO - Not serving HTTPS\n
Now you can switch to your favorite browser and go to: 127.0.0.1:8445 (or 192.168.99.100:8445 for Docker Toolbox).
3. Copy fmriprep.egg-info into your fmriprep/ project directory. fmriprep.egg-info makes the package executable inside the Docker container. Open a terminal in vscode and type the following:
$ cp -R /src/fmriprep.egg-info /src/fmriprep/\n
"},{"location":"devs/devenv/#code-server-development-environment-features","title":"Code-Server Development Environment Features","text":"
The editor is vscode
There are several preconfigured debugging tests under the debugging icon in the activity bar
see vscode debugging python for details.
The gitlens and python extensions are preinstalled to improve the development experience in vscode.
As of January 2020, fMRIPrep has adopted a Calendar Versioning scheme, and with it we are attempting to apply more coherent semantic rules to our releases.
Note
This document is a draft for internal and external comment. Any commitments expressed here are proposals, and should not be relied upon at this time. This conversation started as a Google Doc.
The basic release form is YY.MINOR.PATCH, so the first minor release of 2020 is 20.0.0, and the first minor release of 2021 will be 21.0.0, whatever the final minor release of 2020 is. A series of releases share a YY.MINOR. prefix, which we refer to as the YY.MINOR.x series. For example, the 20.0.x series contains version 20.0.0, 20.0.1, and any other releases needed.
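As an illustration of the scheme, the following minimal Python sketch splits a version string into its YY.MINOR.PATCH components and derives the YY.MINOR.x series name and the name of the maintenance branch (maint/<YY>.<MINOR>.x) used for bug-fix releases. The names CalVer and parse_calver are hypothetical helpers introduced for this example and are not part of fMRIPrep. Treating a PATCH of 0 as the feature release of a series is an assumption based on the description in this document.

```python
from typing import NamedTuple


class CalVer(NamedTuple):
    """Hypothetical container for a YY.MINOR.PATCH version."""

    year: int
    minor: int
    patch: int

    @property
    def series(self) -> str:
        # All releases sharing the YY.MINOR. prefix belong to one series.
        return f'{self.year}.{self.minor}.x'

    @property
    def maint_branch(self) -> str:
        # Bug-fix releases target the maintenance branch of their series.
        return f'maint/{self.series}'

    @property
    def is_feature_release(self) -> bool:
        # Assumption: the first release of a series (PATCH == 0) is the
        # feature release; later PATCH values are bug-fix releases.
        return self.patch == 0


def parse_calver(version: str) -> CalVer:
    """Parse a plain YY.MINOR.PATCH string (no pre-release suffixes)."""
    parts = version.split('.')
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f'not a YY.MINOR.PATCH version: {version!r}')
    return CalVer(*(int(p) for p in parts))
```

For instance, parse_calver('20.0.1').series evaluates to '20.0.x'. Because named tuples compare element by element, any 21.x release sorts after every 20.x release, whatever the final minor release of 2020 is.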
Minor releases are considered feature releases. Because there is no concept of a \"major\" release (just a calendar year rollover), most changes to the code base will result in a new feature release. Changes targeting a new feature release should target the master branch. Feature releases may be released as often as is deemed appropriate.
Patch releases are considered bug-fix releases. Each minor release triggers the creation of a new maint/<YY>.<MINOR>.x branch, and changes targeting a bug-fix release should target this branch. A \"minor release series\" is the initial feature release and the bug-fix releases that share the minor release prefix. Bug-fix releases may be released on minimal notice to other developers.
These releases must satisfy four conditions:
Resolving one or more bugs. These mostly include failures of fMRIPrep to complete or producing invalid derivatives (e.g., a NIfTI file of all zeroes).
Derivatives compatibility. If a subject may be successfully run on 20.0.n, then the imaging derivatives should be identical if rerun with 20.0.(n+1), modulo rounding errors and the effects of nondeterministic algorithms. The changes between successful runs of 20.0.n and 20.0.(n+1) should not be larger than the changes between two successful runs of 20.0.n. Cosmetic changes to reports are acceptable, while differing fields of view or data types in a NIfTI file would not be.
API compatibility. Workflow-generating functions, workflow inputnode and outputnode fields must not change. As an end-user application, this may seem overly strict, but the odds of introducing a bug are much higher in these cases.
User interface compatibility. Substantial changes to the fMRIPrep command line must not happen (e.g., the addition of a new, relevant flag).
Note that not all bugs can be fixed in a way that satisfies all four of these criteria without significant effort. A developer may determine that the bug will be fixed in the next feature release.
Additional acceptable changes within a minor release series:
Improved tests. These often come along with bug fixes, but they can be free-standing improvements to the code base.
Improved documentation. Unless the documentation is of a feature that will not be present in a bug-fix release, this is always welcome.
Updates to the Dockerfile that improve operation for Docker and/or Singularity users, but do not risk behavior change. A good example is including more templates to reduce the need for network requests. An example of an update to the Dockerfile that forces a minor release increment is a change in the pinned version of any of the dependencies or the base container image.
Improvements to the lightweight wrappers. As long as a command-line invocation that worked for the previous version continues to work and produce the same Docker command, there's little chance of harm.
It is expected that maint/20.0.x will diverge from master, as new features will be merged into master, and bug-fixes into maint/20.0.x. At a minimum, each new bug-fix release should be merged into master. After a 20.0.1 release:
fMRIPrep has a number of dependencies that we control at this point:
sMRIPrep
SDCflows
NiWorkflows
These do not follow the same versioning scheme as above, but we need them to follow a compatible scheme. In particular, we need to be able to fix bugs that are situated within these dependencies in a bug-fix release without violating the criteria laid out above. At the time of an fMRIPrep feature release, all of the above tools need to also split out a maintenance branch (if they have not already) for the minor version series that fMRIPrep depends on. As an example, when 20.0.0 was released, fMRIPrep had the following dependencies in setup.cfg:
~= is the compatible release specifier described in PEP 440. ~= 1.1.7 is equivalent to >= 1.1.7, == 1.1.*. This means that the current version of fMRIPrep is expected to work with niworkflows 1.1.7+ but not 1.2+. Thus, niworkflows needs to have a maint/1.1.x branch, sdcflows a maint/1.2.x and smriprep maint/0.5.x. Any changes to these tools that might violate API or derivative compatibility must go into master, and must not be released into the current minor series of these tools. Note that fMRIPrep 20.0.0 does not depend on niworkflows ~= 1.1.0. Multiple feature releases of fMRIPrep may depend on the same minor release series of a dependency. There is no requirement to hike the dependency. However, if a dependency has started a new minor release series, a feature release of fMRIPrep is a good opportunity to bump the dependency.
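To make the equivalence concrete, here is a minimal sketch of the compatible release check written with the Python standard library only. The function is_compatible is a hypothetical helper for this example. It handles plain numeric release segments only (no pre-releases, epochs or local versions), so it is a simplification of PEP 440 rather than a full implementation.

```python
from typing import Tuple


def _release(version: str) -> Tuple[int, ...]:
    # Split a plain numeric version such as '1.1.7' into (1, 1, 7).
    return tuple(int(part) for part in version.strip().split('.'))


def is_compatible(candidate: str, minimum: str) -> bool:
    """Check whether candidate satisfies the specifier ~= minimum.

    ~= X.Y.Z is treated as >= X.Y.Z combined with == X.Y.*
    """
    base = _release(minimum)
    if len(base) < 2:
        raise ValueError('~= needs at least two release segments')
    cand = _release(candidate)
    # Pad the candidate with zeros so that 1.2 compares equal to 1.2.0.
    cand += (0,) * (len(base) - len(cand))
    prefix = base[:-1]  # drop the last segment to obtain the X.Y.* part
    return cand >= base and cand[:len(prefix)] == prefix
```

With the niworkflows pin quoted above, is_compatible('1.1.10', '1.1.7') holds, whereas both is_compatible('1.2.0', '1.1.7') and is_compatible('1.1.6', '1.1.7') do not.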
We maintain a Versions Matrix to document and keep track of these dependencies.
A minor release series will continue to accept qualifying bug fixes at least until the next minor release. A minimum duration may be considered, or a fixed number of minor release series might be simultaneously supported.
An unmaintained series is a valid target for bug fixes after the support window, but the expected effort level of the contributor and maintainers will be higher and lower, respectively.
"},{"location":"devs/releases/#long-term-support-series","title":"Long-term support series","text":"
A long-term support (LTS) series is a minor release series that an LTS manager commits to maintaining for a specific duration, no less than one year. LTS series are under the same constraints as a minor release series in terms of what changes can be accepted.
The fMRIPrep developers commit to maintaining one LTS series at all times, at intervals of approximately one year. Community members may volunteer to assume maintainership after the initial period, or to maintain another minor release series as LTS.
Support windows of greater than a year have a much higher potential to run into issues with upstream dependencies going outside of their support windows. As much as possible, an fMRIPrep minor release should seek to move to the versions of upstream dependencies that will ensure the longest support before being considered for LTS.
Additional tasks required of an LTS manager:
Tracking possible breaking changes and broken URLs in upstream projects outside of the nipreps ecosystem.
Fixing bugs identified as existing within the LTS series, when they can be fixed without breaking API or derivative compatibility.
As many dependencies as possible should be pinned to specific versions relevant to the environment they are installed in. Packages (Debian .deb files, conda packages, Python wheels) should be archived in case of a loss of the external packages.
sMRIPrep requires niworkflows and generally must depend on one minor series of niworkflows for the duration of an sMRIPrep minor series. Each sMRIPrep series may also be depended on for an fMRIPrep series and/or a dMRIPrep series. Noting these dependencies here should make it easier to track when a new minor series needs to be created.
"},{"location":"intro/nipreps/","title":"Framework","text":""},{"location":"intro/nipreps/#building-on-fmripreps-success-story","title":"Building on fMRIPrep's success story","text":"
The current neuroimaging workflow has matured into a large chain of processing and analysis steps involving a large number of experts, across imaging modalities and applications. The development and fast adoption of fMRIPrep have revealed that neuroscientists need tools that simplify their research workflow, provide visual reports and checkpoints, and engender trust in the tool itself. The NiPreps framework extends fMRIPrep's approach and principles to new imaging modalities. The vision for NiPreps is to provide end-users (i.e., researchers) with applications that allow them to perform quality control smoothly and to prepare their data for modeling and statistical analysis.
NiPreps leverage the Brain Imaging Data Structure (BIDS) to understand all the particular features and available metadata (i.e., imaging parameters) of the input dataset. BIDS allows NiPreps to automatically stage the most adequate preprocessing workflow while minimizing manual intervention.
The NiPreps framework (Figure 1) encompasses a wide array of software projects organized into three layers of scientific software:
Software infrastructure: including quite mature projects such as NiPype and NiBabel; the standard specifications of the Brain Imaging Data Structure (BIDS, and BIDS-Derivatives); and some other tools such as NiTransforms or TemplateFlow, under development. These tools deliver low-level interfaces (e.g., data access to images and spatial transforms) and utilities (see Figure 1).
Middleware: these are utilities that generalize their functionalities across the end-user tools. These utilities cover foundational processing methodologies (e.g., NiWorkflows and SDCflows), the crowdsourcing of metadata (e.g., MRIQC Web-API), and the support for deep learning models (MRIQC-nets).
End-user tools such as fMRIPrep: Some existing end-user tools include sMRIPrep (Structural MRI Preprocessing), which lies in between an end-user tool and middleware, as it is involved in higher-level tools such as fMRIPrep. Finally, quality control tools (e.g., MRIQC) are to be executed before any preprocessing happens.
NiRodents (GitHub): middleware adaptations for small animals imaging.
NiBabies (GitHub): middleware adaptations for infant imaging.
"},{"location":"intro/transparency/","title":"Transparency of workflows","text":"
NiPreps adopt fMRIPrep's foundations, and particularly resonate with the transparency principles. As discussed in (Esteban et al., 2019 -- preprint):
The rapid increase in the volume and diversity of data, as well as the evolution of available techniques for processing and analysis, presents an opportunity for considerable advancement of research in neuroscience. The drawback resides in the need for progressively more complex analysis workflows that rely on decreasingly interpretable models of the data. Such context encourages \u2018black-box\u2019 solutions that efficiently perform a valuable service but do not provide insights into how the tool has transformed the data into the expected outputs. Black boxes obscure important steps in the inductive process mediating between experimental measurements and reported findings. This way of moving forward risks producing a future generation of cognitive neuroscientists who have become experts in sophisticated computational methods but have little to no working knowledge of how their data were transformed through processing. Transparency is often identified as a remedy for these problems. fMRIPrep ascribes to \u2018glass-box\u2019 principles, which are defined in opposition to the many different facets or levels at which black-box solutions are opaque. The visual reports that fMRIPrep generates are a crucial aspect of the glass-box approach. Their quality control checkpoints represent the logical flow of preprocessing, allowing scientists to critically inspect and better understand the underlying mechanisms of the workflow. A second transparency element is the citation boilerplate that formalizes all details of the workflow and provides the versions of all involved tools along with references to the corresponding scientific literature. A third asset for transparency is thorough documentation that delivers additional details on each of the building blocks represented in the visual reports and described in the boilerplate. 
Further, fMRIPrep has been open-source since its inception: users have access to all of the incremental additions to the tool through the history of the version-control system. The use of GitHub grants access to the discussions held during development, allowing one to see how and why the main design decisions were made. The modular design of fMRIPrep enhances its flexibility and improves transparency, as the main features of the software are more easily accessible to potential collaborators. In combination with some coding style and contribution guidelines, this modularity has enabled multiple contributions by peers and the creation of a rapidly growing community that would be difficult to nurture behind closed doors.
One foundational component of the NiPreps framework is the Visual Report System. End-user applications such as fMRIPrep or dMRIPrep generate individual reports after their preprocessing. Those visual reports have two fundamental purposes:
assessing the quality of the generated outputs, permitting the user to take quality control actions to eliminate biases originating from inadequate processing; and
understanding the workflow: by sequentially presenting the main steps of processing, the reports let the user see why the tool in particular took these steps and, more generally, why standard preprocessing involves each step.
NiPreps leverage the wealth of existing neuroimaging software that is available to researchers. To give back for standing on the shoulders of giants, NiPreps aim at the most thorough reporting possible, crediting all the pieces of prior knowledge they leverage. When some particular NiPreps are executed, the application runs introspection code that formalizes the computational graph the workflow executed and iterates over all of its nodes to extract the relevant articles and communications that should be cited, as well as all the software tools involved and their versions. Similarly, ancillary materials such as neuroimaging templates and atlases are reported and cited.
All these references and citations are finally collated in a natural language description of the workflow. This description is therefore generated automatically, and contains all the details that are necessary to replicate the processing, as well as the abovementioned references. The text is appended to the visual report, and provided in three formats (markdown, latex and html/plain-text) with an index of citations, so that the user is only required to \"copy-and-paste\" into the Methods section of their papers.
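The idea of walking the workflow graph to collect tools and citations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Node class, its fields, and build_boilerplate are hypothetical names invented for this example, and they do not reproduce the introspection code that NiPreps actually run. The tool names and references in the example are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical workflow node carrying its own provenance."""

    name: str
    software: str
    version: str
    references: List[str] = field(default_factory=list)


def build_boilerplate(nodes: List[Node]) -> List[str]:
    """Collate tools, versions and citations from all executed nodes."""
    tools: Dict[str, str] = {}
    citations: List[str] = []
    for node in nodes:
        # Record each software tool once, with the version that was run.
        tools.setdefault(node.software, node.version)
        # Keep citations unique while preserving first-seen order.
        for ref in node.references:
            if ref not in citations:
                citations.append(ref)
    tool_text = ', '.join(f'{name} {ver}' for name, ver in sorted(tools.items()))
    lines = [f'Software used: {tool_text}.']
    # Number the citations to form a simple index.
    lines += [f'[{i}] {ref}' for i, ref in enumerate(citations, start=1)]
    return lines


if __name__ == '__main__':
    # Placeholder nodes; names and references are not real.
    example = [
        Node('skull_strip', 'ToolA', '1.0', ['Reference A']),
        Node('register', 'ToolB', '2.3', ['Reference B', 'Reference A']),
    ]
    for line in build_boilerplate(example):
        print(line)
```

Deduplicating while preserving order keeps the citation index stable across runs of the same workflow, which matters when the text is meant to be copied verbatim.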
Note for reviewers and editors
The boilerplate text generated by some NiPreps is intended to allow for clear, consistent description of the preprocessing steps used, in order to improve the reproducibility of studies. We fully intend for it to be copied verbatim, and have released it under the CC0 license, dedicating it to the public domain in jurisdictions that recognize the concept, and assert that we will take no action to enforce copyright in jurisdictions where we cannot disclaim it.
We firmly believe that requiring authors to modify this passage will serve no legitimate scientific or literary purpose and can, in fact, serve only to reduce the replicability of the analysis being described by making the preprocessing steps less clear.
We recognize that there may be automated plagiarism detection software that will flag the boilerplate text. We would be happy to discuss potential solutions for annotating boilerplate sections of documents to indicate automatic generation, and can update our software to make this annotation simpler for authors.
"},{"location":"news/","title":"News and Announcements","text":""},{"location":"news/#register-for-the-nipreps-hackathon-with-the-ohbm23-brainhack","title":"Register for the NiPreps hackathon with the OHBM'23 Brainhack!","text":"
We are thrilled to announce that the NiPreps Hackathon's second edition will be part of the upcoming OHBM'23 Brainhack (July 19-21, Maison Notman House, Montreal, Canada).
Registration: To join us for this incredible event and work on NiPreps-related projects, please fill in our registration form.
Please remember to also register on the official webpage of the OHBM Brainhack. You will find all the necessary information, event schedule, and location details on Brainhack's website.
Approach and projects: We will advance (online) some projects as much as possible before the BrainHack. We are putting together a list of potential projects at https://github.com/orgs/nipreps/projects/8. Please feel free to let us know your ideas and voice your questions. Projects can start at any moment (even at the venue in Montreal) to have the flexibility to accommodate all ideas.
Those projects with preliminary work will have project leaders who will organize meetings, coordinate a roadmap and help carry out the necessary tasks.
See you in Montreal!
"},{"location":"news/#nipreps-roundups-feb-22-2023","title":"NiPreps Roundups Feb 22, 2023","text":"
We resumed the bi-monthly NiPreps Roundups with a first meeting on February 22, 2023.