diff --git a/LICENSE b/LICENSE index 00e7481a..34648045 100644 --- a/LICENSE +++ b/LICENSE @@ -1,4 +1,4 @@ -Copyright [2014-2019] [Heudiconv developers] +Copyright [2014-2024] [HeuDiConv developers] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. @@ -11,3 +11,10 @@ Copyright [2014-2019] [Heudiconv developers] WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. + + +Some parts of the codebase/documentation are borrowed from other sources: + +- HeuDiConv tutorial from https://bitbucket.org/dpat/neuroimaging_core_docs/src + + Copyright 2023 Dianne Patterson diff --git a/README.rst b/README.rst index 7cb21498..a177af8f 100644 --- a/README.rst +++ b/README.rst @@ -115,3 +115,17 @@ Docker image preparation being found in ``.github/workflows/release.yml``. --------------------- - https://github.com/courtois-neuromod/ds_prep/blob/main/mri/convert/heuristics_unf.py + + +Support +------- + +All bugs, concerns and enhancement requests for this software can be submitted here: +https://github.com/nipy/heudiconv/issues. + +If you have a problem or would like to ask a question about how to use ``heudiconv``, +please submit a question to `NeuroStars.org `_ with a ``heudiconv`` tag. +NeuroStars.org is a platform similar to StackOverflow but dedicated to neuroinformatics. + +All previous ``heudiconv`` questions are available here: +http://neurostars.org/tags/heudiconv/ diff --git a/docs/commandline.rst b/docs/commandline.rst new file mode 100644 index 00000000..c44ff7b9 --- /dev/null +++ b/docs/commandline.rst @@ -0,0 +1,12 @@ +============= +CLI Reference +============= + +``heudiconv`` processes DICOM files and converts the output into user defined +paths. + +.. 
argparse::
+   :ref: heudiconv.cli.run.get_parser
+   :prog: heudiconv
+   :nodefault:
+   :nodefaultconst:
diff --git a/docs/container.rst b/docs/container.rst
new file mode 100644
index 00000000..8ad96729
--- /dev/null
+++ b/docs/container.rst
@@ -0,0 +1,49 @@
+==============================
+Using heudiconv in a Container
+==============================
+
+If heudiconv is :ref:`installed via a Docker container `, you
+can run the commands in the following format::
+
+    docker run nipy/heudiconv:latest [heudiconv options]
+
+So a user running via container would check the version with this command::
+
+    docker run nipy/heudiconv:latest --version
+
+Which is equivalent to the locally installed command::
+
+    heudiconv --version
+
+Bind mount
+----------
+
+Typically, users of heudiconv will be operating on data that is on their local machine. We can give heudiconv access to that data via a *bind mount*, using the ``-v`` option.
+
+One common pattern is to share the working directory with ``-v $PWD:$PWD``, so heudiconv will behave as though it is installed on your system. However, you should be aware of how permissions work depending on your container toolset.
+
+
+Docker Permissions
+******************
+
+When you run a container with Docker without specifying a user, it will run as root.
+This isn't ideal if you are operating on data owned by your local user, so for Docker it is recommended to specify that the container should run as your user::
+
+    docker run --user=$(id -u):$(id -g) -e "UID=$(id -u)" -e "GID=$(id -g)" --rm -t -v $PWD:$PWD nipy/heudiconv:latest --version
+
+Podman Permissions
+******************
+
+When running Podman without specifying a user, the container runs as root inside the container, but as your user outside of the container.
+This default behavior usually works for heudiconv users::
+
+    podman run -v $PWD:$PWD nipy/heudiconv:latest --version
+
+Other Common Options
+--------------------
+
+We typically recommend that users pass the following flags to Docker and Podman:
+
+* ``-it`` Interactive terminal
+* ``--rm`` Remove the container (and any changes made inside it) when it exits
+
diff --git a/docs/custom-heuristic.rst b/docs/custom-heuristic.rst
new file mode 100644
index 00000000..bfef79db
--- /dev/null
+++ b/docs/custom-heuristic.rst
@@ -0,0 +1,319 @@
+=========================
+Custom Heuristics
+=========================
+
+This tutorial is based on `Dianne Patterson's University of Arizona tutorials `_
+
+
+In this tutorial we go more in depth, creating our own *heuristic.py* and modifying it for our needs:
+
+1. :ref:`Step1 ` Generate a heuristic (translation) file skeleton and some associated descriptor text files.
+2. :ref:`Step2 ` Modify the *heuristic.py* to specify BIDS output names and directories, and the input DICOM characteristics.
+3. :ref:`Step3 ` Call HeuDiConv to run on more subjects and sessions.
+
+**Prerequisites**:
+
+1. Ensure :ref:`heudiconv and dcm2niix ` are installed.
+2. :ref:`Prepare the dataset ` used in the quickstart.
+
+.. _heudiconv_step1:
+
+Step 1: Generate Skeleton
+*************************
+
+.. note:: Step 1 only needs to be completed once for each project.
+   If repeating this step, ensure that the .heudiconv directory is removed.
+
+From the *MRIS* directory, run the following command to process the ``dcm`` files that you downloaded and unzipped for this tutorial::
+
+    heudiconv --files dicom/219/*/*/*.dcm -o Nifti/ -f convertall -s 219 -c none
+
+* ``--files dicom/{subject}/*/*/*.dcm`` identifies the path to the DICOM files and specifies that they have the extension ``.dcm`` in this case.
+* ``-o Nifti/`` specifies the output directory *Nifti*. If the output directory does not exist, it will be created.
* ``-f convertall`` creates a *heuristic.py* template from an existing heuristic module. There are `other heuristic modules `_, but *convertall* is a good default.
+* ``-s 219`` specifies the subject number.
+* ``-c none`` indicates you are not actually doing any conversion right now.
+
+You will now have a heudiconv skeleton in the ``.heudiconv`` directory inside the output directory, in our case ``Nifti/.heudiconv``.
+
+The ``.heudiconv`` hidden directory
+===================================
+
+Take a look at *MRIS/Nifti/.heudiconv/219/info/*: heudiconv has produced two files of interest, a skeleton *heuristic.py* and a *dicominfo.tsv* file.
+The generated heuristic file template contains comments explaining usage.
+
+.. warning::
+   * **The Good** Every time you run conversion to create the BIDS NIfTI files and directories, a detailed record of what you did is recorded in the *.heudiconv* directory. This includes a copy of the *heuristic.py* module that you ran for each subject and session. Keep in mind that the hidden *.heudiconv* directory gets updated every time you run heudiconv. Together your *code* and *.heudiconv* directories provide valuable provenance information that should remain with your data.
+   * **The Bad** If you rerun *heuristic.py* for some subject and session that has already been run, heudiconv quietly uses the conversion routines it stored in *.heudiconv*. This can be really annoying if you are troubleshooting *heuristic.py*.
+   * **More Good** You can remove subject and session information from *.heudiconv* and run it fresh. In fact, you can entirely remove the *.heudiconv* directory and still run the *heuristic.py* you put in the *code* directory.
+
+
+.. _heudiconv_step2:
+
+Step 2: Modify Heuristic
+************************
+
+.. TODO Let's remove heuristic1 and heuristic2 and create a 2nd example
+   dataset? or branch?
+
+We will modify the generated *heuristic.py* so heudiconv will arrange the output in a BIDS directory structure.
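Before modifying the individual sections, it can help to see the overall shape a finished heuristic file takes. The sketch below is minimal and hypothetical (the single ``t1w`` key and the ``mprage`` criterion are illustrative only, not what *convertall* generates); the ``create_key`` helper mirrors the one found in the generated skeleton, and the three commented sections correspond to Sections 1, 1b, and 2 of the exposition in this tutorial.

```python
# Minimal sketch of a heuristic file, assuming a hypothetical dataset
# with one MPRAGE series. Key names and criteria are illustrative only.


def create_key(template, outtype=("nii.gz",), annotation_classes=None):
    # Helper mirroring the one in the generated heuristic skeleton:
    # packages an output template into a hashable dictionary key.
    if template is None or not template:
        raise ValueError("Template must be a valid format string")
    return (template, outtype, annotation_classes)


def infotodict(seqinfo):
    """Map each DICOM series in ``seqinfo`` to a BIDS output template."""
    # Section 1: one key (output path/name template) per image type
    t1w = create_key("sub-{subject}/{session}/anat/sub-{subject}_{session}_T1w")

    # Section 1b: the info dictionary, one empty list per key
    info = {t1w: []}

    # Section 2: criteria matching DICOM series to keys
    for s in seqinfo:
        if "mprage" in s.protocol_name:
            info[t1w].append(s.series_id)
    return info
```

heudiconv calls ``infotodict`` with a list of per-series records (each with attributes such as ``protocol_name``, ``dim3``, and ``series_id``, mirroring the columns of *dicominfo.tsv*) and uses the returned dictionary to name and organize the converted files.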
+
+It is okay to rename this file, or to have several versions with different names; just be sure to pass the intended filename with ``-f``. See the :doc:`heuristics` docs for more info.
+
+* I provide three section labels (1, 1b and 2) to facilitate exposition here. Each of these sections should be manually modified by you for your project.
+
+Section 1
+=========
+
+* This *heuristic.py* does not import all sequences in the example *Dicom* directory. This is a feature of heudiconv: you do not need to import scouts, motion corrected images or other DICOMs of no interest.
+* You may wish to add, modify or remove keys from this section for your own data::
+
+    # Section 1: These key definitions should be revised by the user
+    ###################################################################
+    # For each sequence, define a key variable (e.g., t1w, dwi etc) and a template using the create_key function:
+    # key = create_key(output_directory_path_and_name).
+
+    ###### TIPS #######
+    # If there are sessions, then the session must be a subfolder name.
+    # Do not prepend the ses key to the session! It will be prepended automatically for the subfolder and the filename.
+    # The final value in the filename should be the modality. It does not have a key, just a value.
+    # Otherwise, there is a key for every value.
+    # Filenames always start with subject, optionally followed by session, and end with modality.
+
+    ###### Definitions #######
+    # The "data" key creates sequential numbers which can be used for naming sequences.
+    # This is especially valuable if you run the same sequence multiple times at the scanner.
+    data = create_key('run-{item:03d}')
+
+    t1w = create_key('sub-{subject}/{session}/anat/sub-{subject}_{session}_T1w')
+
+    dwi = create_key('sub-{subject}/{session}/dwi/sub-{subject}_{session}_dir-AP_dwi')
+
+    # Save the RPE (reverse phase-encode) B0 image as a fieldmap (fmap). It will be used to correct
+    # the distortion in the DWI.
+    fmap_rev_phase = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_dir-PA_epi')
+
+    fmap_mag = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_magnitude')
+
+    fmap_phase = create_key('sub-{subject}/{session}/fmap/sub-{subject}_{session}_phasediff')
+
+    # Even if this is resting state, you still need a task key
+    func_rest = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest_run-01_bold')
+    func_rest_post = create_key('sub-{subject}/{session}/func/sub-{subject}_{session}_task-rest_run-02_bold')
+
+* **Key**
+
+  * Define a short informative key variable name for each image sequence you wish to export. Note that you can use any key names you want (e.g. *foo* would work as well as *fmap_phase*), but you need to be consistent.
+  * The ``key`` name is to the left of the ``=`` for each row in the above example.
+* **Template**
+
+  * Use the variable ``{subject}`` to make the code general purpose, so you can apply it to different subjects in Step 3.
+  * Use the variable ``{session}`` to make the code general purpose only if you have multiple sessions for each subject.
+
+    * Once you use the variable ``{session}``:
+
+      * ensure that the session gets added to the **output path**, e.g., ``sub-{subject}/{session}/anat/`` AND
+      * the session gets added to the **output filename**: ``sub-{subject}_{session}_T1w`` for every image in the session.
+      * Otherwise you will get `bids validator errors `_.
+
+  * Define the output directories and file names according to the `BIDS specification `_.
+  * Note the output names for the fieldmap images (e.g., *sub-219_ses-itbs_dir-PA_epi.nii.gz*, *sub-219_ses-itbs_magnitude1.nii.gz*, *sub-219_ses-itbs_magnitude2.nii.gz*, *sub-219_ses-itbs_phasediff.nii.gz*).
+  * The reverse phase-encode DWI image (e.g., *sub-219_ses-itbs_dir-PA_epi.nii.gz*) is grouped with the fieldmaps because it is used to correct other images.
+  * Data that is not yet defined in the BIDS specification will cause the bids-validator to produce an error unless you include it in a `.bidsignore `_ file.
+
+* **data**
+
+  * a key definition that creates sequential numbering
+  * ``03d`` means *create three slots for digits* ``3d``, *and pad with zeros* ``0``.
+  * This is useful if you have a scanner sequence with a single name but you run it repeatedly and need to generate separate files for each run. For example, you might define a single functional sequence at the scanner and then run it several times instead of creating separate names for each run.
+
+  .. Note:: It is usually better to name your sequences explicitly (e.g., run-01, run-02 etc.) rather than depending on sequential numbering. There will be less confusion later.
+
+  * If you have a sequence with the same name that you run repeatedly WITHOUT the sequential numbering, HeuDiConv will overwrite earlier sequences with later ones.
+  * To ensure that a sequence includes sequential numbering, you also need to add ``run-{item:03d}`` (for example) to the key-value specification for that sequence.
+  * Here I illustrate with the t1w key-value pair:
+
+    * If you started with:
+
+      * ``t1w = create_key('sub-{subject}/anat/sub-{subject}_T1w')``,
+    * You could add sequence numbering like this:
+
+      * ``t1w = create_key('sub-{subject}/anat/sub-{subject}_run-{item:03d}_T1w')``.
+    * Now if you export several T1w images for the same subject, using the exact same protocol, each will get a separate run number like this:
+
+      * *sub-219_run-001_T1w.nii.gz*, *sub-219_run-002_T1w.nii.gz* etc.
+
+Section 1b
+==========
+
+* Based on your chosen keys, create a data dictionary called *info*::
+
+    # Section 1b: This data dictionary (below) should be revised by the user.
+    ###########################################################################
+    # info is a Python dictionary containing the following keys from the infotodict defined above.
+ # This list should contain all and only the sequences you want to export from the dicom directory. + info = {t1w: [], dwi: [], fmap_rev_phase: [], fmap_mag: [], fmap_phase: [], func_rest: [], func_rest_post: []} + + # The following line does no harm, but it is not part of the dictionary. + last_run = len(seqinfo) + +* Enter each key in the dictionary in this format ``key: []``, for example, ``t1w: []``. +* Separate the entries with commas as illustrated above. + +Section 2 +=============== + +* Define the criteria for identifying each DICOM series that corresponds to one of the keys you want to export:: + + # Section 2: These criteria should be revised by the user. + ########################################################## + # Define test criteria to check that each DICOM sequence is correct + # seqinfo (s) refers to information in dicominfo.tsv. Consult that file for + # available criteria. + # Each sequence to export must have been defined in Section 1 and included in Section 1b. + # The following illustrates the use of multiple criteria: + for idx, s in enumerate(seqinfo): + # Dimension 3 must equal 176 and the string 'mprage' must appear somewhere in the protocol_name + if (s.dim3 == 176) and ('mprage' in s.protocol_name): + info[t1w].append(s.series_id) + + # Dimension 3 must equal 74 and dimension 4 must equal 32, and the string 'DTI' must appear somewhere in the protocol_name + if (s.dim3 == 74) and (s.dim4 == 32) and ('DTI' in s.protocol_name): + info[dwi].append(s.series_id) + + # The string 'verify_P-A' must appear somewhere in the protocol_name + if ('verify_P-A' in s.protocol_name): + info[fmap_rev_phase] = [s.series_id] + + # Dimension 3 must equal 64, and the string 'field_mapping' must appear somewhere in the protocol_name + if (s.dim3 == 64) and ('field_mapping' in s.protocol_name): + info[fmap_mag] = [s.series_id] + + # Dimension 3 must equal 32, and the string 'field_mapping' must appear somewhere in the protocol_name + if (s.dim3 == 32) and 
('field_mapping' in s.protocol_name):
+            info[fmap_phase] = [s.series_id]
+
+        # The protocol_name must be exactly 'restingstate' and the Boolean field is_motion_corrected must be False (i.e. not motion corrected)
+        # This ensures I do NOT get the motion corrected MOCO series instead of the raw series!
+        if ('restingstate' == s.protocol_name) and (not s.is_motion_corrected):
+            info[func_rest].append(s.series_id)
+
+        # The protocol_name must be exactly 'Post_TMS_restingstate' and the Boolean field is_motion_corrected must be False (i.e. not motion corrected)
+        # This ensures I do NOT get the motion corrected MOCO series instead of the raw series.
+        if ('Post_TMS_restingstate' == s.protocol_name) and (not s.is_motion_corrected):
+            info[func_rest_post].append(s.series_id)
+
+  * To define the criteria, look at *dicominfo.tsv* in *.heudiconv/info*. This file contains tab-separated values so you can easily view it in Excel or any similar spreadsheet program. *dicominfo.tsv* is not used programmatically to run heudiconv (i.e., you could delete it with no adverse consequences), but it is very useful for defining the test criteria for Section 2 of *heuristic.py*.
+  * Some values in *dicominfo.tsv* might be wrong. For example, my reverse phase-encode sequence with two acquisitions of 74 slices each is reported as one acquisition with 148 slices (2018_12_11). Hopefully they'll fix this. Despite the error in *dicominfo.tsv*, dcm2niix reconstructed the images correctly.
+  * You will be adding, removing or altering values in conditional statements based on the information you find in *dicominfo.tsv*.
+  * ``seqinfo`` (s) refers to the same information you can view in *dicominfo.tsv* (although seqinfo does not rely on *dicominfo.tsv*).
+  * Here are two types of criteria:
+
+    * ``s.dim3 == 176`` is an **equivalence** (e.g., good for checking dimensions for a numerical data type).
For our sample T1w image to be exported from DICOM, it must have 176 slices in the third dimension.
+    * ``'mprage' in s.protocol_name`` says the protocol name string must **include** the word *mprage* for the *T1w* image to be exported from DICOM. This criterion string is case-sensitive.
+
+  * ``info[t1w].append(s.series_id)`` Given that the criteria are satisfied, the series should be named and organized as described in *Section 1* and referenced by the info dictionary. The information about the processing steps is saved in the *.heudiconv* subdirectory.
+  * Here I have organized the criteria within each conditional statement in a consistent order. This is not necessary, though it does make the resulting code easier to read.
+
+
+.. _heudiconv_step3:
+
+Step 3: Run Conversion
+**********************
+
+* You have now done all the hard work for your project. When you want to add a subject or session, you only need to run this third step for that subject or session (a record of each run is kept in *.heudiconv* for you)::
+
+    heudiconv --files dicom/{subject}/*/*.dcm -o Nifti/ -f Nifti/code/heuristic.py -s 219 -ss itbs -c dcm2niix -b --minmeta --overwrite
+
+* The first time you run this step, several important text files are generated (e.g., CHANGES, dataset_description.json, participants.tsv, README etc.).
+  On subsequent runs, information may be added (e.g., *participants.tsv* will be updated).
+  Other files, like the *README* and *dataset_description.json*, should be updated manually.
+* This command is slightly different from the previous one you ran.
+
+  * ``-f Nifti/code/heuristic.py`` now tells HeuDiConv to use your revised *heuristic.py* in the *code* directory.
+  * In this case, we specify the subject we wish to process ``-s 219`` and the name of the session ``-ss itbs``.
+ * We could specify multiple subjects like this: ``-s 219 220 -ss itbs`` + * ``-c dcm2niix -b`` indicates that we want to use the dcm2niix converter with the -b flag (which creates BIDS). + * ``--minmeta`` ensures that only the minimum necessary amount of data gets added to the JSON file when created. On the off chance that there is a LOT of meta-information in the DICOM header, the JSON file will not get swamped by it. fmriprep and mriqc are very sensitive to this information overload and will crash, so *minmeta* provides a layer of protection against such corruption. + * ``--overwrite`` This is a peculiar option. Without it, I have found the second run of a sequence does not get generated. But with it, everything gets written again (even if it already exists). I don't know if this is my problem or the tool...but for now, I'm using ``--overwrite``. + * Step 3 should produce a tree like this:: + + Nifti + ├── CHANGES + ├── README + ├── code + │   ├── __pycache__ + │   │   └── heuristic1.cpython-36.pyc + │   ├── heuristic1.py + │   └── heuristic2.py + ├── dataset_description.json + ├── participants.json + ├── participants.tsv + ├── sub-219 + │   └── ses-itbs + │   ├── anat + │   │   ├── sub-219_ses-itbs_T1w.json + │   │   └── sub-219_ses-itbs_T1w.nii.gz + │   ├── dwi + │   │   ├── sub-219_ses-itbs_dir-AP_dwi.bval + │   │   ├── sub-219_ses-itbs_dir-AP_dwi.bvec + │   │   ├── sub-219_ses-itbs_dir-AP_dwi.json + │   │   └── sub-219_ses-itbs_dir-AP_dwi.nii.gz + │   ├── fmap + │   │   ├── sub-219_ses-itbs_dir-PA_epi.json + │   │   ├── sub-219_ses-itbs_dir-PA_epi.nii.gz + │   │   ├── sub-219_ses-itbs_magnitude1.json + │   │   ├── sub-219_ses-itbs_magnitude1.nii.gz + │   │   ├── sub-219_ses-itbs_magnitude2.json + │   │   ├── sub-219_ses-itbs_magnitude2.nii.gz + │   │   ├── sub-219_ses-itbs_phasediff.json + │   │   └── sub-219_ses-itbs_phasediff.nii.gz + │   ├── func + │   │   ├── sub-219_ses-itbs_task-rest_run-01_bold.json + │   │   ├── 
sub-219_ses-itbs_task-rest_run-01_bold.nii.gz + │   │   ├── sub-219_ses-itbs_task-rest_run-01_events.tsv + │   │   ├── sub-219_ses-itbs_task-rest_run-02_bold.json + │   │   ├── sub-219_ses-itbs_task-rest_run-02_bold.nii.gz + │   │   └── sub-219_ses-itbs_task-rest_run-02_events.tsv + │   ├── sub-219_ses-itbs_scans.json + │   └── sub-219_ses-itbs_scans.tsv + └── task-rest_bold.json + +TIPS +====== + +* **Name Directories as you wish**: You can name the project directory (e.g., **MRIS**) and the output directory (e.g., **Nifti**) as you wish (just don't put spaces in the names!). +* **Age and Sex Extraction**: Heudiconv will extract age and sex info from the DICOM header. If there is any reason to believe this information is wrong in the DICOM header (for example, it was made-up because no one knew how old the subject was, or it was considered a privacy concern), then you need to check the output. If you have Horos (or another DICOM editor), you can edit the values in the DICOM headers, otherwise you need to edit the values in the BIDS text file *participants.tsv*. +* **Separating Sessions**: If you have multiple sessions at the scanner, you should create an *Exam* folder for each session. This will help you to keep the data organized and *Exam* will be reported in the *study_description* in your *dicominfo.tsv*, so that you can use it as a criterion. +* **Don't manually combine DICOMS from different sessions**: If you combine multiple sessions in one subject DICOM folder, heudiconv will fail to run and will complain about ``conflicting study identifiers``. You can get around the problem by figuring out which DICOMs are from different sessions and separating them so you deal with one set at a time. This may mean you have to manually edit the BIDS output. + + * Why might you manually combine sessions you ask? Because you never intended to have multiple sessions, but the subject had to complete some scans the next day. Or, because the scanner had to be rebooted. 
+* **Don't assume all your subjects' DICOMs have the same names or that the sequences were always run in the same order**: If you develop a *heuristic.py* on one subject, try it and carefully evaluate the results on your other subjects. This is especially true if you already collected the data before you started thinking about automating the output. Every time you run HeuDiConv with *heuristic.py*, a new *dicominfo.tsv* file is generated. Inspect this for differences in protocol names and series descriptions etc.
+* **Decompressing DICOMs**: Decompress your data; heudiconv does not yet support compressed DICOM conversion (see https://github.com/nipy/heudiconv/issues/287).
+* **Create unique DICOM protocol names at the scanner**: If you have the opportunity to influence the DICOM naming strategies, then try to ensure that there is a unique protocol name for every run. For example, if you repeat the fmri protocol three times, name the first one fmri_1, the next fmri_2, and the last fmri_3 (or any variation on this theme). This will make it much easier to uniquely specify the sequences when you convert and reduce your chance of errors.
+
+
+Exploring Criteria
+******************
+
+*dicominfo.tsv* contains a human-readable version of seqinfo. Each column of data can be used as criteria for identifying the correct DICOM image. We have already provided examples of using string types, numbers, and Booleans (True-False). Tuples (immutable lists) are also available and examples of using these are provided below. To ensure that you are extracting the images you want, you need to be very careful about creating your initial *heuristic.py*.
+
+Why Experiment?
+===============
+
+* Criteria can be tricky. Ensure the NIfTI files you create are the correct ones (for example, not the derived or motion corrected if you didn't want that).
In addition to looking at the images created (which tells you whether you have a fieldmap or T1w etc.), you should look at the dimensions of the image. Not only the dimensions, but also the range of intensity values and the size of the image on disk should match between a direct dcm2niix conversion and the output heudiconv produces with *heuristic.py*.
+* For really tricky cases, download and install dcm2niix on your local machine and run it for a sequence of concern (in my experience, it is usually fieldmaps that go wrong).
+* Although Python does not require you to use parentheses while defining criteria, parentheses are a good idea. Parentheses will help ensure that complex criteria involving multiple logical operators ``and, or, not`` make sense and behave as expected.
+
+Tuples
+------
+
+Suppose you want to use the values in the field ``image_type``. It is not a number or string or Boolean. To discover the data type of a column, you can add a statement like ``print(type(s.image_type))`` to the for loop in Section 2 of *heuristic.py*. Then run *heuristic.py* (preferably without any actual conversions) and you should see an output like ``<class 'tuple'>``. Here is an example of using a value from ``image_type`` as a criterion::
+
+    if ('ASL_3D_tra_iso' == s.protocol_name) and ('TTEST' in s.image_type):
+        info[asl_der].append(s.series_id)
+
+Note that this differs from testing for a string because you cannot test for a substring (e.g., 'TEST' would not work): string tests will not work on a tuple datatype.
+
+.. Note:: *image_type* is described in the `DICOM specification `_
+
diff --git a/docs/heuristics.rst b/docs/heuristics.rst
index b5de52df..b2f13fc5 100644
--- a/docs/heuristics.rst
+++ b/docs/heuristics.rst
@@ -1,6 +1,6 @@
-=========
-Heuristic
-=========
+===============
+Heuristics File
+===============
 
 The heuristic file controls how information about the DICOMs is used to convert
 to a file system layout (e.g., BIDS).
``heudiconv`` includes some built-in
@@ -12,6 +12,14 @@
 covered by the existing heuristics. This
 section will outline what makes up a heuristic file, and some useful
 functions available when making one.
 
+Provided Heuristics
+-------------------
+
+Running ``heudiconv`` without a heuristic file results in the generation of a skeleton for the user to customize to their needs.
+
+``heudiconv`` also provides more than 10 additional heuristics, which can be seen `here `_.
+These heuristic files are documented in their code comments.
+
 Components
 ==========
diff --git a/docs/index.rst b/docs/index.rst
index 5565625b..f17be802 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -14,7 +14,9 @@ Contents
 
    installation
    changes
-   usage
-   heuristics
    tutorials
+   heuristics
+   commandline
+   container
    api
+
diff --git a/docs/installation.rst b/docs/installation.rst
index 1b9599ee..03b80766 100644
--- a/docs/installation.rst
+++ b/docs/installation.rst
@@ -4,6 +4,7 @@ Installation
 ``Heudiconv`` is packaged and available from many different sources.
 
+.. _install_local:
 Local
 =====
@@ -21,23 +22,24 @@
 subsequently it would be able to download and install dcm2niix binary.
 On Debian-based systems, we recommend using `NeuroDebian `_,
 which provides the `heudiconv package `_.
 
+.. _install_container:
-Docker
-======
-If `Docker `_ is available on your system, you
-can visit `our page on Docker Hub `_
-to view available releases.
To pull the latest release, run::
+Containers
+==========
-
-    $ docker pull nipy/heudiconv:latest
+Our container image releases are available on `our Docker Hub `_.
-
-Note that when using HeuDiConv via ``docker run``, you might need to provide your user and group IDs so they map correspondingly
-within the container, i.e.::
+If `Docker `_ is available on your system, you can pull the latest release::
-
-    $ docker run --user=$(id -u):$(id -g) -e "UID=$(id -u)" -e "GID=$(id -g)" --rm -t -v $PWD:$PWD nipy/heudiconv:latest [OPTIONS TO FOLLOW]
+
+    $ docker pull nipy/heudiconv:latest
 
 Additionally, HeuDiConv is available through the Docker image at
 `repronim/reproin `_ provided by
 `ReproIn heuristic project `_, which develops the
 ``reproin`` heuristic.
+
+To maintain provenance, it is recommended that you use the ``latest`` tag only when testing out heudiconv.
+Otherwise, use an explicit version and record that information alongside the produced data.
+
+
 Singularity
 ===========
 If `Singularity `_ is available on your system,
diff --git a/docs/quickstart.rst b/docs/quickstart.rst
new file mode 100644
index 00000000..118c77dc
--- /dev/null
+++ b/docs/quickstart.rst
@@ -0,0 +1,92 @@
+Quickstart
+==========
+
+This tutorial is based on `Dianne Patterson's University of Arizona tutorials `_
+
+This guide assumes you have already :ref:`installed heudiconv and dcm2niix ` and
+demonstrates how to use the heudiconv tool with a provided `heuristic.py` to convert DICOMs into the BIDS data structure.
+
+.. _prepare_dataset:
+
+Prepare Dataset
+***************
+
+Download and unzip `sub-219_dicom.zip `_.
+
+We will be working from a directory called *MRIS*. Under the *MRIS* directory is the *dicom* subdirectory; under the subject number *219*, the session *itbs* is nested.
Each dicom sequence folder is nested under the session::
+
+    dicom
+    └── 219
+        └── itbs
+            ├── Bzero_verify_PA_17
+            ├── DTI_30_DIRs_AP_15
+            ├── Localizers_1
+            ├── MoCoSeries_19
+            ├── MoCoSeries_31
+            ├── Post_TMS_restingstate_30
+            ├── T1_mprage_1mm_13
+            ├── field_mapping_20
+            ├── field_mapping_21
+            └── restingstate_18
+    Nifti
+    └── code
+        └── heuristic1.py
+
+Basic Conversion
+****************
+
+Next we will use heudiconv to convert DICOMs into the BIDS data structure.
+The example dataset includes an example heuristic file, `heuristic1.py`.
+Typical use of heudiconv will require the creation and editing of your :doc:`heuristics file `, which we will cover
+in a :doc:`later tutorial `.
+
+.. note:: Heudiconv requires you to run the command from the parent
+   directory of both the Dicom and Nifti directories, which is `MRIS` in
+   our case.
+
+Run the following command::
+
+    heudiconv --files dicom/219/itbs/*/*.dcm -o Nifti -f Nifti/code/heuristic1.py -s 219 -ss itbs -c dcm2niix -b --minmeta --overwrite
+
+
+* We specify the dicom files to convert with `--files`
+* The heuristic file is provided with the `-f` option
+* We tell heudiconv to place our output in the Nifti dir with `-o`
+* `-s 219 -ss itbs` specify the subject and session labels used in the output names
+* `-c dcm2niix` selects the dcm2niix converter
+* `-b` indicates that we want to output in BIDS format
+* `--minmeta` guarantees that meta-information in the dcms does not get inserted into the JSON sidecar. This is good because the information is not needed but can overflow the JSON file, causing some BIDS apps to crash.
+
+Output
+******
+
+The *Nifti* directory will contain a BIDS-compliant subject directory::
+
+
+    └── sub-219
+        └── ses-itbs
+            ├── anat
+            ├── dwi
+            ├── fmap
+            └── func
+
+The following required BIDS text files are also created in the Nifti directory.
Details for filling in these skeleton text files can be found under `tabular files `_ in the BIDS specification::
+
+    CHANGES
+    README
+    dataset_description.json
+    participants.json
+    participants.tsv
+    task-rest_bold.json
+
+Validation
+**********
+
+Ensure that everything is according to spec by using the `bids validator `_.
+
+Click `Choose File` and then select the *Nifti* directory. There should be no errors (though there are a couple of warnings).
+
+.. Note:: Your files are not uploaded to the BIDS validator, so there are no privacy concerns!
+
+Next
+****
+
+In the following sections, you will modify *heuristic.py* yourself so you can test different options and understand how to work with your own data.
diff --git a/docs/reproin.rst b/docs/reproin.rst
new file mode 100644
index 00000000..871d8e4f
--- /dev/null
+++ b/docs/reproin.rst
@@ -0,0 +1,133 @@
+================
+Reproin
+================
+
+This tutorial is based on `Dianne Patterson's University of Arizona tutorials `_
+
+`Reproin `_ is a setup for
+automatic generation of sharable, version-controlled BIDS datasets from
+MR scanners.
+
+If you can control how your image sequences are named at the scanner, you can use the *reproin* naming convention.
+If you cannot control such naming, or have already collected data, you can provide your custom heuristic mapping into *reproin* and thus, in effect, use the reproin heuristic.
+That will be a topic for another tutorial, but meanwhile you can check out `reproin/issues/18 `_ for a brief HOWTO.
+
+Get Example Dataset
+-------------------
+
+This example uses a phantom dataset: `reproin_dicom.zip `_ generated by the University of Arizona on their Siemens Skyra 3T with Syngo MR VE11c software on 2018_02_08.
+
+The ``REPROIN`` directory is a simple reproin-compliant DICOM (.dcm) dataset without sessions.
+(Derived dwi images (ADC, FA etc.)
that the scanner produced have been removed.)::
+
+    [user@local ~/reproin_dicom/REPROIN]$ tree -I "*.dcm"
+
+    REPROIN
+    ├── data
+    └── dicom
+        └── 001
+            └── Patterson_Coben\ -\ 1
+                ├── Localizers_4
+                ├── anatT1w_acqMPRAGE_6
+                ├── dwi_dirAP_9
+                ├── fmap_acq4mm_7
+                ├── fmap_acq4mm_8
+                ├── fmap_dirPA_15
+                └── func_taskrest_16
+
+Convert and organize
+--------------------
+
+From the ``REPROIN`` directory::
+
+    heudiconv -f reproin --bids -o data --files dicom/001 --minmeta
+
+* ``-f reproin`` specifies the built-in *reproin* heuristic to use
+* ``-o data/`` specifies the output directory *data*. If the output directory does not exist, it will be created.
+* ``--files dicom/001`` identifies the path to the DICOM files.
+* ``--minmeta`` ensures that only the minimum necessary amount of metadata gets added to the JSON files when they are created. If there is a lot of meta-information in the DICOM header, the JSON file will not get swamped by it. Reportedly, fMRIPrep and MRIQC can be sensitive to an excess of metadata and may crash, so ``--minmeta`` provides a layer of protection against such problems.
+
+
+Output Directory Structure
+--------------------------
+
+Heudiconv's Reproin converter produces a hierarchy of directories with the BIDS dataset (here, ``Coben``) at the bottom::
+
+    data
+    └── Patterson
+        └── Coben
+            ├── sourcedata
+            │   └── sub-001
+            │       ├── anat
+            │       ├── dwi
+            │       ├── fmap
+            │       └── func
+            └── sub-001
+                ├── anat
+                ├── dwi
+                ├── fmap
+                └── func
+
+The specific value for the hierarchy can be specified to HeuDiConv via the ``--locator PATH`` option.
+If it is not specified, the ReproIn heuristic bases it on the value of the DICOM "Study Description" field, which is populated when the user selects a specific *Exam* card located within some *Region* (see `ReproIn Walkthrough "Organization" `_).
+
+* The dataset is nested under two levels in the output directory: *Region* (Patterson) and *Exam* (Coben). *Tree* is reserved for other purposes at the UA research scanner. 
+* Although the Program (*Patient*) level is not visible in the output hierarchy, it is important. If you have separate sessions, then each session should have its own Program name.
+* **sourcedata** contains tarred gzipped (``.tgz``) sets of DICOM images corresponding to the NIfTI images.
+* **sub-001/** contains a single subject's data within this BIDS dataset.
+* A hidden directory, *REPROIN/data/Patterson/Coben/.heudiconv*, is generated to contain derived mapping data, which could potentially be inspected or adjusted and used for re-conversion.
+
+
+
+Reproin Scanner File Names
+****************************
+
+* For both BIDS and *reproin*, names are composed of an ordered series of key-value pairs, called `entities <https://github.com/bids-standard/bids-specification/blob/master/src/schema/objects/entities.yaml>`_.
+  Each key and its value are joined with a dash ``-`` (e.g., ``acq-MPRAGE``, ``dir-AP``).
+  These key-value pairs are joined to other key-value pairs with underscores ``_``.
+  The exception is the modality label, which is discussed more below.
+* *Reproin* scanner sequence names are simplified relative to the final BIDS output and generally conform to this scheme (but consult the `reproin heuristics file `_ for additional options): ``sequence type-modality label`` _ ``session-session name`` _ ``task-task name`` _ ``acquisition-acquisition detail`` _ ``run-run number`` _ ``direction-direction label``::
+
+    func-bold_ses-pre_task-faces_acq-1mm_run-01_dir-AP
+
+* Each sequence name begins with the seqtype key. The seqtype key is the modality and corresponds to the name of the BIDS directory where the sequence belongs, e.g., ``anat``, ``dwi``, ``fmap`` or ``func``.
+* The seqtype key is optionally followed by a dash ``-`` and a modality label value (e.g., ``anat-scout`` or ``anat-T2W``). Often, the modality label is not needed because there is a predictable default for most seqtypes:
+* For **anat** the default modality is ``T1w``. 
Thus a sequence named ``anat`` will have the same output BIDS files as a sequence named ``anat-T1w``: *sub-001_T1w.nii.gz*.
+* For **fmap** the default modality is ``epi``. Thus ``fmap_dir-PA`` will have the same output as ``fmap-epi_dir-PA``: *sub-001_dir-PA_epi.nii.gz*.
+* For **func** the default modality is ``bold``. Thus, ``func-bold_task-rest`` will have the same output as ``func_task-rest``: *sub-001_task-rest_bold.nii.gz*.
+* *Reproin* gets the subject number from the DICOM metadata.
+* If you have multiple sessions, the session name does not need to be included in every sequence name in the program (i.e., the Program/*Patient* level mentioned above). Instead, the session can be added to a single sequence name, usually the scout (localizer) sequence, e.g., ``anat-scout_ses-pre``, and *reproin* will propagate the session information to the other sequence names in the *Program*. Interestingly, *reproin* does not add the localizer to your BIDS output.
+* When our scanner exports the DICOM sequences, all dashes are removed. But don't worry, *reproin* handles this just fine.
+* In the UA phantom reproin data, the subject was named ``01``. Horos reports the subject number as ``01`` but exports the DICOMs into a directory ``001``. If the data are copied to an external drive at the scanner, then the subject number is reported as ``001_001`` and the images are ``*.IMA`` instead of ``*.dcm``. *Reproin* does not care; it handles all of this gracefully. Your output tree (excluding *sourcedata* and *.heudiconv*) should look like this::
+
+    . 
+    |-- CHANGES
+    |-- README
+    |-- dataset_description.json
+    |-- participants.tsv
+    |-- sub-001
+    |   |-- anat
+    |   |   |-- sub-001_acq-MPRAGE_T1w.json
+    |   |   `-- sub-001_acq-MPRAGE_T1w.nii.gz
+    |   |-- dwi
+    |   |   |-- sub-001_dir-AP_dwi.bval
+    |   |   |-- sub-001_dir-AP_dwi.bvec
+    |   |   |-- sub-001_dir-AP_dwi.json
+    |   |   `-- sub-001_dir-AP_dwi.nii.gz
+    |   |-- fmap
+    |   |   |-- sub-001_acq-4mm_magnitude1.json
+    |   |   |-- sub-001_acq-4mm_magnitude1.nii.gz
+    |   |   |-- sub-001_acq-4mm_magnitude2.json
+    |   |   |-- sub-001_acq-4mm_magnitude2.nii.gz
+    |   |   |-- sub-001_acq-4mm_phasediff.json
+    |   |   |-- sub-001_acq-4mm_phasediff.nii.gz
+    |   |   |-- sub-001_dir-PA_epi.json
+    |   |   `-- sub-001_dir-PA_epi.nii.gz
+    |   |-- func
+    |   |   |-- sub-001_task-rest_bold.json
+    |   |   |-- sub-001_task-rest_bold.nii.gz
+    |   |   `-- sub-001_task-rest_events.tsv
+    |   `-- sub-001_scans.tsv
+    `-- task-rest_bold.json
+
+* Note that despite all the different subject names (e.g., ``01``, ``001`` and ``001_001``), the subject is labeled ``sub-001``.
diff --git a/docs/tutorials.rst b/docs/tutorials.rst
index ec12e36a..d8c3f923 100644
--- a/docs/tutorials.rst
+++ b/docs/tutorials.rst
@@ -1,6 +1,17 @@
-==============
-User Tutorials
-==============
+
+==================
+Tutorials
+==================
+
+.. toctree::
+
+   quickstart
+   custom-heuristic
+   reproin
+
+
+External Tutorials
+******************
 
 Luckily(?), we live in an era of plentiful information. Below are some links
 to other users' tutorials covering their experience with ``heudiconv``.
diff --git a/docs/usage.rst b/docs/usage.rst
deleted file mode 100644
index 8befdc99..00000000
--- a/docs/usage.rst
+++ /dev/null
@@ -1,107 +0,0 @@
-=====
-Usage
-=====
-
-``heudiconv`` processes DICOM files and converts the output into user defined
-paths.
-
-CommandLine Arguments
-======================
-
-.. 
argparse:: - :ref: heudiconv.cli.run.get_parser - :prog: heudiconv - :nodefault: - :nodefaultconst: - - -Support -======= - -All bugs, concerns and enhancement requests for this software can be submitted here: -https://github.com/nipy/heudiconv/issues. - -If you have a problem or would like to ask a question about how to use ``heudiconv``, -please submit a question to `NeuroStars.org `_ with a ``heudiconv`` tag. -NeuroStars.org is a platform similar to StackOverflow but dedicated to neuroinformatics. - -All previous ``heudiconv`` questions are available here: -http://neurostars.org/tags/heudiconv/ - - -Batch jobs -========== - -``heudiconv`` can natively handle multi-subject, multi-session conversions -although it will do these conversions in a linear manner, i.e. one subject and one session at a time. -To speed up these conversions, multiple ``heudiconv`` -processes can be spawned concurrently, each converting a different subject and/or -session. - -The following example uses SLURM and Singularity to submit every subjects' -DICOMs as an independent ``heudiconv`` execution. - -The first script aggregates the DICOM directories and submits them to -``run_heudiconv.sh`` with SLURM as a job array. - -If using bids, the ``notop`` bids option suppresses creation of -top-level files in the bids directory (e.g., -``dataset_description.json``) to avoid possible race conditions. -These files may be generated later with ``populate_templates.sh`` -below (except for ``participants.tsv``, which must be created -manually). - -.. 
code:: shell - - #!/bin/bash - - set -eu - - # where the DICOMs are located - DCMROOT=/dicom/storage/voice - # where we want to output the data - OUTPUT=/converted/data/voice - - # find all DICOM directories that start with "voice" - DCMDIRS=(`find ${DCMROOT} -maxdepth 1 -name voice* -type d`) - - # submit to another script as a job array on SLURM - sbatch --array=0-`expr ${#DCMDIRS[@]} - 1` run_heudiconv.sh ${OUTPUT} ${DCMDIRS[@]} - - -The second script processes a DICOM directory with ``heudiconv`` using the built-in -`reproin` heuristic. - -.. code:: shell - - #!/bin/bash - set -eu - - OUTDIR=${1} - # receive all directories, and index them per job array - DCMDIRS=(${@:2}) - DCMDIR=${DCMDIRS[${SLURM_ARRAY_TASK_ID}]} - echo Submitted directory: ${DCMDIR} - - IMG="/singularity-images/heudiconv-latest-dev.sif" - CMD="singularity run -B ${DCMDIR}:/dicoms:ro -B ${OUTDIR}:/output -e ${IMG} --files /dicoms/ -o /output -f reproin -c dcm2niix -b notop --minmeta -l ." - - printf "Command:\n${CMD}\n" - ${CMD} - echo "Successful process" - -This script creates the top-level bids files (e.g., -``dataset_description.json``) - -.. code:: shell - - #!/bin/bash - set -eu - - OUTDIR=${1} - IMG="/singularity-images/heudiconv-latest-dev.sif" - CMD="singularity run -B ${OUTDIR}:/output -e ${IMG} --files /output -f reproin --command populate-templates" - - printf "Command:\n${CMD}\n" - ${CMD} - echo "Successful process"