Commit

updates to the bulk uploading documentation
crisaless committed Nov 29, 2023
1 parent 5255ac5 commit d9525cc
Showing 8 changed files with 60 additions and 54 deletions.
10 changes: 3 additions & 7 deletions docs/_data/docs.yml
Original file line number Diff line number Diff line change
@@ -35,7 +35,7 @@
# url: sequencing-metabolomics-imc-data
# - sectionTitle: Uploading sequencing workflow files
# url: uploading-sequencing-workflow-files
- sectionTitle: Uploading files via DERIVA client tools
- sectionTitle: Bulk Uploading Files with DERIVA Client Tools
url: uploading-files-using-deriva-client-tools
- sectionTitle: Submitting Single Cell Visualization Files
url: single-cell-visualization-files
@@ -44,16 +44,12 @@
# Section 4
- title: Submitting Specimen Records (For imaging data - and metadata about bio-specimens)
docs:
# - sectionTitle: Submitting Specimen Records
# url: specimen
- sectionTitle: Submitting Specimen Records
url: specimen-v2
- sectionTitle: Bulk Uploading Image Files
url: bulk-uploading-image-files
- sectionTitle: Bulk uploading Specimen data from a file(DRAFT)
url: bulk-uploading-specimen-data-from-a-file
- sectionTitle: Possible Directory Structures For Uploading Image Data
url: possible-directory-structures-for-uploading-image-data
- sectionTitle: Adding Specimen Records From a File (DRAFT)
url: adding-specimen-records-from-a-file
- sectionTitle: Annotating images
url: annotating-images
- sectionTitle: Color palette for image annotation
2 changes: 1 addition & 1 deletion docs/_docs/index.md
@@ -35,7 +35,7 @@ For members of ATLAS-D2K (specifically members of GUDMAP and RBK) to submit data
- [Color Palette for Image Annotation](../color-palette-for-image-annotation)

**To upload your data files in bulk:**
- [Uploading Files via the DERIVA Client Tools (commmandline and GUI versions available)](../uploading-files-using-deriva-client-tools)
- [Bulk Uploading Files with DERIVA Client Tools (commmandline and GUI versions available)](../uploading-files-using-deriva-client-tools)

## Images and Videos
- [Thumbnail creation guideline](../thumbnail-creation-guideline)
@@ -1,13 +1,15 @@
---
title: Adding specimen records from a spreadsheet or CSV
permalink: /docs/bulk-uploading-specimen-data-from-a-file/
title: Adding Specimen Records From a File (Draft)
permalink: /docs/adding-specimen-records-from-a-file/
---

If you're not familiar with the process of submitting Specimen records, please see our [Submitting Specimen Data](/docs/specimens/) page (and/or the tutorial slides linked there) for an overview.
The easiest way to create Specimen records is through the web interface, as described in that wiki page and tutorial. However, if you have a large number of specimens to enter, you might find it more convenient to upload a CSV file instead. This is just a replacement for [step 2 of "Submitting Specimen Data"](/docs/specimens/#2-create-a-specimen-record); you'll still need to add anatomy and any relevant antibodies, probes, etc., as described on that page.
_This is draft content. Contact [email protected] to submit corrections or ask questions._

If you're not familiar with the process of submitting Specimen records, please see our [Submitting Specimen Data](../specimen-v2/) page (and/or the tutorial slides linked there) for an overview.
The easiest way to create Specimen records is through the web interface, as described in that wiki page and tutorial. However, if you have a large number of specimens to enter, you might find it more convenient to upload a CSV file instead. This is just a replacement for [step 2 of "Submitting Specimen Data"](../specimen-v2/#2-create-a-specimen-record); you'll still need to add anatomy and any relevant antibodies, probes, etc., as described on that page.

## Step 1: create the spreadsheet.
First, download [this template](/assets/files/Specimen_simple.xlsx).
First, download [this template]({{ "/assets/files/Specimen_simple.xlsx" | relative_url }}).

This spreadsheet has multiple tabs. The only one you need to fill out is the first one, labelled `Specimen_simple` (the other tabs provide values for drop-down menus that appear in the main tab).

@@ -26,10 +28,12 @@ Important: do not change any column headings or add or delete any columns. Fill
* Fixation - required for image data. Select a value from the drop-down menu
* Upload_Notes - optional. Enter any free-form notes here.

![CSV-to-specimen relationships]/specimen-imgs/uploader/csv_to_spec.jpeg)
![CSV-to-specimen relationships]({{ "/assets//specimen-imgs/uploader/csv_to_spec.jpeg" | relative_url }})

## Step 2: Create the CSV file, and put it where the DERIVA upload client can find it

## Step 2: Create the CSV file, and put it where the deriva upload client can find it.
After you've saved your spreadsheet as an xlsx file, save another copy as CSV (that's plain CSV, not UTF-8 CSV). In order for the upload tool to find it, you'll need to create:

* A folder called `deriva`.
* Within `deriva`, a folder called `records`
* Within `records`, a folder called `Gene_Expression`
@@ -40,13 +44,14 @@ After you've saved your spreadsheet as an xlsx file, save another copy as CSV (t
In this picture, both the CSV and Excel file are in the Gene_Expression folder; only the CSV file is necessary, but it's convenient to keep them together.
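The folder layout described in Step 2 can also be created with a short script. The following is a minimal sketch, not part of the DERIVA tools: the function name `stage_specimen_csv` and the CSV file name are hypothetical, and it only builds the levels visible in this excerpt (`deriva/records/Gene_Expression`). If the full page lists further requirements, adjust accordingly.

```python
import shutil
import tempfile
from pathlib import Path


def stage_specimen_csv(csv_path: Path, base_dir: Path) -> Path:
    """Copy csv_path into base_dir/deriva/records/Gene_Expression/.

    Hypothetical helper: it creates only the folder levels named in the
    text above and returns the path of the copied CSV file.
    """
    target_dir = base_dir / "deriva" / "records" / "Gene_Expression"
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(csv_path, target_dir))


if __name__ == "__main__":
    # Demonstration in a throwaway directory; the file name is an assumption.
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "Specimen_simple.csv"
        source.write_text("Fixation,Upload_Notes\n")
        print(stage_specimen_csv(source, Path(tmp) / "upload"))
```

After staging, the uploader would be pointed at the `records` subfolder, as described in Step 3.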

## Step 3: Upload the CSV file
Follow [these instructions](/docs/uploading-files-using-deriva-client-tools) to upload your file, pointing the uploader at the `records` subfolder.
Follow [these instructions](../uploading-files-using-deriva-client-tools) to upload your file, pointing the uploader at the `records` subfolder.

## Step 4: Upload images
To upload them one at a time through the browser, follow the instructions at [Submitting Specimen Data](/docs/specimens/). To upload them in bulk, follow the instructions at [Bulk uploading image files](/docs/bulk-uploading-image-files).
To upload them one at a time through the browser, follow the instructions at [Submitting Specimen Data](../specimen-v2/). To upload them in bulk, follow the instructions at [Bulk uploading image files](../bulk-uploading-image-files).

## Step 5: Fill in any missing data
Follow the instructions at [Submitting Specimen Data](/docs/specimens/) to add anatomical sites, antibodies, etc.
Follow the instructions at [Submitting Specimen Data](../specimen-v2/) to add anatomical sites, antibodies, etc.

## Step 6: Update the curation status
Your new records will initially be marked "In Preparation" and will be visible only to consortium members. Change their status to "Submitted" to forward them on to the biocurator for review and release.

Your new records will initially have a _Curation Status_ of "In Preparation" and will be visible only to consortium members. Change the status to "Submitted" to forward them on to the biocurator for review and release.
16 changes: 9 additions & 7 deletions docs/_docs/submitting_data/Bulk-uploading-image-files.md
@@ -1,16 +1,18 @@
---
title: Bulk uploading image files
title: Bulk Uploading Image Files
permalink: /docs/bulk-uploading-image-files/
---

The easiest way to add imaging data is via the web interface, using the method described in [Submitting Specimen Data](/docs/specimens/). If you have many files to upload, however, it may be more convenient to use the Deriva bulk uploader tool.
The easiest way to add imaging data is via the web interface, using the method described in [Submitting Specimen Data](../specimen-v2/). If you have many files to upload, however, it may be more convenient to use the DERIVA bulk uploader tool.

If you're not already familiar with the concepts and steps described in the [Submitting Specimen Data](/docs/specimens/) page, please review it and/or the tutorials linked there. The bulk upload tool is a replacement for Step 4 (Add Image Records) from that page.
If you're not already familiar with the concepts and steps described in the [Submitting Specimen Data](../specimen-v2/) page, please review them. The bulk upload tool is a replacement for Step 4 (Add Image Records) from that page.

## Step 1: Create Specimen records
Create specimen records using either the the web interface, as described in [Submitting Specimen Data](/docs/specimens/), or the bulk upload process described in [Adding specimen records from a spreadsheet or CSV](/docs/bulk-uploading-specimen-data-from-a-file). Make sure you assign each record a unique Internal ID; the Internal ID is used to link image files with the corresponding Specimen records.
Create specimen records using either the the web interface, as described in [Submitting Specimen Data](../specimen-v2/), or the CSV process described in [Adding specimen records from a file](../adding-specimen-records-from-a-file/).

## Step 2: Create image files, and put them where the upload tool can find them.
**Make sure you assign each record a unique Internal ID**; the Internal ID is used to link image files with the corresponding Specimen records.
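If the Specimen records come from a CSV file, a quick check for repeated Internal IDs before uploading may save a round of corrections. This is a rough sketch only: the column name `Internal_ID` is an assumption (the actual heading in the template is not shown in this excerpt), and `duplicate_ids` is a hypothetical helper, not part of the DERIVA tools.

```python
import csv
import io
from collections import Counter

# Assumed column heading; replace with the heading used in your template.
ID_COLUMN = "Internal_ID"


def duplicate_ids(csv_file, column=ID_COLUMN):
    """Return the sorted list of non-blank IDs that occur more than once."""
    reader = csv.DictReader(csv_file)
    counts = Counter((row.get(column) or "").strip() for row in reader)
    counts.pop("", None)  # blank IDs are ignored here, not reported
    return sorted(value for value, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Made-up rows for illustration only.
    sample = io.StringIO("Internal_ID,Fixation\nS1,x\nS2,y\nS1,z\n")
    print(duplicate_ids(sample))
```

An empty result means every non-blank ID in that column is unique within the file; it does not check against records already in the repository.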

## Step 2: Put images where the upload tool can find them
In order for the upload tool to find your image files, you'll need to organize your files as follows:

### For 2D files:
Expand Down Expand Up @@ -50,10 +52,10 @@ Where:
* Note that a hyphen (`-`) is used to separate the `{internal_id}` and the `{image-type}` in the image filename.

## Step 3: Upload the files
Follow [these instructions](/docs/uploading-files-using-deriva-client-tools) to upload your file, pointing the uploader at the `images` subfolder.
Follow [these instructions](../uploading-files-using-deriva-client-tools/) to upload your file, pointing the uploader at the `images` subfolder.

## Step 4: Fill in any missing data
Follow the instructions at [Submitting Specimen Data](/docs/specimens/) to add anatomical sites, antibodies, etc.
Follow the instructions at [Submitting Specimen Data](../specimen-v2/) to add anatomical sites, antibodies, etc.

## Step 5: Update the curation status
Your new records will initially be marked "In Preparation" and will be visible only to consortium members. Change the Specimen records' status to "Submitted" to forward them on to the biocurator for review and release.
@@ -72,7 +72,7 @@ There are 2 different ways to upload files to our data repository:
* Replicate-level files: On the `Replicate` detail page, click `Add` on top of the `File` table section to add sequencing and analysis files associated with a specific replicate. Normally, users will need to upload the actual files to the Hub. For sequencing files that are archived in other permanent repositories (e.g. GEO), a URL to get to the archive can be provided. For human-protected sequencing file stored in dbGaP, please provide `dbGaP Accession ID`.
* Study-level files: On the `Study` detail page, click `Add` on top of the `Study Analysis File` table section to add new analysis files associated with your study.

2. Through [DERIVA client tools](/docs/uploading-files-using-deriva-client-tools). This approach is recommended in the case that there are many very large files (e.g. bigger than 5 GB) to upload. You will need to [install the client tool](/docs/uploading-files-using-deriva-client-tools) on your system and prepare your directory structure.
2. Through [DERIVA client tools](../uploading-files-using-deriva-client-tools). This approach is recommended in the case that there are many very large files (e.g. bigger than 5 GB) to upload. You will need to [install the client tool](../uploading-files-using-deriva-client-tools) on your system and prepare your directory structure.
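To judge which route fits a given dataset, it can help to list the files that exceed the 5 GB figure mentioned above. The sketch below assumes decimal gigabytes (the text does not say whether GB or GiB is meant), and `files_over` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path

# "5 GB" from the text, interpreted as decimal gigabytes (an assumption).
FIVE_GB = 5 * 1000**3


def files_over(root: Path, threshold: int = FIVE_GB) -> list[Path]:
    """Return files under root (recursively) larger than threshold bytes."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.stat().st_size > threshold
    )


if __name__ == "__main__":
    # Tiny demonstration with a small threshold so it runs instantly.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "big.bin").write_bytes(b"x" * 10)
        (Path(tmp) / "small.bin").write_bytes(b"x")
        print([p.name for p in files_over(Path(tmp), threshold=5)])
```

If the list is long, or any entries are well above the threshold, the client tools are likely the more practical choice.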

#### 4.1. Supported file extensions

@@ -143,7 +143,7 @@ This is a very secure and stable service, and if the job is interrupted, the pro

If your directory structure and naming convention are correct, the files will be automatically attached to the correct Replicate records.

For full instructions, go to [Upload Using DERIVA Client Tools](../uploading-files-using-deriva-client-tools/).
For full instructions, go to [Bulk Uploading Files with DERIVA Client Tools](../uploading-files-using-deriva-client-tools/).

## 6. Export for GEO submission

8 changes: 5 additions & 3 deletions docs/_docs/submitting_data/Submitting-Specimen-Data-v2.md
@@ -44,6 +44,8 @@ A specimen refers to tissue from an organism like humans, mice, or zebrafish. Be
- Fill out the essential fields marked with a red asterisk (*). However, providing more details enhances data discoverability.
- Upon completion, save the record.

Alternatively, you can create specimen records using a CSV file or spreadsheet. See [Adding specimen records from a file](../adding-specimen-records-from-a-file/) for more details.

### 2.1 Field descriptions

The following are required fields:
@@ -64,7 +66,7 @@ The following fields are especially helpful for bringing up your data in search
| **Sex** | *Male*, *Female*, *Unknown* or *Both*. |
| **Preparation**, **Fixation** and **Embedding** | These fields describe how the sample was prepared. |

| **Internal ID** | A useful field for your own lab’s tracking purposes. This can be very useful to help you search for your Specimen records later. |
| **Internal ID** | An important field for your own lab’s tracking purposes. This can be very useful to help you search for your Specimen records later and is critical if [using the DERIVA client tools](../uploading-files-using-deriva-client-tools/) to bulk upload data. |
| **Strain**, **Wild Type**, **Phenotype**, **Cell Line** | Use these fields to provide helpful information about the specimen. |
| **Upload Notes**, **Probe Usage Notes** | These fields are an opportunity to provide more context about the data. |
| **Parent Specimen** | If you have subdivided a biological sample (e.g., you created sections from a sample), you can create a Specimen record for the original sample. Then for each section (child), designate the original sample as the "parent". You can continue doing this to show different levels of grandparent/parent/child relationships. |
@@ -109,14 +111,14 @@ Submit CZI files if possible. These files get converted for in-browser viewing,

#### 4.2.1. Clone Records

If you need to upload multiple image records simultaneously, use the `Clone` button:
If you need to upload multiple image records simultaneously through the web interface, use the `Clone` button:

- Fill in the values of the field in the form that will be common across multiple records.
- Click the `Clone` button to create additional forms and edit individual ones as needed.

#### 4.2.2. Use DERIVA Client Tools

If you have many files or your files are very large (over 5GB), then you may need to use our client tools for uploading files programmatically. See [Bulk Uploading Image Files](bulk-uploading-image-files/) for more details.
If you have many files or your files are very large (over 5GB), then you may need to use our client tools for uploading files programmatically. See [Bulk Uploading Image Files](../bulk-uploading-image-files/) for more details.

## 5. Add a new Human Reference Atlas (HRA) 3D coordinate
