Commit: Merge pull request #43 from ENCODE-DCC/dev1.3 (Dev1.3)

Showing 22 changed files with 297 additions and 296 deletions.
New flake8 configuration:

```diff
@@ -0,0 +1,2 @@
+[flake8]
+ignore = E501,W503, W605, E203
```
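As context for the ignore list: E501 is the line-length check, W503 (line break before binary operator) and E203 (whitespace before `:`) conflict with black's formatting, and W605 flags invalid escape sequences in string literals. A minimal usage sketch, assuming flake8 is installed and the new two-line file is flake8's own config file (the diff does not show the filename):

```shell
# flake8 discovers its config (e.g. a .flake8 file) in the directory it is
# run from, so the ignore list applies automatically from the repo root:
flake8 .
```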
New isort configuration:

```diff
@@ -0,0 +1,7 @@
+[settings]
+known_third_party = dataframe_utils,pandas,qc_utils
+multi_line_output=3
+include_trailing_comma=True
+force_grid_wrap=0
+use_parentheses=True
+line_length=88
```
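These settings make isort produce black-compatible imports (vertical hanging indent, trailing comma, line length 88), and `known_third_party` is kept current by the seed-isort-config hook pinned in this commit's pre-commit config. A usage sketch, assuming isort 4.x (matching the pinned v4.3.21; the flags below were renamed in isort 5):

```shell
# Check import ordering recursively without rewriting files;
# drop --check-only to apply the fixes in place.
isort --recursive --check-only .
```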
New pre-commit configuration:

```diff
@@ -0,0 +1,30 @@
+- repo: https://github.com/psf/black
+  rev: 19.3b0
+  hooks:
+  - id: black
+    language_version: python3.7
+
+- repo: https://github.com/asottile/seed-isort-config
+  rev: v1.9.2
+  hooks:
+  - id: seed-isort-config
+
+- repo: https://github.com/pre-commit/mirrors-isort
+  rev: v4.3.21
+  hooks:
+  - id: isort
+    language_version: python3.7
+
+- repo: https://github.com/pre-commit/pre-commit-hooks
+  rev: v2.2.3
+  hooks:
+  - id: flake8
+  - id: trailing-whitespace
+    exclude: docs/\w+\.md
+  - id: end-of-file-fixer
+  - id: debug-statements
+  - id: check-json
+  - id: pretty-format-json
+    args:
+    - --autofix
+  - id: check-yaml
```
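A sketch of how hooks like these are typically activated, assuming the file is saved as `.pre-commit-config.yaml` in the repository root (the filename is not shown in the diff) and that network access is available to fetch the pinned hook repos:

```shell
pip install pre-commit
pre-commit install          # register the git pre-commit hook for this clone
pre-commit run --all-files  # run every configured hook once over the whole tree
```

After `pre-commit install`, the hooks also run automatically on each `git commit`, touching only the staged files.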
Changes to the installation documentation:

````diff
@@ -1,65 +1,53 @@
 # INSTALLATION
 
-To run the pipeline you need to install following software. Running the pipeline on Google Cloud requires additional setup detailed below.
+To run the pipeline you need to do some setup. The exact steps you need to take depend on the platform you are running the pipeline on, and will be detailed below and in [HOWTO](howto.md). Independent of platform, running the pipeline is done using [caper](https://github.com/ENCODE-DCC/caper) and (optional but recommended) output organization is done using [croo](https://github.com/ENCODE-DCC/croo). Both `caper` and `croo` require `python` version 3.4.1 or newer.
 
+## Caper
+
+Direct usage of the execution engine [Cromwell](https://software.broadinstitute.org/wdl/documentation/execution) features complicated backend configuration, workflow options and command line parameters. Caper hides the complexity and consolidates configuration in one file. Caper is available in [PyPI](https://pypi.org/project/caper/) and is installed by running:
+
+```bash
+$ pip install caper
+```
+
+Note that the conda run mode described in the caper documentation is not supported by this pipeline.
+
+## Croo
+
+The way [Cromwell](https://software.broadinstitute.org/wdl/documentation/execution) organizes pipeline outputs is not always the most clear and findable. Croo is a tool to reorganize the files in a more readable manner. Croo is available in [PyPI](https://pypi.org/project/croo/) and is installed by running:
+
+```bash
+$ pip install croo
+```
+
 ## Java 8
 
-Java is required to run execution engine [Cromwell](https://software.broadinstitute.org/wdl/documentation/execution).
+Java is required to run the execution engine [Cromwell](https://software.broadinstitute.org/wdl/documentation/execution) that `caper` uses under the hood.
 To check which Java version you already have, run:
 ```bash
 $ java -version
 ```
 You are looking for 1.8 or higher. If the requirement is not fulfilled follow installation instructions for [mac](https://java.com/en/download/help/mac_install.xml) or
 [linux](http://openjdk.java.net/install/) or use your favorite installation method.
 
-## Cromwell
-
-Download WDL runner Cromwell from [here](https://github.com/broadinstitute/cromwell/releases). The pipeline has been tested using version 40.
-
 ## Docker
 
-Pipeline code is packaged and distributed in Docker containers, and thus Docker installation is needed. 
+Pipeline code is packaged and distributed in Docker containers, and thus Docker installation is needed.
 Follow instructions for [mac](https://docs.docker.com/docker-for-mac/install/) or [linux](https://docs.docker.com/install/linux/docker-ce/ubuntu/#upgrade-docker-after-using-the-convenience-script).
 
-## Caper
+## Singularity
 
-For running the pipeline we recommend using [Caper](https://github.com/ENCODE-DCC/caper) that wraps Cromwell in an easier to use package.
+If you want to use Singularity instead of Docker, install [singularity](https://www.sylabs.io/guides/3.1/user-guide/installation.html). The pipeline requires singularity version `>=2.5.2`; the link takes you to version `3.1`.
 
-## croo
+## Google Cloud
 
-For organizing pipeline outputs we recommend using [croo](https://github.com/ENCODE-DCC/croo) that makes a nicely organized directory from the complicated output tree Cromwell defaults to. The configuration file for `croo` is named `output_definition.json` and can be found in the root of this repository.
+If you are intending to run the pipeline on the Google Cloud platform, follow the [caper setup instructions for GCP](https://github.com/ENCODE-DCC/caper/blob/master/docs/conf_gcp.md).
+* For an example of how to run the pipeline on Google Cloud, see [HOWTO](howto.md#google-cloud).
 
-## Singularity
+## AWS
 
-If for some reason you cannot run Docker, install [singularity](https://www.sylabs.io/guides/3.1/user-guide/installation.html) and have a look at [HOWTO](howto.md#local-with-singularity) for an example of how to run pipeline with singularity. Pipeline requires singularity version `>=2.5.2`, the link takes you to version `3.1`.
+If you are intending to run the pipeline on AWS, follow the [caper setup instructions for AWS](https://github.com/ENCODE-DCC/caper/blob/master/docs/conf_aws.md).
 
-## Google Cloud
+## Cromwell (optional)
 
-If you are intending to run the pipeline on Google Cloud platform, the following setup is needed:
-
-1. Sign up for a Google account.
-2. Go to [Google Project](https://console.developers.google.com/project) page and click "SIGN UP FOR FREE TRIAL" on the top left and agree to terms.
-3. Set up a payment method and click "START MY FREE TRIAL".
-4. Create a [Google Project](https://console.developers.google.com/project) `[YOUR_PROJECT_NAME]` and choose it on the top of the page.
-5. Create a [Google Cloud Storage bucket](https://console.cloud.google.com/storage/browser) `gs://[YOUR_BUCKET_NAME]` by clicking on a button "CREATE BUCKET" and create it to store pipeline outputs.
-6. Find and enable following APIs in your [API Manager](https://console.developers.google.com/apis/library). Click a back button on your web browser after enabling each.
-    * Compute Engine API
-    * Google Cloud Storage
-    * Google Cloud Storage JSON API
-    * Genomics API
-
-7. Install [Google Cloud Platform SDK](https://cloud.google.com/sdk/downloads) and authenticate through it. You will be asked to enter verification keys. Get keys from the URLs they provide.
-    ```
-    $ gcloud auth login --no-launch-browser
-    $ gcloud auth application-default login --no-launch-browser
-    ```
-8. If you see permission errors at runtime, then unset environment variable `GOOGLE_APPLICATION_CREDENTIALS` or add it to your BASH startup scripts (`$HOME/.bashrc` or `$HOME/.bash_profile`).
-    ```
-    unset GOOGLE_APPLICATION_CREDENTIALS
-    ```
-9. Set your default Google Cloud Project. Pipeline will provision instances on this project.
-    ```
-    $ gcloud config set project [YOUR_PROJECT_NAME]
-    ```
+We recommend using `caper` for running the pipeline, although it is possible to use Cromwell directly. The backend file and workflow options files necessary for direct Cromwell use are included in the repository for local testing purposes, but they are not actively maintained to follow cloud API changes etc.
````
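A sketch of the `caper`-based workflow the rewritten documentation describes; the WDL and input JSON names below are placeholders, not this repository's actual filenames:

```shell
pip install caper croo

# caper wraps Cromwell: it runs the workflow and writes a metadata JSON
# describing the outputs (placeholder filenames).
caper run pipeline.wdl -i input.json

# croo then reorganizes the raw Cromwell output tree using an output
# definition file such as the output_definition.json shipped in the repo root.
croo metadata.json --out-def-json output_definition.json
```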