Merge branch 'tutorial_helper'
lemieuxl committed Apr 8, 2016
2 parents dc1f408 + 0db25ff commit 5221a86
Showing 14 changed files with 1,037 additions and 443 deletions.
34 changes: 20 additions & 14 deletions README.mkd
@@ -4,13 +4,14 @@

# genipe - A Python module to perform genome-wide imputation analysis

*Version 1.2.2*

The `genipe` module (standing for **GEN**ome-wide **I**mputation
**P**ipelin**E**) includes a script (named `genipe-launcher`) that
automatically runs a genome-wide imputation pipeline using *Plink*, *shapeit*
and *impute2*.


## Documentation

Full documentation is available at
[http://pgxcentre.github.io/genipe/](http://pgxcentre.github.io/genipe/).

@@ -30,8 +31,12 @@ pip install genipe
conda install genipe -c http://statgen.org/wp-content/uploads/Softwares/genipe
```

-The installation process should install all required dependencies, which are
-described below.
+The installation process should install all required dependencies to run the
+main imputation pipeline. Optional dependencies can also be installed manually
+in order to perform statistical analysis and data management (see below).
+
+The complete installation procedure is available in the
+[documentation](http://pgxcentre.github.io/genipe/installation.html).


### Dependencies
@@ -49,18 +54,19 @@ The tool requires the binaries for
[shapeit](https://mathgen.stats.ox.ac.uk/genetics_software/shapeit/shapeit.html#download)
and [impute2](https://mathgen.stats.ox.ac.uk/impute/impute_v2.html#download).

-A tool is provided to perform statistical analysis using the imputation results
-(linear, logistic and Cox's regressions). This tool requires the following
-Python modules:

-* `scipy` version 0.15.1 or latest (required by `statsmodels`)
-* `lifelines` version 0.7.0.0 and latest
-* `statsmodels` version 0.6.1 and latest
+### Optional dependencies

-Other packages are optional:
+In order to perform data management and statistical analysis (linear, logistic
+and Cox's regressions), `genipe` requires the following Python modules:

-* `Matplotlib` version 1.4.2
+* `Matplotlib` version 1.4.2 or latest
+* `scipy` version 0.15.1 or latest
+* `statsmodels` version 0.6.1 or latest
+* `lifelines` version 0.7.0 or latest
+* `Biopython` version 1.65 or latest
+* `pyfaidx` version 0.3.7 or latest
+* `drmaa` version 0.7.6 or latest

Finally, the tool requires a LaTeX installation to compile the automatically
generated report in PDF format.
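
As a quick sanity check after installation, the optional modules listed above can be probed from Python. This is a minimal sketch and not part of `genipe`: the helper name `missing_optional_modules` is hypothetical, the import names are assumptions based on how each project is usually packaged (Biopython is imported as `Bio`), and the check only reports whether a module can be located, not whether its version meets the stated minimum.

```python
from importlib.util import find_spec

# Display name -> import name (assumed; Biopython is imported as "Bio").
OPTIONAL_MODULES = {
    "Matplotlib": "matplotlib",
    "scipy": "scipy",
    "statsmodels": "statsmodels",
    "lifelines": "lifelines",
    "Biopython": "Bio",
    "pyfaidx": "pyfaidx",
    "drmaa": "drmaa",
}


def missing_optional_modules(modules=None):
    """Return the sorted display names of the modules that cannot be located."""
    modules = OPTIONAL_MODULES if modules is None else modules
    # find_spec() looks a top-level module up without importing it.
    return sorted(
        name for name, import_name in modules.items()
        if find_spec(import_name) is None
    )


if __name__ == "__main__":
    missing = missing_optional_modules()
    if missing:
        print("Missing optional modules: " + ", ".join(missing))
    else:
        print("All optional modules were found.")
```

Because `find_spec` does not import anything, the check works for `drmaa` even on machines where the underlying DRMAA library is absent; actually importing that module may still fail there.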
@@ -106,7 +112,7 @@ usage: genipe-launcher [-h] [-v] [--debug] [--thread THREAD] --bfile PREFIX
[--report-background BACKGROUND]

Execute the genome-wide imputation pipeline. This script is part of the
-'genipe' package, version 1.2.2.
+'genipe' package, version 1.2.3.

optional arguments:
-h, --help show this help message and exit
@@ -217,7 +223,7 @@ usage: imputed-stats [-h] [-v] {cox,linear,logistic,mixedlm,skat} ...

Performs statistical analysis on imputed data (either SKAT analysis, or
linear, logistic or survival regression). This script is part of the 'genipe'
-package, version 1.2.2).
+package, version 1.2.3).

optional arguments:
-h, --help show this help message and exit
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -58,7 +58,7 @@ Usage
[--report-background BACKGROUND]
Execute the genome-wide imputation pipeline. This script is part of the
-'genipe' package, version 1.2.2.
+'genipe' package, version 1.2.3.
optional arguments:
-h, --help show this help message and exit
29 changes: 15 additions & 14 deletions docs/installation.rst
@@ -1,14 +1,7 @@
Installation
=============


-Quick navigation
------------------
-
-1. :ref:`install-requirements`
-2. :ref:`install-virt`
-3. :ref:`install-test`
-4. :ref:`install-update`
+.. contents:: Quick navigation


.. _install-requirements:
@@ -32,6 +25,7 @@ management):
* ``lifelines`` version 0.7.0 or latest
* ``biopython`` version 1.65 or latest
* ``pyfaidx`` version 0.3.7 or latest
+* ``drmaa`` version 0.7.6 or latest

.. note::

@@ -43,8 +37,11 @@ management):
Installing in a virtual environment
------------------------------------

-Using pyvenv
-^^^^^^^^^^^^^
+
+.. _install-pyvenv:
+
+Using python's pyvenv
+^^^^^^^^^^^^^^^^^^^^^^

The following commands should successfully create a virtual environment and
activate it, as long as Python 3 was previously installed on the machine.
@@ -75,7 +72,10 @@ command:
pip install biopython
pip install pyfaidx
pip install matplotlib
+pip install drmaa
+.. _install-miniconda:

Using Miniconda
^^^^^^^^^^^^^^^^
@@ -126,6 +126,7 @@ command:
conda install -y statsmodels
conda install -y biopython
conda install -y matplotlib
+conda install -y drmaa
pip install --no-deps pyfaidx
pip install --no-deps lifelines
@@ -186,16 +187,16 @@ command (depending of the installation method). Don't forget to first activate
the python virtual environment.


-Pyvenv
-^^^^^^^
+Using python's pyvenv
+^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: bash
pip install -U genipe
-Miniconda
-^^^^^^^^^^
+Using Miniconda
+^^^^^^^^^^^^^^^^

.. code-block:: bash
9 changes: 9 additions & 0 deletions docs/module_content/genipe.tools.rst
@@ -36,3 +36,12 @@ genipe.tools.imputed_stats module
:undoc-members:
:show-inheritance:

+
+genipe.tools.genipe_tutorial
+-----------------------------
+
+.. automodule:: genipe.tools.genipe_tutorial
+:members:
+:undoc-members:
+:show-inheritance:
+
14 changes: 7 additions & 7 deletions docs/tutorials.rst
@@ -25,7 +25,7 @@ genome-wide imputation of genotypes. It describes in detail the expected input
and output files.

.. toctree::
-:maxdepth: 3
+:maxdepth: 2

tutorials/tutorial_genipe

@@ -40,7 +40,7 @@ Cox's proportional hazard model). It describes in detail the expected input and
output files.

.. toctree::
-:maxdepth: 3
+:maxdepth: 2

tutorials/tutorial_cox

@@ -54,7 +54,7 @@ This tutorial walks through a full example of a typical SKAT analysis while
describing the expected input and output files.

.. toctree::
-:maxdepth: 3
+:maxdepth: 2

tutorials/tutorial_SKAT

@@ -69,7 +69,7 @@ This tutorial walks through a full example of a linear regression analysis
output files.

.. toctree::
-:maxdepth: 3
+:maxdepth: 2

tutorials/tutorial_linear

@@ -84,7 +84,7 @@ This tutorial walks through a full example of a logistic regression analysis
detail the expected input and output file.

.. toctree::
-:maxdepth: 3
+:maxdepth: 2

tutorials/tutorial_logistic

@@ -98,7 +98,7 @@ This tutorial walks through a full example of a linear mixed effects analysis.
It describes in detail the expected input and output files.

.. toctree::
-:maxdepth: 3
+:maxdepth: 2

tutorials/tutorial_mixedlm

@@ -113,7 +113,7 @@ from imputation files. It describes in detail the expected input and output
file.

.. toctree::
-:maxdepth: 3
+:maxdepth: 2

tutorials/tutorial_extract

36 changes: 14 additions & 22 deletions docs/tutorials/tutorial_SKAT.rst
@@ -1,18 +1,10 @@
SKAT Tutorial
==============

-Quick navigation
------------------
+.. contents:: Quick navigation
+:depth: 2
-
-1. :ref:`skat-tut-1`
-2. :ref:`skat-tut-2`
-3. :ref:`skat-tut-3`
-4. :ref:`skat-tut-4`
-5. :ref:`skat-tut-5`
-6. :ref:`skat-tut-6`
-
-SKAT analysis tutorial
------------------------
+SKAT analysis
+==============

`SKAT <http://www.hsph.harvard.edu/skat/>`_ is a very popular test for the
association of sets of rare and common variants. In :py:mod:`genipe`, we
@@ -27,10 +19,10 @@ step.
.. _skat-tut-1:

Input files generated by genipe
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+--------------------------------

Impute2
-""""""""
+^^^^^^^^

After running the pipeline, you should have an `Impute2` file containing
genotype probabilities for all the imputed variants. The general structure of
@@ -73,7 +65,7 @@ be the minor allele.
to facilitate the description of the format.

Samples file
-"""""""""""""
+^^^^^^^^^^^^^

This file is generated by :py:mod:`genipe` and has the ``.sample`` extension.
It greatly resembles the PLINK ``fam`` file for those who are familiar with
@@ -122,7 +114,7 @@ assume that the A1 is the major allele, which is not true in practice):
the major allele, which is often not true in practice.

Good sites file
-""""""""""""""""
+^^^^^^^^^^^^^^^^

The `.good_sites` file is also generated by :py:mod:`genipe`. It contains a
single column representing the name of the variants that passed the quality
@@ -135,7 +127,7 @@ tool.
.. _skat-tut-2:

Creating a SNP sets file
-^^^^^^^^^^^^^^^^^^^^^^^^^
+-------------------------

SKAT is based on the analysis of a "SNP set". This is simply an arbitrary set
of variants that is created by the user. All of the variants in a same SNP set
@@ -195,7 +187,7 @@ provide them in the SNP set file as shown here.
.. _skat-tut-3:

Format for the phenotypes file
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-------------------------------

The last required file is the phenotype file. The latter contains information
on the phenotype of interest for all the samples included in the analysis. The
@@ -227,7 +219,7 @@ the ``--outcome-type`` argument which can be set to `discrete` or `continuous`.
.. _skat-tut-4:

Running the script
-^^^^^^^^^^^^^^^^^^^
+-------------------

Once all of this is ready, you can finally run the script. A sample command
for the analysis described throughout this tutorial is the following:
@@ -290,7 +282,7 @@ The line by line explanation of this command is as follows:
.. _skat-tut-5:

Results
-^^^^^^^^
+--------

After running this analysis, a file named `my_skat_analysis.skat.dosage` should
appear. This file will contain an association p-value for every one of the
@@ -324,7 +316,7 @@ new interesting feature, please send us a push request from Github.
.. _skat-tut-6:

Usage
-^^^^^^
+------

The following command will display the documentation for the SKAT analysis in
the console:
@@ -342,7 +334,7 @@ the console:
{continuous,discrete} [--skat-o] --pheno-name NAME
Uses the SKAT R package to analyze user defined gene sets. This script is part
-of the 'genipe' package, version 1.2.2).
+of the 'genipe' package, version 1.2.3).
optional arguments:
-h, --help show this help message and exit