ICR142 validation in bcbio

Configuration files and scripts to run a validation against the ICR142 dataset using bcbio. The dataset is described at:

http://f1000research.com/articles/5-386/v1

Running validation

This repository contains a full set of configuration files and BED/VCF validation sets to run an analysis with bcbio:

Obtain the ICR142 fastq files, which require applying for access, and move them to bcbiorun/input/fastqs.
Run the analysis using an installed version of bcbio. This can run on a single machine using multiple cores or distributed on a cluster:
```
cd bcbiorun/work
bcbio_nextgen.py ../config/icr142.yaml -n 16
```

Summarize and plot the results:

```
cd ../summarize
bcbio_python ../../scripts/combine_samples.py
bcbio_python ../../scripts/bcbio_validation_plot.py icr142-summary.csv
```

Results

We validated using bwa-mem alignment and 3 variant callers (GATK HaplotypeCaller, FreeBayes and VarDict), and also assessed ensemble results restricted to calls present in at least 2 of the 3 callers or in all 3 callers. The majority of false positives are present in at least 2 callers, and many in all 3.
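To illustrate what "calls in 2 of 3 or 3 of 3 callers" means, here is a minimal Python sketch of a consensus filter. It is not bcbio's ensemble implementation: the function name `ensemble_calls`, the caller labels and the toy variants are all made up for this example, and real ensemble calling also has to deal with variant normalization and overlapping indels.

```python
from collections import Counter


def ensemble_calls(calls_by_caller, min_callers=2):
    """Return the variants reported by at least `min_callers` callers.

    `calls_by_caller` maps a caller label to an iterable of variants,
    each represented as a (chrom, pos, ref, alt) tuple.
    """
    counts = Counter()
    for variants in calls_by_caller.values():
        # De-duplicate within a caller so each caller counts once per variant.
        counts.update(set(variants))
    return {variant for variant, n in counts.items() if n >= min_callers}


# Toy input, invented for illustration only.
calls = {
    "gatk-haplotype": {("1", 100, "A", "T"), ("2", 200, "G", "GA")},
    "freebayes": {("1", 100, "A", "T"), ("3", 300, "CT", "C")},
    "vardict": {("1", 100, "A", "T"), ("2", 200, "G", "GA")},
}

in_two_or_more = ensemble_calls(calls, min_callers=2)
in_all_three = ensemble_calls(calls, min_callers=3)
```

A false positive that survives `min_callers=3` is one that every caller reports, which is why agreement between callers does not remove most of the false positives described above.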

Truth set preparation

We prepared the truth set and analysis regions using the truth set calls from Supplemental table 1. scripts/icr_to_vcf.py created the VCF and BED files contained in the repository from the original table and a list of variants found to be homozygous (both in bcbiorun/input). The initial truth table does not indicate whether expected variants are homozygous or heterozygous, so we ran an initial validation with every variant treated as heterozygous, then used scripts/find_hethomerrors.py to find the variants that are likely homozygous and used that list to prepare the final truth set.
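The two-pass genotype assignment can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not the logic of scripts/icr_to_vcf.py or scripts/find_hethomerrors.py: the helper names `likely_homozygous` and `truth_records`, the tuple representation of variants and the toy genotypes are all hypothetical.

```python
def likely_homozygous(truth_gt, called_gt):
    """Variants assumed heterozygous in the truth set but called homozygous.

    Both arguments map (chrom, pos, ref, alt) tuples to genotype strings.
    """
    return {
        variant
        for variant, gt in called_gt.items()
        if truth_gt.get(variant) == "0/1" and gt == "1/1"
    }


def truth_records(variants, homozygous=frozenset()):
    """Build minimal VCF data lines, using 1/1 for variants listed as homozygous."""
    lines = []
    for chrom, pos, ref, alt in sorted(variants):
        gt = "1/1" if (chrom, pos, ref, alt) in homozygous else "0/1"
        fields = [chrom, str(pos), ".", ref, alt, ".", "PASS", ".", "GT", gt]
        lines.append("\t".join(fields))
    return lines


# Toy data, invented for illustration only.
variants = [("1", 100, "A", "T"), ("2", 200, "G", "GA")]

# Pass 1: no zygosity information, so assume every variant is heterozygous.
first_pass_truth = {variant: "0/1" for variant in variants}

# Genotypes reported by the callers in the initial validation (hypothetical).
called = {("1", 100, "A", "T"): "0/1", ("2", 200, "G", "GA"): "1/1"}

# Pass 2: rebuild the truth records with the likely homozygous variants.
homozygous = likely_homozygous(first_pass_truth, called)
final_truth = truth_records(variants, homozygous)
```

In the repository the homozygous list is stored in bcbiorun/input and read when the truth set is regenerated; the sketch keeps everything in memory for brevity.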
