This pipeline goes from RNA-seq (or similar) data to a table of total allelic counts per gene (or other genomic interval). That table serves as input for the further analysis of allelic imbalance with Qllelic.
This is a re-implementation of the ASEReadCounter tool from GATK, based on the allelecounter scripts by S. Castel.
The pipeline consists of two main parts:
- Reference preparation
  - construct individual "paternal" and "maternal" genome references and create a heterozygous VCF.
- Creation of tables with allelic counts
  - map sequencing reads to the references (using the STAR aligner; see the complete list of dependencies)
  - perform random sampling of the mapped reads to a defined depth (a key step for overdispersion analysis in Qllelic)
  - count the number of reads mapping to the reference or alternate allele at each heterozygous SNP, and collate the counts for genomic intervals (e.g., genes or other features).
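The sampling and collation steps can be illustrated with a toy Python sketch. It is not the pipeline's own code: the function names `subsample_reads` and `collate_counts`, the tuple layouts, and the 1-based inclusive interval convention are all assumptions made for this example. The pipeline itself works on alignment files rather than in-memory lists.

```python
import random
from collections import defaultdict


def subsample_reads(read_ids, depth, seed=0):
    """Draw `depth` reads uniformly at random, without replacement.

    Hypothetical helper: `read_ids` is a list of read identifiers.
    A fixed seed makes the draw reproducible.
    """
    if depth > len(read_ids):
        raise ValueError("requested depth exceeds the number of mapped reads")
    rng = random.Random(seed)
    return rng.sample(read_ids, depth)


def collate_counts(snp_counts, intervals):
    """Sum per-SNP allelic counts over each genomic interval.

    Assumed layouts (illustrative only):
      snp_counts: list of (chrom, pos, ref_count, alt_count)
      intervals:  list of (name, chrom, start, end), 1-based inclusive
    Returns {name: (total_ref, total_alt)}; intervals without SNPs get (0, 0).
    """
    totals = defaultdict(lambda: [0, 0])
    for name, chrom, start, end in intervals:
        totals[name]  # touch the key so empty intervals are reported too
        for snp_chrom, pos, ref, alt in snp_counts:
            if snp_chrom == chrom and start <= pos <= end:
                totals[name][0] += ref
                totals[name][1] += alt
    return {name: tuple(counts) for name, counts in totals.items()}


if __name__ == "__main__":
    # Made-up counts at heterozygous SNPs and made-up gene intervals.
    snps = [("chr1", 100, 5, 3), ("chr1", 150, 2, 4), ("chr2", 120, 7, 0)]
    genes = [("geneA", "chr1", 50, 200), ("geneB", "chr2", 100, 200)]
    print(collate_counts(snps, genes))
    print(len(subsample_reads(list(range(1000)), 100)))
```

The nested loop keeps the sketch short. For genome-scale data, an interval index or a dedicated tool would be the more practical choice.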
Manuals and worked examples are available on the Wiki page of this repository.
Clone this repository to your local machine; no additional installation is needed. Information about tool prerequisites is available on the Wiki page.
If you use our pipeline in your work, please cite "Unexpected variability of allelic imbalance estimates from RNA sequencing", Mendelevich A., Vinogradova S., Gupta S., Mironov A., Sunyaev S., Gimelbrant A.
Please report bugs on the GitHub issues page.