Errors when running PAQR step2 #19
By the way, the data I used is from https://www.encodeproject.org/experiments/ENCSR000CWC/ (ENCFF391BFB.bam, ENCFF983OXY.bam). The config.yaml settings are shown in the attached figure. I have also posted the problem in the GitHub issue. Finally, the attached text files are the references used in my config.yaml: clusters.mm10.vM22.no_tsl.atlas2.canonical_chr.tandem.noOverlap_strand_specific.txt full_transcripts.mm10.vM22.no_tsl.atlas2.canonical_chr.tandem.noOverlap_strand_specific.txt |
Hi can you please post the output of If this is really the reason for the error, you could mitigate it by lowering the "bias.median.cutoff" parameter in the config file. Rgds |
OK, an easy workaround would be to adjust your config file as follows:
ctl_rep1: {bam: ENCFF391BFB, type: CNTRL}
HNRNPC_rep1: {bam: ENCFF983OXY, type: KD, control: ctl_rep1}
In this setting, running PAQR would result in the estimation of poly(A) site usage for both samples independently. Hope this helps! |
Oh, thanks so much! Actually, the two BAMs are biological replicates. Wouldn't this setting affect the results? |
It affects the results slightly, because PAQR would normally take advantage of the two samples being replicates. But it should not be an issue. Please bear in mind, though, that running KAPAC requires the comparison across conditions - in your case you would need a reference condition to compare against. |
Got it, thanks! |
Thanks for your suggestions; I have now obtained the PAQR results successfully. However, with so many output files, I am confused about the difference between three of them (relative_usages.filtered.tsv, relative_usages.tsv, and relative_usages.relPos_per_pA.out) and wonder which one contains the proximal poly(A) site usage. Could you please explain this to me? I have not found their interpretation in the manual. Many thanks! Qin |
Hi Qin, there are two files that are worth looking into if you want to obtain the proximal poly(A) site usage:
Both files contain one poly(A) site per line. The poly(A) sites are grouped by exon and sorted from proximal to distal, so each line that has a new entry in the exon column indicates a proximal poly(A) site. Hope that helps. Best |
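For readers who want to pull out those proximal lines programmatically, here is a minimal Python sketch of the rule described above: keep every line whose exon entry differs from the one on the previous line. The tab-separated layout, the toy rows, and the column index `EXON_COL` are assumptions made for illustration only; check which column holds the exon in your own PAQR output before using it.

```python
import csv
import io

# Hypothetical: index of the exon column. This is an assumption for the toy
# data below, not the documented layout of the PAQR output files.
EXON_COL = 1


def proximal_rows(lines, exon_col):
    """Return the first row of each run of identical exon entries.

    Relies on the property stated in the thread: sites are grouped by exon
    and sorted proximal to distal, so the first row of a group is proximal.
    """
    selected = []
    previous_exon = None
    for row in csv.reader(lines, delimiter="\t"):
        if not row:
            continue  # skip empty lines
        exon = row[exon_col]
        if exon != previous_exon:
            selected.append(row)
        previous_exon = exon
    return selected


# Toy input (made-up values): site id, exon id, relative usage.
toy = io.StringIO(
    "siteA\texon1\t0.30\n"
    "siteB\texon1\t0.70\n"
    "siteC\texon2\t0.55\n"
    "siteD\texon2\t0.45\n"
)
proximal = proximal_rows(toy, EXON_COL)
for row in proximal:
    print("\t".join(row))
```

To run this on a real file, pass an open file handle instead of the `io.StringIO` object and adjust `EXON_COL` to the actual exon column.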
Thanks for your answer, which clears up my doubts. I have two more questions: 1. What is used to filter 'relative_usages.tsv' to get 'relative_usages.filtered.tsv'? 2. When applying PAQR to analyze forebrain E13.5 polyA RNA-Seq data, I find that relative usage of proximal polyA sites in most of genes in |
The filtering simply selects exons with multiple poly(A) sites which have read support in all samples - so this can be considered more of a data cleaning step. |
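As a rough illustration of that cleaning step, the sketch below keeps only exons that have more than one poly(A) site and whose sites have a usage value in every sample. This is not PAQR's own code. The record layout and the use of `None` to mark "no read support" are assumptions made for the example, as is the reading that support is required for every site of the exon.

```python
from collections import defaultdict


def filter_exons(records):
    """Keep rows of exons with >1 site and support in all samples.

    records: list of (exon, site, usages) tuples, where usages holds one
    value per sample and None marks a sample without read support
    (an assumed encoding, chosen for this sketch only).
    """
    by_exon = defaultdict(list)
    for exon, _site, usages in records:
        by_exon[exon].append(usages)

    kept = []
    for exon, site, usages in records:
        all_usages = by_exon[exon]
        has_multiple_sites = len(all_usages) > 1
        supported_everywhere = all(
            value is not None for per_site in all_usages for value in per_site
        )
        if has_multiple_sites and supported_everywhere:
            kept.append((exon, site, usages))
    return kept


# Toy input (made-up values) with two samples per site.
records = [
    ("exon1", "siteA", [0.3, 0.4]),
    ("exon1", "siteB", [0.7, 0.6]),
    ("exon2", "siteC", [1.0, 1.0]),   # single site: dropped
    ("exon3", "siteD", [None, 0.5]),  # no support in sample 1: dropped
    ("exon3", "siteE", [None, 0.5]),
]
kept = filter_exons(records)
for exon, site, usages in kept:
    print(exon, site, usages)
```

Only the two `exon1` rows survive in this toy case, which matches the idea that the filtered file is a cleaned subset of the unfiltered one.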
Hi,
Thanks for developing PAQR_KAPAC. I am facing a problem when running step2 of PAQR with the source code (error shown in the attached figure).
Does anyone have an idea where the error comes from? Many thanks!