[question] non-stranded and stranded RRBS library #59
Comments
Hi @zmz1988,

The strandedness (or non-strandedness) of a dataset comes from the library preparation method used to create the data. In WGBS (and similarly for RRBS), there are four possible strands that a read can come from: the original top or bottom strands and the complements to those strands (see the introduction to the Dupsifter paper for an overview of how these strands come to be).

In a traditional stranded library (like data from the NEB EM-seq kit or the Swift Accel-NGS kit), read 1 comes from the original strands and read 2 comes from the complements (as in your second image). In a PBAT-derived library (like from the original Miura and Ito PBAT method), read 1 comes from the complements and read 2 comes from the original strands.

On the other hand, in a non-stranded (or non-directional) library, read 1 can come from any of the four strands and read 2 will come from its complement. For example, read 1 may come from the CTOT (complement to the original top) strand and read 2 from the OT (original top) strand. Or, read 1 may come from the OB (original bottom) strand and read 2 may come from the CTOB (complement to the original bottom) strand.

If you know your library is stranded (based on the protocol used), then you can use the […] If you are unsure of the strandedness of your library, you can align the first 10,000 or so reads in your FASTQs without the […]
This was a lot of information, so feel free to follow up if anything needs clarification!
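The read 1 rules described above can be written down as a small decision function. This is only a sketch under stated assumptions: it expects that you have already extracted, by whatever means your aligner allows, the strand (OT, OB, CTOT, or CTOB) that read 1 of each pair aligned to. The function name `classify_library` and the 0.8 cutoff are hypothetical illustrations, not part of BISCUIT or Bismark.

```python
from collections import Counter

ORIGINAL = {"OT", "OB"}        # original top / original bottom
COMPLEMENT = {"CTOT", "CTOB"}  # complements to the original strands


def classify_library(read1_strands, threshold=0.8):
    """Guess the library type from the strands that read 1 aligned to.

    read1_strands: iterable of "OT", "OB", "CTOT" or "CTOB", one per pair.
    threshold: hypothetical cutoff (an assumption, not a tool default).
    """
    counts = Counter(read1_strands)
    unknown = set(counts) - ORIGINAL - COMPLEMENT
    if unknown:
        raise ValueError(f"unexpected strand labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no strand assignments given")

    original_frac = sum(counts[s] for s in ORIGINAL) / total
    if original_frac >= threshold:
        # Read 1 mostly from the original strands: traditional stranded library.
        return "directional"
    if 1 - original_frac >= threshold:
        # Read 1 mostly from the complements: PBAT-style library.
        return "pbat"
    # Read 1 spread over all four strands: non-directional library.
    return "non-directional"


# Example: 85% of read 1 on OT/OB is treated as directional.
print(classify_library(["OT"] * 45 + ["OB"] * 40 + ["CTOT"] * 15))
```

The cutoff is a judgment call; a library close to the boundary is better checked against the protocol that was used to prepare it.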
Thank you so much @jamorrison for taking the time to answer my question! I really appreciate it! I followed everything you explained, and it is very clear. Thanks! The only thing I'm still confused about is that I roughly know how the company prepared my RRBS library, as the protocol they shared includes a PCR amplification step at the end. But the alignment data still hints at a directional library (read 1 to OT/OB plus read 2 to CTOT/CTOB accounts for more than 80% of the total reads). May I ask whether you know how this could happen? Thanks a lot in advance!
It's possible that a PCR amplification step at the end could influence directionality, but it depends more on the primers that are used and other things upstream of the amplification. Based on the distribution of reads that you're seeing, I would presume that you have a directional library, but you could also send a subset of reads (10,000-100,000) through Bismark and look at the strand distribution it outputs to quickly confirm this.
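One way to pull such a subset is to copy the first N records from each FASTQ file. The sketch below assumes standard four-line FASTQ records (no wrapped sequence lines) and handles both plain and gzip-compressed files; the helper names and file names are hypothetical. For paired-end data, call it with the same `n_reads` on both files so the mates stay in sync.

```python
import gzip
from itertools import islice


def open_text(path, mode):
    """Open a plain or gzip-compressed file in text mode ("r" or "w")."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode + "t")
    return open(path, mode)


def take_first_reads(src, dst, n_reads=10_000):
    """Copy the first n_reads FASTQ records (4 lines each) from src to dst.

    Returns the number of complete records written.
    """
    lines_written = 0
    with open_text(src, "r") as fin, open_text(dst, "w") as fout:
        for line in islice(fin, 4 * n_reads):
            fout.write(line)
            lines_written += 1
    return lines_written // 4


# Hypothetical usage for a paired-end sample:
# take_first_reads("sample_R1.fastq.gz", "subset_R1.fastq.gz", 100_000)
# take_first_reads("sample_R2.fastq.gz", "subset_R2.fastq.gz", 100_000)
```

Taking the head of the file is quick but not random; reads at the start of a run can be of lower quality, so a larger subset within the suggested range may give a steadier strand distribution.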
Yes, I will run it through Bismark as well. I wasn't quite sure before how read 1 could be aligned to CTOT/CTOB and read 2 to OT/OB. 😊 Thank you so much for all your answers! It is really helpful!
Glad I could help! I may have missed this in your previous response, but if read 1 is aligning to the CTOT/CTOB strand and read 2 to OT/OB, then you likely have a PBAT library. If you have a PBAT library and run with […] (see my first response for the explanation of why this is)
Hello! I'm using BISCUIT at the moment for my RRBS data. I have a question about when to use stranded and non-stranded alignment.

In the example you show in the mapping quality control, you did not use the -b 1 option at the beginning, and the result showed as below. After you applied -b 1, the result showed as below. In this case, can I understand that using -b 1 makes the alignment better, because more reads can be aligned? So in this case, can I say that my library is stranded? Also, in that case, shall I keep the -b 1 option?

On the contrary, if the number of aligned reads decreases after applying the -b 1 option, can I say that my library is non-stranded, and that I should run without the -b 1 option?

Thank you very much in advance!