Quality filtering of NanoPore reads #25

Open
mihuber opened this issue May 29, 2017 · 3 comments

@mihuber
Member

mihuber commented May 29, 2017

Disable length trimming based on quality for NanoPore reads. NanoPore quality scores do not follow PHRED scores.

@2b89169b-74c9-4653-8877-c3f21ec7674d runid=2564fad3a9906330fcc0c6c90ac0c8128b89d9ff read=213 ch=124 start_time=2017-05-24T11:01:53Z
TTGTTCGTTCAGTTGGGTGTTTTATGGTTTCGTTTTTCGTGCGCCGCTTCAACAGATGAAGATGTGACATCCATTATTAATAAAGTAGACAGGCATGCTGTGGTCAAAATGGCAGTTTGTGGTAATATAGCACCATATAGCAAGAATTCTAACAAAGTTTTAACTGTAATTTAACTAATCATTATCAAATCTCTGTAGCTGCTGAAGAGAAAAGAAAAGTGGTACACTTCTGCAAATGAGACAAATCCTGTTCATCGCCATCGACAGCATGGTTGAAAAAACTCTTTGATGGTTGTTGCATTTGGAGTGCTCTGTAGTTGACATTAAACCAGCCGTCAAAAGAATTGGCCTATTAAGCCAATGGCTTGATGTTGACTAAGAGTGTAGGAGCAGC
+
$$$')7',+*$&%(+2/3/2=,)''.+)*.,,0:830,+612033).95:8*1239BDCA2461=;<A?=B:268184?7>?@928+(63+.126>[email protected]=:4*5134;F5/3,+'5*-.926:3:B;C<9..238./53;;11)249*.1DEA1?2/),(--74-++'128;1E0-.27E3.++)'(,17876355<.*,'-880'/46.,6.7570/0B1D?8202--,*-0&01*)?1..EFFBE@3254+/+,+34666=8:0/;EFFGEBD5>=/++1+35@,=A35.68++2)(+'),)213.369;>4*/+<?>=856?7>B?9C7.(.*5/245?//7:=8544.,.23/2*+(.*@.+*2(0/&*+/*))+0-,-*)+
@ozagordi
Collaborator

What format do they follow then? What is a good suggestion to filter NanoPore reads, if they should be filtered at all?

@mihuber
Member Author

mihuber commented May 30, 2017

NanoPore quality scores:
The Phred quality score defines the quality of each base in the sequence, with values from 0 to 93. The score is calculated as: Quality score = -10 x log10(Pe), where Pe is the estimated error probability for each base. For example, an error rate of 1 in 100 gives a q-score of 20. The q-scores are then encoded in the Sanger format using ASCII, with values of 33 to 126. The quality is then shown as a single character per base.
https://community.nanoporetech.com/technical_documents/data-analysis/v/datd_5000_v1_reve_22aug2016/basecalled-fast5-files
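
To make the encoding above concrete, here is a minimal Python sketch of how a Sanger (ASCII offset 33) quality string could be decoded into q-scores and error probabilities. The function names are hypothetical and only for illustration. The read-level summary (averaging the error probabilities, then converting back to a q-score) is one possible convention and an assumption on my side, not something the linked document prescribes.

```python
import math
from typing import List

SANGER_OFFSET = 33  # ASCII offset of the Sanger encoding described above


def decode_quality(quality: str) -> List[int]:
    """Convert a Sanger-encoded quality string into per-base q-scores."""
    return [ord(char) - SANGER_OFFSET for char in quality]


def error_probability(q_score: float) -> float:
    """Invert Q = -10 * log10(Pe) to get the estimated error probability."""
    return 10 ** (-q_score / 10)


def mean_read_quality(quality: str) -> float:
    """Summarise a read as the q-score of its mean error probability.

    Assumption: this averaging convention is illustrative only.
    """
    if not quality:
        raise ValueError("quality string is empty")
    probabilities = [error_probability(q) for q in decode_quality(quality)]
    return -10 * math.log10(sum(probabilities) / len(probabilities))


# First six quality characters of the read posted above.
print(decode_quality("$$$')7"))
```

Averaging the probabilities rather than the q-scores keeps a few very poor bases from being masked by many good ones, which may matter when deciding whether a whole read should be kept.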

@ozagordi ozagordi added this to the 2.0 milestone Feb 13, 2018
@ozagordi
Collaborator

I would like to unassign myself from this. Is there a volunteer willing to take responsibility for this issue?

@mihuber mihuber assigned mihuber and MaryamZaheri and unassigned ozagordi Oct 26, 2018