-
Notifications
You must be signed in to change notification settings - Fork 130
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unable to trim primers from my interleaved fastq files #800
Comments
You say "merged", but also mention "interleaved", however, these are two different things, so I’m not sure which one you mean. Merging means to me that the two reads from a read pair are merged into a single-end read like this:
On the other hand, "interleaved" just refers to a way to store paired-end reads in a single file. So while you usually have two files, one with the R1 reads and one with the R2 reads, storing paired-end reads in an interleaved manner means that you only have one file where you store the two reads one after the other so that the order of reads in the file is R1, R2, R1, R2 and so on. Judging from the error message, I would guess that you have merged data, so the If it is merged data, you may want to trim your data with a linked adapter (see documentation) using I would also add option If this does not work, please remove the By the way, your Cutadapt version is quite old (2.8); I recommend using a more recent one. |
Ok I ran it with cutadapt version 4.9 and I do finally see some output. This is my code and output:. I did have to remove the ^ and $ symbols for it to work, otherwise all my files were empty. Code: Output:
Not sure if this makes sense, seems like only 7 sequences were removed although previously with the R tutorial it showed that at least 70 reads had a forward primer to it and 68 with reverse. |
Ok I also did the sanity check but the reverse primers are not removed.
So I will try this again because I think my reverse primer needs to be the reverse complement instead (since it's matching to RevComp). But At least the forward primers are removed! |
Minor update, yes changing the REV primer to its REV.Comp worked and now my sanity check output is:
|
Hello. I'm having a problem trying to trim my primers from my interleaved fastq files (R1 and R2 reads have already been merged into each sample file).
I am using Windows Linux Bash Systems, cutadapt (version 2.8) was installed with
sudo
; no issues there. I am running Python3 version 3.8.10.Example of a file name is:
CONTROL_H12_Rep1_ITS1-F_M13F-ITS2_M13R.fastq
. The ITS FWD and REV reads are together here.The primers I use:
Because I was following along with the DADA2 ITS pipeline in R, I had an output for my fastq files that the REV primer was detected as the complement of itself. Example of the output I got:
As such, I changed REV to be the reverse complement, so:
And then I also have the reverse complements for the primers:
And here is the code I used in terminal (because the cutadapt command within the dada2 R script wouldn't work in R for me):
But I get an error message that the reads are improperly paired:
cutadapt: error: Error in sequence file at unknown line: Reads are improperly paired. Name 'VH01754:2:AAFFFT2M5:1:1101:35538:7210 1:N:0:1' (first) does not match 'VH01754:2:AAFFFT2M5:1:1101:50365:12113 1:N:0:1' (second).
But I'm not sure how to fix this, because when I look at the names for the reads they appear only once and all end with "1:N:0:1" so it's almost like it had been converted into a single-end read file, but the sequencing was done as paired end. It's very confusing and I was wondering if you know how this may be remedied to ensure both primers are removed? I have tried treating this as single end but it is unclear if it is removing both the FWD and REV primer complements.
The text was updated successfully, but these errors were encountered: