Trimmomatic
Log in to the ACF, then use qsub to get an interactive session on a compute node.
Check that you are on a compute node.
uname -a
Go to your personal project folder in the class directory. You should have an e_coli folder with the following structure.
Make a new directory for this step to keep things clean and organized. Start from the analysis folder.
mkdir 2_trimmomatic
cd 2_trimmomatic
We'll need to symbolically link to our reads again:
ln -fs ../../../../e_coli_data/*gz .
ls -l
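Relative paths in ln -s are easy to get wrong, and a broken link still shows up in ls -l. The following is a minimal sketch of one way to check that every link resolves; the function name check_links is made up for illustration and is not part of the class materials.

```shell
# check_links FILE...
# Hypothetical helper: prints each symbolic link whose target does not exist.
# Returns 1 if any broken link was found, 0 otherwise.
check_links() {
    local f status=0
    for f in "$@"; do
        # -L is true for a symlink; -e follows the link and fails if the target is missing
        if [ -L "$f" ] && [ ! -e "$f" ]; then
            echo "broken link: $f"
            status=1
        fi
    done
    return "$status"
}

# Check the links we just made in the current directory.
check_links ./*gz || echo "Fix the relative path in the ln command and retry."
```

If nothing is printed, every link points at a real file.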
Software links:
While the ACF system has Trimmomatic available as a module, it's a bit out of date. So I put an up-to-date copy for us to use in our class project directory. We'll need to use the full path to call this program.
Now let's run Trimmomatic for both adapter removal and quality trimming on our first pair of reads:
java -jar /lustre/haven/proj/UTK0138/software/software/Trimmomatic-0.39/trimmomatic-0.39.jar PE \
../../raw_data/DRR021342_1.fastq \
../../raw_data/DRR021342_2.fastq \
DRR021342_1.trimmed.paired.fastq \
DRR021342_1.trimmed.unpaired.fastq \
DRR021342_2.trimmed.paired.fastq \
DRR021342_2.trimmed.unpaired.fastq \
ILLUMINACLIP:NexteraPE-PE.fa:2:30:10:2:True SLIDINGWINDOW:4:15 MINLEN:36
Let's break that command down:
trimmomatic PE # Paired end mode
../../raw_data/DRR021342_1.fastq # first read
../../raw_data/DRR021342_2.fastq # second read
DRR021342_1.trimmed.paired.fastq # output - first reads, part of pair, trimmed
DRR021342_1.trimmed.unpaired.fastq # output - first reads, not paired, trimmed
DRR021342_2.trimmed.paired.fastq # output - second reads, part of pair, trimmed
DRR021342_2.trimmed.unpaired.fastq # output - second reads, not paired, trimmed
ILLUMINACLIP:NexteraPE-PE.fa:2:30:10:2:True # adapter file, then seed mismatches (2), palindrome clip threshold (30), simple clip threshold (10), minimum adapter length (2), keepBothReads (True; this field is read as a boolean, so the literal word keepBothReads would be treated as false)
LEADING:3 # optional, not used in the command above - trim bases at the beginning of the read if they are less than quality value of 3
TRAILING:3 # optional, not used in the command above - trim bases at the end of the read if they are less than quality value of 3
SLIDINGWINDOW:4:15 # scan from the start of the read and cut once the average quality over a 4 base window is less than 15
MINLEN:36 # discard reads if they are shorter than 36 bases after trimming
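To make the SLIDINGWINDOW idea concrete, here is a simplified sketch that slides a window across a list of quality scores and reports where the average first drops below the threshold. The function name first_failing_window and the example scores are made up for illustration. This shows the concept only; Trimmomatic's actual implementation may place the final cut slightly differently.

```shell
# first_failing_window WINDOW THRESHOLD q1 q2 q3 ...
# Hypothetical helper: prints the 1-based start position of the first window
# whose mean quality is below THRESHOLD, or 0 if no window fails.
first_failing_window() {
    local window="$1" threshold="$2"
    shift 2
    echo "$@" | awk -v w="$window" -v t="$threshold" '{
        for (i = 1; i + w - 1 <= NF; i++) {
            sum = 0
            for (j = i; j < i + w; j++) sum += $j
            if (sum / w < t) { print i; exit }
        }
        print 0
    }'
}

# Eight made-up base qualities that decay toward the end of the read.
# The window starting at base 5 (20 10 8 2, mean 10) is the first below 15.
first_failing_window 4 15 30 30 30 30 20 10 8 2
```

With these example scores the function prints 5, so a 4:15 window would start trimming around the fifth base.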
So what did Trimmomatic do? What are some ways you can tell?
- count the sequences before and after
- look at the file size before and after
- rerun fastQC and look at the detailed report (this is your homework!)
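The first two checks can be sketched in a few lines of shell. This assumes uncompressed FASTQ files with exactly four lines per read; the function name count_reads is made up for illustration, and the file names are the ones used in the command above.

```shell
# count_reads FILE
# Hypothetical helper: prints the number of reads in an uncompressed FASTQ file,
# assuming exactly 4 lines per record.
count_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Compare read counts and sizes (in bytes) before and after trimming.
# Files that do not exist yet are skipped.
for f in ../../raw_data/DRR021342_1.fastq \
         DRR021342_1.trimmed.paired.fastq \
         DRR021342_1.trimmed.unpaired.fastq; do
    if [ -f "$f" ]; then
        printf '%s\t%s reads\t%s bytes\n' "$f" "$(count_reads "$f")" "$(wc -c < "$f")"
    fi
done
```

The paired and unpaired counts for read 1 should add up to no more than the raw count; the difference is the reads that were dropped entirely.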