Mapper: Initial mappings (FASTQ -> BAM) #3

Open
eweitz opened this issue Aug 3, 2015 · 5 comments

Comments

@eweitz
Collaborator

eweitz commented Aug 3, 2015

In the Mapper module, support taking FASTQ input and converting to BAM output.

@chris-owen, @dauss75, this is one of the issues we discussed.

@chris-owen
Collaborator

Segun and I built this pipeline with BWA-MEM and SAMtools. It maps paired-end (PE) FASTQ reads to the reference and produces a sorted BAM file with duplicates removed.

bwa mem <reference_genome_fasta> <left_reads.fastq> <right_reads.fastq> | samtools view -Shu - | samtools sort - - | samtools rmdup -S - <PE_Mapped_DupRm.bam>

@dauss75
Contributor

dauss75 commented Aug 4, 2015

Snakemake is installed, and I am currently putting together a test script that works with bwa and freebayes.

@CarlosBorroto

@chris-owen this looks really good. The only thing is that, as far as I can tell, samtools rmdup only removes reads that are already marked as duplicates, which means you need to mark duplicates first. The most common tool for this is Picard's MarkDuplicates. More recently, though, I've seen samblaster used as a better option because it is pipe-friendly, which Picard is not.

Samblaster recommends doing something like:
bwa mem <idxbase> samp.r1.fq samp.r2.fq | samblaster | samtools view -Sb - > samp.out.bam

You should be able to integrate this into the command above pretty easily.
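One way the integration might look is sketched below. It is an illustration rather than the pipeline the team settled on: `map_pe` is a hypothetical helper name, the reference is assumed to be indexed already with `bwa index`, and the `samtools sort -o <out> -` form assumes samtools 1.3 or newer (older releases use a different `sort` syntax). Setting `DRY_RUN=1` prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: map paired-end FASTQ to a coordinate-sorted BAM with duplicates marked.
# Assumptions: bwa, samblaster and samtools (>= 1.3) are on PATH, and the
# reference has already been indexed with `bwa index`.
# map_pe is a hypothetical helper name, not part of any of these tools.

map_pe() {
    ref=$1   # reference genome FASTA (bwa index prefix)
    r1=$2    # left reads (FASTQ)
    r2=$3    # right reads (FASTQ)
    out=$4   # output BAM path

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the pipeline instead of executing it.
        echo "bwa mem $ref $r1 $r2 | samblaster | samtools sort -o $out -"
        return 0
    fi

    # bwa mem emits SAM grouped by read name, which is what samblaster expects;
    # samblaster marks duplicates; samtools sort then writes a sorted BAM.
    bwa mem "$ref" "$r1" "$r2" \
        | samblaster \
        | samtools sort -o "$out" -
}
```

For example, `map_pe ref.fa left_reads.fastq right_reads.fastq PE_Mapped_DupMarked.bam` would run the three stages in one pipe. If duplicates should be removed rather than only marked, samblaster documents a `--removeDups` option that could be added to the middle stage.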

@CarlosBorroto

Oh, and depending on the caller, you might not need to remove duplicates at all, just mark them. Callers these days are smart enough to ignore duplicate reads that have been marked as such.
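For context, "marked" here refers to the duplicate bit in the SAM FLAG field (0x400, decimal 1024), which is what a caller can test in order to skip a read. A minimal sketch of that check follows; `is_duplicate_flag` is a hypothetical helper name used only for illustration.

```shell
#!/bin/sh
# Sketch: test whether a SAM FLAG value has the duplicate bit (0x400 = 1024) set.
# is_duplicate_flag is a hypothetical helper, not a samtools command.
# Exit status is 0 when the read is marked as a duplicate, 1 otherwise.

is_duplicate_flag() {
    [ $(( $1 & 1024 )) -ne 0 ]
}
```

On a real BAM file, `samtools view -c -f 1024 samp.out.bam` should count the reads carrying this bit, and `-F 1024` should exclude them, assuming the file was produced by a tool that sets the flag.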

@chris-owen
Collaborator

Segun and I just met with Carlos and he confirmed the pipe commands above are sufficient.


4 participants