Extract Methylation and Mutation Information directly from the vcf file of GemBS? #58

Open
Chao-Guo-hub opened this issue Nov 29, 2024 · 1 comment

Comments

@Chao-Guo-hub

The latest ENCODE WGBS pipeline uses gemBS as its upstream processing tool. Can I use biscuit vcf2bed to extract methylation and mutation information directly from the VCF file it produces, or do I need to start from the BAM file instead?

@jamorrison

Hi @Chao-Guo-hub,

biscuit vcf2bed relies on several FORMAT and INFO tags to extract methylation and mutation information from a VCF file. It's possible the gemBS VCF includes these (see below for the specific tags), but I'm guessing it probably doesn't. If that's the case, you will need to start with the gemBS BAM file, run it through a duplicate marking tool (like dupsifter, since the gemBS BAM hasn't been duplicate marked), and then run biscuit pileup and biscuit vcf2bed.

To extract methylation information, vcf2bed needs the BT and CV FORMAT tags and the CX and N5 INFO tags. For mutation information, it needs the GT, SP, AC, and AF1 FORMAT tags.
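
One quick way to see whether a given VCF even declares those tags is to scan its header. The sketch below is only a rough check, not part of BISCUIT or gemBS: the helper names `missing_tags` and `read_header` are made up for illustration, and it assumes a plain-text or gzip/bgzip-compressed VCF whose header uses the standard `##FORMAT=<ID=...>` and `##INFO=<ID=...>` lines. A tag being declared does not guarantee it carries the values vcf2bed expects, so treat an empty result as "worth trying" rather than "compatible".

```python
import gzip
import re

# Tags listed in the reply above as needed by biscuit vcf2bed.
REQUIRED = {
    "FORMAT": {"BT", "CV", "GT", "SP", "AC", "AF1"},
    "INFO": {"CX", "N5"},
}

# Matches standard VCF meta-information lines such as ##FORMAT=<ID=BT,...>
_HEADER_RE = re.compile(r"^##(FORMAT|INFO)=<ID=([^,>]+)")


def missing_tags(header_lines):
    """Return the required tags that the header does not declare (hypothetical helper)."""
    declared = {"FORMAT": set(), "INFO": set()}
    for line in header_lines:
        if not line.startswith("##"):
            break  # reached the #CHROM line or the first record
        match = _HEADER_RE.match(line)
        if match:
            declared[match.group(1)].add(match.group(2))
    return {kind: sorted(REQUIRED[kind] - declared[kind]) for kind in REQUIRED}


def read_header(path):
    """Read the ## meta-information lines from a plain or gzip/bgzip VCF (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    lines = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break
            lines.append(line.rstrip("\n"))
    return lines
```

For example, `missing_tags(read_header("sample.vcf.gz"))` (the file name is a placeholder) returns a dictionary of the FORMAT and INFO tags that are absent. If either list is non-empty, starting from the BAM file as described above is the safer route.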
