Error importing converted vcf #35

Open
z-j-r opened this issue Oct 4, 2019 · 7 comments

z-j-r commented Oct 4, 2019

I successfully converted my VCF file to the tassel format with the supplied Python helper script (thanks!), but there seems to be an issue when running Gfull <- calcG().

Error in alleles[iind, ] <- matrix(as.numeric(unlist(strsplit(genosin[[iind + :
number of items to replace is not a multiple of replacement length

Would you have any solutions for this?

Thanks!

doddsk (Contributor) commented Oct 6, 2019

Thanks for the alert on this issue.
The error occurs when reading the (tassel format) data file (not with calcG). Are you able to share a few lines of your data file to help us sort this issue out?
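
For anyone hitting the same error, a quick way to narrow it down is to scan the converted file for cells that are not a pair of counts. The sketch below is only an illustration: it assumes the file is tab-delimited with a header row, two leading columns (chromosome and position), and one "ref,alt" cell per sample. The function name `find_bad_ra_cells` and the sample data are hypothetical and not part of KGD.

```python
import io


def find_bad_ra_cells(lines, n_fixed=2):
    """Yield (line_no, sample_name, cell) for cells that are not 'int,int'.

    Assumes a tab-delimited file whose first row is a header and whose
    first `n_fixed` columns are not sample columns (an assumption here).
    """
    header = None
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        if header is None:
            header = fields
            continue
        for name, cell in zip(header[n_fixed:], fields[n_fixed:]):
            parts = cell.split(",")
            if len(parts) != 2 or not all(p.isdigit() for p in parts):
                yield line_no, name, cell


# Made-up example data, not taken from the attached file.
sample = io.StringIO(
    "CHROM\tPOS\tind1\tind2\n"
    "chr1\t101\t3,0\t0,2\n"
    "chr1\t202\t5\t1,1,0\n"
)
bad = list(find_bad_ra_cells(sample))
for line_no, name, cell in bad:
    print(f"line {line_no}, sample {name}: {cell!r}")
```

To check a real file, pass an open file handle instead of the in-memory sample, for example `find_bad_ra_cells(open("f1v2.vcf.ra.txt"))`.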

z-j-r (Author) commented Oct 7, 2019

You are correct: the error appears when source("GBS-Chip-Gmatrix.R") reads the tassel format data file, not when running calcG(). I've attached a portion of my data file!

f1v2.vcf.ra.txt

r-ashby commented Oct 9, 2019

Hi, it looks like there is an issue with your RA file. Are you able to please share your VCF file and the pipeline you used to generate the VCF file?

z-j-r closed this as completed Oct 10, 2019
z-j-r reopened this Oct 10, 2019

z-j-r (Author) commented Oct 10, 2019

Apologies, I mistakenly closed the issue...

z-j-r (Author) commented Oct 10, 2019

I used the GBS-SNP-CROP pipeline and have attached my VCF file.
f1.zip

doddsk (Contributor) commented Oct 16, 2019

Thanks for sending the VCF. We had not realised that the AD field is used in at least two different ways. The script was written to expect a pair of read depths (ref,alt) for each genotype in this field, which is how GATK and samtools use this label. We are looking at how much effort might be required to convert this type of VCF.
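
To see which convention a given VCF follows, one option is to count how many comma-separated values each sample's AD entry holds. The following is a minimal sketch, not the conversion script itself: `ad_lengths` is a hypothetical helper, the records are invented for illustration, and the thread does not state what the AD values in the attached file look like.

```python
def ad_lengths(vcf_lines):
    """Return {number_of_AD_values: count} over all sample entries."""
    counts = {}
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip meta lines and the #CHROM header
        fields = line.rstrip("\n").split("\t")
        fmt_keys = fields[8].split(":")  # FORMAT is the 9th column
        if "AD" not in fmt_keys:
            continue
        ad_idx = fmt_keys.index("AD")
        for sample in fields[9:]:
            values = sample.split(":")
            # Trailing sub-fields may be dropped; missing values are "."
            if ad_idx >= len(values) or values[ad_idx] in (".", ""):
                continue
            n = len(values[ad_idx].split(","))
            counts[n] = counts.get(n, 0) + 1
    return counts


# Invented records: the first has (ref,alt) pairs, the second does not.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\n",
    "chr1\t101\t.\tA\tG\t.\tPASS\t.\tGT:AD\t0/1:4,3\t0/0:6,0\n",
    "chr1\t202\t.\tC\tT\t.\tPASS\t.\tGT:AD\t0/1:7\t./.:.\n",
]
print(ad_lengths(example))
```

If every non-missing entry at biallelic sites falls under the key 2, the file matches the (ref,alt) layout described above; any other key points to a different use of AD.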

rbrauning (Contributor) commented

Hi, I've tried reproducing your problem (malformed AD field) in v3.0, v4.0 and v4.1 of GBS-SNP-CROP. The AD field looks fine in all versions and properly reports pairs of read depths (ref, alt), unlike your VCF file. Which version of GBS-SNP-CROP did you use? I recommend script 8 of v4.0 to produce VCF files, as the other versions have some other problems unrelated to AD.
Could it be that you further processed the output of GBS-SNP-CROP with some other tools? I noticed some bcftools information in your header.
A v4.0 VCF file, or GSC.GenoMatrix.txt together with the barcode file, would have all the information you need for KGD.
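
One way to check whether a VCF passed through another tool is to list the meta-information lines in its header that mention that tool. This is a rough sketch under stated assumptions: `header_lines_mentioning` is a hypothetical helper, and the header lines in the example are invented to resemble the kind of provenance lines tools add, not copied from the attached file.

```python
def header_lines_mentioning(vcf_lines, tool="bcftools"):
    """Return '##' header lines that mention `tool` (case-insensitive)."""
    hits = []
    for line in vcf_lines:
        if not line.startswith("##"):
            break  # the #CHROM line ends the meta-information header
        if tool.lower() in line.lower():
            hits.append(line.rstrip("\n"))
    return hits


# Invented header for illustration only.
header = [
    "##fileformat=VCFv4.2\n",
    "##FORMAT=<ID=AD,Number=R,Type=Integer,Description=\"Allelic depths\">\n",
    "##bcftools_viewVersion=1.9+htslib-1.9\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\n",
]
found = header_lines_mentioning(header)
print(found)
```

An empty result does not prove the file is untouched, since not every tool records itself in the header.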
