Do SNP VCF files need an L (SNP size) parameter? #46
Comments
Hi chichi,
RAiSD can handle this file size (I have tested it with input files of up to
65GB).
The error you get means that not all SNPs in the same chromosome have the
same length. RAiSD reads the expected SNP length from the very first SNP in
each chromosome.
Best regards,
Nikos A.
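The check described in this reply can be sketched in a few lines of Python. This is only an illustration, not RAiSD's own code: it assumes that "SNP length" means the total number of alleles across all sample genotype (GT) fields in a VCF record, and that GT is the first subfield of each sample column. The function names are hypothetical.

```python
def count_alleles(sample_field):
    """Count alleles in one sample column, e.g. '0/1:35:99' -> 2, '1' -> 1."""
    gt = sample_field.split(":", 1)[0]  # assumption: GT is the first subfield
    return len(gt.replace("|", "/").split("/"))


def find_inconsistent_records(lines):
    """Yield (chrom, pos, expected, found) for every record whose total
    allele count differs from the first record of the same chromosome."""
    expected = {}  # chromosome name -> allele count of its first record
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        size = sum(count_alleles(f) for f in fields[9:])
        if chrom not in expected:
            expected[chrom] = size
        elif size != expected[chrom]:
            yield chrom, pos, expected[chrom], size


def check_vcf(path):
    """Scan an uncompressed VCF line by line and list the offending records."""
    with open(path) as handle:
        return list(find_inconsistent_records(handle))
```

Calling `check_vcf` on the VCF passed to `-I` would list the positions where the count changes, if the assumption above matches what RAiSD measures. Reading line by line keeps memory use low even for a multi-gigabyte file.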
On Thu, Dec 14, 2023 at 4:20 AM chichi ***@***.***> wrote:
*Hi Alachins,*
I ran into a problem while working with my SNP VCF data.
The question
When I try to use RAiSD to detect sweeps and positively selected sites with a population SNP VCF file produced by the GATK pipeline, it gives the following report. I have read the README file carefully, but I still cannot resolve it. Would you please help me figure out what the problem is? Many thanks.
Some information
The VCF file is fairly large, about 6 GB uncompressed.
It contains 111 samples.
It contains 15 chromosomes (starting with Chr01) and 2 contigs ( congtig01 ).
My guess
Is this file too large to handle? The hint in the error message is too confusing for me, so I am leaving this comment for you. I am still working on it; thanks for your response.
*Best,*
*chichi*
The output information
RAiSD, Raised Accuracy in Sweep Detection
This is version 2.9 (released in August 2020)
Copyright (C) 2017, and GNU GPL'd, by Nikolaos Alachiotis and Pavlos Pavlidis
Contact n.alachiotis/pavlidisp at gmail.com
Command: /home/chichi/softwares/RAiSD/RAiSD -n test -I ../data/merge3_filter_variants_snp.vcf -f
Samples: 111
Format: vcf
var-exp: 1.0
sfs-exp: 1.0
ld-exp: 1.0
A pattern structure of 349525 patterns (max. capacity) and approx. 16 MB memory footprint has been created.
The pattern structure has been resized to 209715 patterns (max. capacity) and approx. 16 MB memory footprint.
ERROR: Wrong SNP size (L) found!
--
Nikolaos Alachiotis
|
Hi Alachins,
For the SNPs, each record should look like the following one, with one reference allele and one alternate allele for all samples.
The files are extra output from the GATK calling. My idea for the next steps is to keep only these ideal SNPs in the VCF file for the RAiSD analysis. Maybe that will work. |
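The filtering idea in this comment, keeping only records with one reference allele and one alternate allele, can be sketched as follows. This is a rough illustration under assumptions: it treats a record as an "ideal" SNP when both REF and ALT are a single A/C/G/T base, and the function names are hypothetical. The thread does not confirm that this filtering resolves the `Wrong SNP size (L)` error.

```python
BASES = {"A", "C", "G", "T"}


def is_biallelic_snp(record_line):
    """True if REF and ALT are each exactly one nucleotide.

    Multi-allelic sites (ALT like 'G,T'), indels (REF like 'AT'),
    and symbolic alleles (ALT like '*' or '.') are rejected.
    """
    fields = record_line.rstrip("\n").split("\t")
    ref, alt = fields[3].upper(), fields[4].upper()
    return ref in BASES and alt in BASES


def filter_biallelic_snps(lines):
    """Yield header lines unchanged and only the biallelic SNP records."""
    for line in lines:
        if line.startswith("#"):
            yield line  # keep the VCF header intact
        elif line.strip() and is_biallelic_snp(line):
            yield line


def write_filtered(in_path, out_path):
    """Stream an uncompressed VCF through the filter into a new file."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_biallelic_snps(src))
```

Because the filter streams one line at a time, it should cope with a file of the size mentioned in the issue. Dedicated tools can apply the same restriction; this sketch only makes the selection rule explicit.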