Check PRS scoring files (for example, files from the PGS Catalog) for errors with the script preprocess pgs/preprocess_prs_file.py; see --help for a description of its arguments. An example scoring file, 'PGS001833.txt', is in data/.
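To give an idea of what such a check can involve, here is a minimal sketch in Python. It is not the logic of preprocess_prs_file.py; the function name `check_scoring_rows` is hypothetical, and the column names (`rsID`, `effect_allele`, `effect_weight`) are assumed to follow the usual PGS Catalog scoring file layout, where metadata lines start with `#` and the data are tab-separated.

```python
# Assumed column names; adjust them to match your scoring file.
REQUIRED_COLUMNS = ("rsID", "effect_allele", "effect_weight")


def check_scoring_rows(lines):
    """Return a list of (line_number, message) problems found in a scoring file."""
    # Drop '#' metadata and blank lines, but keep the original line numbers.
    data = [(n, line) for n, line in enumerate(lines, start=1)
            if line.strip() and not line.startswith("#")]
    if not data:
        return [(0, "no header row found")]

    header_no, header_line = data[0]
    header = header_line.rstrip("\n").split("\t")
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        return [(header_no, "missing columns: " + ", ".join(missing))]

    problems = []
    seen = set()
    for n, line in data[1:]:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(header):
            problems.append((n, "wrong number of fields"))
            continue
        row = dict(zip(header, fields))
        rsid = row["rsID"]
        if not rsid.startswith("rs"):
            problems.append((n, f"unexpected variant ID: {rsid!r}"))
        elif rsid in seen:
            problems.append((n, f"duplicate variant ID: {rsid}"))
        seen.add(rsid)
        if row["effect_allele"] == "":
            problems.append((n, "empty effect allele"))
        try:
            float(row["effect_weight"])
        except ValueError:
            problems.append(
                (n, f"non-numeric effect weight: {row['effect_weight']!r}"))
    return problems


# Made-up rows for illustration only (not taken from PGS001833.txt).
example = [
    "#pgs_id=EXAMPLE\n",
    "rsID\teffect_allele\teffect_weight\n",
    "rs1\tA\t0.12\n",
    "rs1\tG\t0.05\n",
    "rs2\tT\tNA\n",
]
problems = check_scoring_rows(example)
for line_number, message in problems:
    print(line_number, message)
```

For a real file, pass an open file handle instead of the list, for example `check_scoring_rows(open(path))`.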
Check that your VCF file contains SNP IDs (rsIDs). To add annotations to your VCF:
- VCF files can be annotated with oakvar
- VCF files can be annotated with bcftools; the annotation source can be selected from the NCBI Human Variation Sets in VCF Format (ClinVar, dbSNP)
See the launch example in preprocess/annotate_vcf.sh
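A quick way to see whether a VCF needs annotation is to count how many records carry an rsID. The sketch below is an illustration only and is not part of this repository; the function name `count_rsids` is hypothetical. It relies on the VCF convention that the ID is the third tab-separated column, with `.` for a missing ID and `;` between multiple IDs.

```python
def count_rsids(vcf_lines):
    """Count variant records with and without an rsID in the VCF ID column."""
    with_rsid = 0
    without_rsid = 0
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip meta-information, the column header, and blank lines
        fields = line.rstrip("\n").split("\t")
        # ID is the third column; treat truncated lines as having no ID.
        ids = fields[2].split(";") if len(fields) > 2 else []
        if any(identifier.startswith("rs") for identifier in ids):
            with_rsid += 1
        else:
            without_rsid += 1
    return with_rsid, without_rsid


# Made-up records for illustration only.
vcf_example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\trs123\tA\tG\t.\tPASS\t.\n",
    "1\t2000\t.\tC\tT\t.\tPASS\t.\n",
]
annotated, unannotated = count_rsids(vcf_example)
print(f"{annotated} records with rsID, {unannotated} without")
```

For a compressed file, `gzip.open(path, "rt")` from the standard library yields lines that can be passed to the same function.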
Check the header of the script calc_and_visualize/calc_prs.py: the file paths to the VCF and the PRS scoring file must be specified. To use the optional lists of SNPs to enable or disable, set the paths to those lists as well.
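For orientation, a polygenic risk score is commonly computed as the sum, over the scored SNPs, of the effect weight times the number of copies of the effect allele in the genotype. The sketch below illustrates that formula only; it is not the code of calc_prs.py, and the function names and data layout are hypothetical. It also assumes that SNPs absent from the VCF or with a missing genotype are simply skipped, which may differ from how the script treats them.

```python
def effect_allele_dosage(genotype, ref, alt, effect_allele):
    """Count copies of the effect allele in a GT string such as '0/1' or '1|1'.

    Returns None when any allele in the genotype is missing ('.').
    """
    alleles = [ref] + alt.split(",")  # GT index 0 is REF, 1.. are ALT alleles
    dosage = 0
    for index in genotype.replace("|", "/").split("/"):
        if index == ".":
            return None
        if alleles[int(index)] == effect_allele:
            dosage += 1
    return dosage


def polygenic_score(weights, genotypes):
    """Sum weight * dosage over SNPs present in both inputs.

    weights:   {rsid: (effect_allele, effect_weight)}
    genotypes: {rsid: (ref, alt, gt_string)}
    Returns (score, number_of_snps_used).
    """
    score = 0.0
    used = 0
    for rsid, (effect_allele, weight) in weights.items():
        if rsid not in genotypes:
            continue  # assumption: SNPs absent from the VCF are skipped
        ref, alt, gt = genotypes[rsid]
        dosage = effect_allele_dosage(gt, ref, alt, effect_allele)
        if dosage is None:
            continue  # assumption: missing genotypes are skipped
        score += weight * dosage
        used += 1
    return score, used


# Made-up values for illustration only.
weights = {"rs1": ("G", 0.5), "rs2": ("T", -0.125), "rs3": ("A", 1.0)}
genotypes = {"rs1": ("A", "G", "0/1"), "rs2": ("C", "T", "1|1")}
score, used = polygenic_score(weights, genotypes)
print(score, used)  # 0.5*1 + (-0.125)*2 = 0.25, using 2 of 3 SNPs
```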
To draw a bullet plot, fill in the data (the custom_snp_number and overall_snp_number variables) in the header of calc_and_visualize/draw_bullet_plot.py and run it.
We can also perform an additional check with plink.
- Get plink binary files by running the script preprocess/plink convert/get_plink_fileset_bin.sh. If you run into trouble getting the binary fileset, some intermediate steps may help: see plink/get_plink_fileset_pgen.sh and plink/pgen_to_bed.sh, or try to fix the initial .vcf file.
- Run calc_and_visualize/run_plink_prs.sh
- Get plink pfiles by running the script preprocess/plink convert/get_plink_fileset_pgen.sh.
- Run calc_and_visualize/run_plink2_prs.sh
plink/run_plink_prs.sh