STRchive

("ess-tee-archive")

Short Tandem Repeat disease loci resource

⭐️ View the data at strchive.org ⭐️

STRchive by Harriet Dashnow is licensed under CC BY 4.0

Contributors

  • Harriet Dashnow
  • Laurel Hiatt
  • Akshay Avvaru
  • Vincent Rubinetti

Contributing

If you notice an error or omission, or have an update to suggest, feel free to leave a comment or create a pull request.

To make a change to the STRchive data itself, please edit data/STRchive-loci.json

Then run the "linting" script and fix any errors:
python scripts/check-loci.py data/STRchive-loci.json
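
If a hand edit leaves the JSON malformed (for example a trailing comma), it can help to locate the syntax error before running the linter. The snippet below is only a sketch of such a pre-check using Python's standard `json` module. The function name `json_syntax_error` is hypothetical and not part of this repository, and the snippet makes no assumptions about what `check-loci.py` itself validates.

```python
import json
import sys
from pathlib import Path


def json_syntax_error(path):
    """Return None if the file parses as JSON, else a 'path:line:col: message' string."""
    try:
        json.loads(Path(path).read_text(encoding="utf-8"))
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions reported by the json module
        return f"{path}:{err.lineno}:{err.colno}: {err.msg}"
    return None


# Hypothetical usage: python this_snippet.py data/STRchive-loci.json
if __name__ == "__main__" and len(sys.argv) > 1:
    problem = json_syntax_error(sys.argv[1])
    print(problem or "JSON syntax OK")
```

This only checks that the file is well-formed JSON; the linting script above remains the actual check on the content of the loci.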

Development

Run all scripts to update STRchive

From the root directory, run:
snakemake

Or, to skip the retrieve and manubot stages, which speeds things up substantially:
snakemake --config stages="skip-refs"

Update TRGT genotyping catalogs

python scripts/make-catalog.py -g hg38 -f TRGT data/STRchive-loci.json data/STRchive-disease-loci.hg38.TRGT.bed
python scripts/make-catalog.py -g T2T -f TRGT data/STRchive-loci.json data/STRchive-disease-loci.T2T-chm13.TRGT.bed
python scripts/make-catalog.py -g hg19 -f TRGT data/STRchive-loci.json data/STRchive-disease-loci.hg19.TRGT.bed

Update extended BED files

python scripts/make-catalog.py -f bed -g hg38 data/STRchive-loci.json data/STRchive-disease-loci.hg38.bed
python scripts/make-catalog.py -f bed -g T2T data/STRchive-loci.json data/STRchive-disease-loci.T2T-chm13.bed
python scripts/make-catalog.py -f bed -g hg19 data/STRchive-loci.json data/STRchive-disease-loci.hg19.bed
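
The six make-catalog.py commands in the two sections above follow one pattern: a genome build, a format, and an output file name derived from both. As an illustration only, the pattern could be written out in Python as below. The names `GENOME_LABELS`, `FORMAT_SUFFIXES` and `catalog_commands` are hypothetical and not part of this repository. The sketch puts `-g` before `-f` in every command, which assumes the script accepts the options in either order, as the commands above suggest.

```python
# Genome build passed to -g, mapped to the label used in the output file name.
GENOME_LABELS = {"hg38": "hg38", "T2T": "T2T-chm13", "hg19": "hg19"}
# Format passed to -f, mapped to the output file suffix.
FORMAT_SUFFIXES = {"TRGT": ".TRGT.bed", "bed": ".bed"}

LOCI_JSON = "data/STRchive-loci.json"


def catalog_commands(fmt):
    """Build one make-catalog.py argument list per genome build."""
    suffix = FORMAT_SUFFIXES[fmt]
    return [
        ["python", "scripts/make-catalog.py", "-g", genome, "-f", fmt,
         LOCI_JSON, f"data/STRchive-disease-loci.{label}{suffix}"]
        for genome, label in GENOME_LABELS.items()
    ]


if __name__ == "__main__":
    for fmt in FORMAT_SUFFIXES:
        for cmd in catalog_commands(fmt):
            # Printing only; pass cmd to subprocess.run(cmd, check=True) to execute.
            print(" ".join(cmd))
```

Run from the root directory, this prints the same six commands rather than executing them, so the output can be compared against the lists above before anything is regenerated.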

Install dependencies

New install:

conda env create --file scripts/environment.yml
conda activate strchive

Update existing installation:

conda activate strchive
conda env update --file scripts/environment.yml --prune
conda activate strchive

Note: biomaRt isn't playing nicely with conda, so it is installed within the R script where it is used.