TimemaTranscriptome

This repository was created for scripts counting synonimous and non-synonimous SNPs, furher analysis will be probably added as well.

wraper scripts

snp_parser.sh - calls degeneratedSites.py and snp2codonVariants.py on data in the path give by first argument and caputure outputs at the same place, these folders with input and output will be linked to ./data directory

computing scripts (called for analyses)

./scripts/degeneratedSites.py - reads a genetic code and transcriptome, printing a list of seq_name f1 f2 f3 f4, where fn are numbers of n-fold degenerated sites

./scripts/snp2codonVariants.py - reads a snp table and transcriptome, returning a list of seq_name ref_codon alt_codon transcript_length

./scripts/SynSubsFreq.R - counts the number of synonimous and nonsynonimous snp per species; this script whould be modified if the s/n snps should be computed per transcript

precomputing scripts (called once to create a file)

./scripts/genetic_code_parser.r - reads a "./genetic_code.csv", saving to "genetic_code_stats.csv", codon, aa, number of n-folded sites per codon, possible synonimous changes of 1st, 2nd and 3rd position, this might be useful in the case that the non-standart genetic code will be used.

data

./data/genetic_code.csv - general genetic code downloaded from http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/index.cgi?chapter=tgencodes#SG1 and parsed into three column format; for another genetic code, just reformat it it the same way as this genetic_code is, replace it and delete _stats.csv file; stats will be computed automatically

./data/genetic_code_stats.csv - genetic code with computed stats: f1-f4 n-fold degenerates sites per codon, and number of theoretical synonimous substitutions per first, second or third position per codon.

INPUT

two folders are expected in the same direcotry:

"filtered_snps" containing the snp ouput of varscan with naming "species.filtered.snps" (eg. Tce.filtered.snps)
"orfs" containing the CDS fasta files from which snps where mined. Naming should be "species.cds.fa" (eg. Tce.cds.fa)

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
data		data
scripts		scripts
test_data		test_data
README.md		README.md
script_test.py		script_test.py
snp_parser.sh		snp_parser.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TimemaTranscriptome

wraper scripts

computing scripts (called for analyses)

precomputing scripts (called once to create a file)

data

INPUT

About

Releases

Packages

Contributors 2

Languages

AsexGenomeEvol/pNpS

Folders and files

Latest commit

History

Repository files navigation

TimemaTranscriptome

wraper scripts

computing scripts (called for analyses)

precomputing scripts (called once to create a file)

data

INPUT

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages