Code for the Brassica oleracea/rapa/napus genomic comparison.
Most of these folders are RStudio projects and can be opened directly in RStudio.
This folder collects small scripts and documentation on how I ran the assembly and annotation steps for the pangenome.
This folder contains the presence/absence variation (PAV) tables and some code to filter them to a single species and to remove genes that are present in zero individuals.
This folder contains scripts to compare gene content between the pangenomes.
R-code for pangenome size modeling
R-code for the Venn-diagram plots
R-code for the PCA plots and animations
Code to compare the B. rapa FPSc individuals with the rest.
R-code for the stacked barplots for the R-genes
Code to visualise differences in networked genes
PAV modeling scripts
Code and RStudio project that checks which GO terms are enriched
More PAV visualisation scripts
Just some data with coverages per individual
R-code for the variable genes vs core genes
Code to find mutually incompatible genes
R-code to replot the SHAP plots with defaults and values I like: one line per chromosome, a nicer grid, and so on.
A cleaner rewrite of older code that runs chi-square tests on a set of comparison tables of variable/core genes in protein-protein interaction networks.
R-code that pulls out coverage and other stats from SRA for the individuals I used.
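
As a rough illustration of the PAV filtering and the core/variable split mentioned in the folder descriptions above, here is a minimal Python sketch. It is not the code from this repository. The table layout, gene names, and individual names are invented for the example, and "core" is assumed to mean "present in every individual", with everything else counted as "variable".

```python
# Toy PAV table: gene -> {individual: 1 if present, 0 if absent}.
# All gene and individual names here are invented for illustration.
PAV = {
    "geneA": {"rapa_1": 1, "rapa_2": 1, "oleracea_1": 1},
    "geneB": {"rapa_1": 1, "rapa_2": 0, "oleracea_1": 1},
    "geneC": {"rapa_1": 0, "rapa_2": 0, "oleracea_1": 1},
}


def filter_pav(pav, individuals):
    """Keep only the given individuals, then drop genes present in none of them."""
    filtered = {}
    for gene, calls in pav.items():
        subset = {ind: calls[ind] for ind in individuals}
        if sum(subset.values()) > 0:
            filtered[gene] = subset
    return filtered


def classify(pav):
    """Label a gene 'core' if present in every individual, else 'variable'.

    This definition of 'core' is an assumption made for the sketch.
    """
    return {
        gene: "core" if all(calls.values()) else "variable"
        for gene, calls in pav.items()
    }


# Restrict the table to the (hypothetical) B. rapa individuals.
rapa_only = filter_pav(PAV, ["rapa_1", "rapa_2"])
labels = classify(rapa_only)
print(sorted(rapa_only))
print(labels)
```

In this toy table, geneC is only present in the oleracea individual, so it is dropped once the table is restricted to the two rapa individuals.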
These notebooks contain the XGBoost and Shapley value (SHAP) work used to model gene presence/absence variation.
You can launch them on binder:
Or navigate to the notebooks folder.