MaskGraphene: Advancing joint embedding, clustering, and batch correction for spatial transcriptomics using graph-based self-supervised learning
Implementation for the RECOMB-Seq 2024 paper: MaskGraphene.
Please refer to this page. We hope it helps you explore and investigate this tool.
- Python >= 3.9
- PyTorch==2.0.1
- anndata==0.9.2
- h5py==3.9.0
- hnswlib==0.7.0
- igraph==0.10.8
- matplotlib==3.6.3
- paste-bio==1.4.0
- POT==0.9.1
- rpy2==3.5.14
- scanpy==1.9.1
- umap-learn==0.5.4
- wandb
- pyyaml==5.4.1
conda create -n MaskGraphene python=3.9
conda activate MaskGraphene
git clone https://github.com/OliiverHu/MaskGraphene.git
cd MaskGraphene
pip install -r requirements.txt
For the DGL package, please refer to the DGL installation instructions. For example, for CUDA 11.7:
pip install dgl -f https://data.dgl.ai/wheels/cu117/repo.html
pip install dglgo -f https://data.dgl.ai/wheels-test/repo.html
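After installation, it can help to confirm that the environment matches the pinned versions listed above. The following is a minimal sketch, not part of the MaskGraphene repository: the helper name `check_pins` is hypothetical, and the distribution names (for example `torch` for PyTorch) are assumed to be the PyPI names of the listed packages.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the requirements list above.
# Keys are assumed PyPI distribution names ("torch" is PyTorch).
PINNED = {
    "torch": "2.0.1",
    "anndata": "0.9.2",
    "h5py": "3.9.0",
    "hnswlib": "0.7.0",
    "igraph": "0.10.8",
    "matplotlib": "3.6.3",
    "paste-bio": "1.4.0",
    "POT": "0.9.1",
    "rpy2": "3.5.14",
    "scanpy": "1.9.1",
    "umap-learn": "0.5.4",
    "pyyaml": "5.4.1",
}


def check_pins(pins):
    """Return {name: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # CUDA builds may report e.g. "2.0.1+cu117"; compare the part before "+".
        if installed is None or installed.split("+")[0] != expected:
            problems[name] = (expected, installed)
    return problems


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means every pinned package is installed at the expected version. wandb and DGL are not checked here because the list above does not pin them.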
For a quick start, you can run the following scripts:
mouse Hypothalamus -0.19/-0.24 generate hard-links
python ../localMG_main.py --max_epoch 3000 --max_epoch_triplet 1000 --logging False --section_ids " -0.19,-0.24" --num_class 8 --load_model False --num_hidden "512,32" \
  --exp_fig_dir "./" --h5ad_save_dir "./" --st_data_dir "./" --alpha_l 3 --lam 1 --loss_fn "sce" --mask_rate 0.50 --in_drop 0 --attn_drop 0 --remask_rate 0.50 \
  --seeds 2023 --num_remasking 1 --hvgs 0 --dataset mHypothalamus --consecutive_prior 1 --lr 0.001
mouse Hypothalamus -0.19/-0.24
python ../maskgraphene_main.py --max_epoch 3000 --max_epoch_triplet 1000 --logging False --section_ids " -0.19,-0.24" --num_class 8 --load_model False --num_hidden "512,32" \
  --exp_fig_dir "./" --h5ad_save_dir "./" --st_data_dir "./" --alpha_l 3 --lam 1 --loss_fn "sce" --mask_rate 0.50 --in_drop 0 --attn_drop 0 --remask_rate 0.50 \
  --seeds 2023 --num_remasking 1 --hvgs 0 --dataset mHypothalamus --consecutive_prior 1 --lr 0.001
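The mouse Hypothalamus commands pass `--section_ids " -0.19,-0.24"` with a leading space inside the quotes. A plausible reason, assuming the scripts read their flags with Python's `argparse` (not confirmed by this page), is that a value beginning with `-` that is not a plain negative number is otherwise taken for another option. The sketch below illustrates that behaviour with a hypothetical `parse_section_ids` helper; the comma split and whitespace stripping are also assumptions about how the IDs might be consumed.

```python
import argparse


def parse_section_ids(argv):
    """Parse --section_ids and return the individual IDs as stripped strings."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--section_ids", type=str, required=True)
    args = parser.parse_args(argv)
    return [s.strip() for s in args.section_ids.split(",")]


# With the leading space, argparse treats the value as an ordinary argument.
print(parse_section_ids(["--section_ids", " -0.19,-0.24"]))

# Without it, "-0.19,-0.24" starts with "-" and is not a plain negative number,
# so argparse takes it for another option and exits with a usage error.
try:
    parse_section_ids(["--section_ids", "-0.19,-0.24"])
except SystemExit:
    print("argparse rejected the value without the leading space")
```

Section IDs that do not start with `-`, such as the DLPFC IDs `151507,151508`, need no such workaround.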
DLPFC 151507/151508 generate hard-links
python ../localMG_main.py --max_epoch 2000 --max_epoch_triplet 500 --logging False --section_ids "151507,151508" --num_class 7 --load_model False --num_hidden "512,32" \
  --exp_fig_dir "./" --h5ad_save_dir "./" --st_data_dir "./" --alpha_l 1 --lam 1 --loss_fn "sce" --mask_rate 0.5 --in_drop 0 --attn_drop 0 --remask_rate 0.1 \
  --seeds 2023 --num_remasking 1 --hvgs 3000 --dataset DLPFC --consecutive_prior 1 --lr 0.001
DLPFC 151507/151508
python ../maskgraphene_main.py --max_epoch 2000 --max_epoch_triplet 500 --logging False --section_ids "151507,151508" --num_class 7 --load_model False --num_hidden "512,32" \
  --exp_fig_dir "./" --h5ad_save_dir "./" --st_data_dir "./" --alpha_l 1 --lam 1 --loss_fn "sce" --mask_rate 0.5 --in_drop 0 --attn_drop 0 --remask_rate 0.1 \
  --seeds 2023 --num_remasking 1 --hvgs 3000 --dataset DLPFC --consecutive_prior 1 --lr 0.001
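Both scripts in each pair take the same flags, so the two commands can be generated from a single parameter set to keep them in sync. The sketch below is illustrative only: `build_command` and `DLPFC_PARAMS` are hypothetical names, the flag values are copied from the DLPFC commands above, and the order (hard-link generation first, then the main run) is assumed from the order of the headings.

```python
import shlex
import sys

# Flags copied from the DLPFC 151507/151508 commands above.
DLPFC_PARAMS = {
    "max_epoch": 2000, "max_epoch_triplet": 500, "logging": False,
    "section_ids": "151507,151508", "num_class": 7, "load_model": False,
    "num_hidden": "512,32", "exp_fig_dir": "./", "h5ad_save_dir": "./",
    "st_data_dir": "./", "alpha_l": 1, "lam": 1, "loss_fn": "sce",
    "mask_rate": 0.5, "in_drop": 0, "attn_drop": 0, "remask_rate": 0.1,
    "seeds": 2023, "num_remasking": 1, "hvgs": 3000, "dataset": "DLPFC",
    "consecutive_prior": 1, "lr": 0.001,
}


def build_command(script, params):
    """Turn a parameter dict into an argv list: [python, script, --flag, value, ...]."""
    argv = [sys.executable, script]
    for flag, value in params.items():
        argv += [f"--{flag}", str(value)]
    return argv


# Print the two commands in the assumed order; nothing is executed here.
for script in ("../localMG_main.py", "../maskgraphene_main.py"):
    print(shlex.join(build_command(script, DLPFC_PARAMS)))
```

To actually launch a step, the argv list could be passed to `subprocess.run(cmd, check=True)` from the same working directory used for the commands above.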
Supported ST datasets:
- 10x Visium: DLPFC, Mouse Sagittal Brain
- Others: mouse Hypothalamus, Embryo
Tutorial 1: hard-links generation
Currently under review