Reduce memory load #157

fe4960 · 2023-05-10T22:12:07Z

Hello,

Thanks for developing this great software. I have a question about its memory requirements. I have unpaired snATAC-seq and snRNA-seq data, with the snATAC peaks called with ArchR. I successfully generated a cisTopic object and a SCENIC+ object with meta-cells from the two modalities. The resulting SCENIC+ object has 2000 cells × 35K genes and 2000 cells × 500K peaks, with a 32-topic model. Since Ray parallel computation does not seem to work well on our cluster server, I ran it with only 1 CPU. But I got stuck at the GSEA step:

"2023-05-08 12:19:37,017 GSEA INFO Thresholding region to gene relationships
2023-05-08 20:27:34,733 GSEA INFO Subsetting TF2G adjacencies for TF with motif.
2023-05-08 20:29:01,421 GSEA INFO Running GSEA..."

The job never finishes and seems to require hundreds of GB of memory, which brought the server node down.
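For a sense of scale, here is a rough back-of-the-envelope sketch of how large the two matrices would be if they were held as dense float64 arrays. That storage format is only an assumption for illustration (I do not know how SCENIC+ holds them internally), and `dense_gib` is a hypothetical helper, not part of any library.

```python
def dense_gib(n_rows: int, n_cols: int, bytes_per_value: int = 8) -> float:
    """Size in GiB of a dense n_rows x n_cols array (float64 by default)."""
    return n_rows * n_cols * bytes_per_value / 2**30


# Dimensions taken from the object described above.
n_cells = 2000
n_features = {"genes": 35_000, "peaks": 500_000}

for name, n in n_features.items():
    # Assumption: one dense float64 value per cell/feature pair.
    print(f"{name}: {dense_gib(n_cells, n):.2f} GiB")
```

Under that assumption the peak matrix alone would be several GiB, so the hundreds of GB observed would have to come from intermediate results or copies rather than from the input matrices themselves.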

I saw in a previous discussion that you suggest filtering genes to reduce their number. Should I also reduce the number of peaks? If so, what is the best practice for doing that?

Also, if I install the development branch of SCENIC+ with joblib and increase the number of CPUs, would it be possible to run a SCENIC+ object with 2000 cells × 25K genes and 2000 cells × 650K peaks? Would the memory requirement be spread across the CPUs, so that each CPU carries a relatively small memory load?

Thank you very much!
