I am trying to run mu.tl.mofa() as part of panpipes with a dataset of ~100k cells, and two modalities (RNA and ATAC). I am subsetting to HVFs so I have about 8k features in RNA and 25k in ATAC.
The model starts running, but before the first iteration completes
######################################
## Training the model with seed 1 ##
######################################
ELBO before training: -18558515111.51
CuPy fails with a memory allocation error:
Traceback (most recent call last):
  File "/batch_correct_mofa.py", line 172, in <module>
    mu.tl.mofa(tmp, **mofa_kwargs)
  File "[dir]/lib/python3.10/site-packages/muon/_core/tools.py", line 588, in mofa
    ent.run()
  File "[dir]/lib/python3.10/site-packages/mofapy2/run/entry_point.py", line 57, in saver
    func(self, *args, **kwargs)
  File "[dir]/lib/python3.10/site-packages/mofapy2/run/entry_point.py", line 1434, in run
    train_model(self.model)
  File "[dir]/lib/python3.10/site-packages/mofapy2/build_model/train_model.py", line 27, in train_model
    model.iterate()
  File "[dir]/lib/python3.10/site-packages/mofapy2/core/BayesNet.py", line 291, in iterate
    self.nodes[node].update()
  File "[dir]/lib/python3.10/site-packages/mofapy2/core/nodes/multiview_nodes.py", line 136, in update
    self.nodes[m].updateParameters(ix, ro)
  File "[dir]/lib/python3.10/site-packages/mofapy2/core/nodes/W_nodes.py", line 204, in updateParameters
    self._updateParameters(
  File "[dir]/lib/python3.10/site-packages/mofapy2/core/nodes/W_nodes.py", line 249, in _updateParameters
    tauY_gpu = (tau_gpu * gpu_utils.array(Y)).T
  File "cupy/_core/core.pyx", line 1285, in cupy._core.core._ndarray_base.__mul__
  File "cupy/_core/_kernel.pyx", line 1350, in cupy._core._kernel.ufunc.__call__
  File "cupy/_core/_kernel.pyx", line 645, in cupy._core._kernel._get_out_args_from_optionals
  File "cupy/_core/core.pyx", line 2811, in cupy._core.core._ndarray_init
  File "cupy/_core/core.pyx", line 241, in cupy._core.core._ndarray_base._init_fast
  File "cupy/cuda/memory.pyx", line 738, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1424, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1445, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1116, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1137, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 1382, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
  File "cupy/cuda/memory.pyx", line 1385, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 16,604,200,448 bytes (allocated so far: 33,221,685,248 bytes).
OS: HPC
Python version: 3.10
Versions of libraries involved: muon 0.1.5
Let's discuss it in bioFAM/mofapy2#32. In the meantime, if there is not enough GPU memory to fit the whole 100k x (8k + 25k) Y matrix, you should try Stochastic Variational Inference (svi_mode).
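As a rough sketch of what that could look like in the script from the traceback, the dictionary below extends the `mofa_kwargs` that get passed to `mu.tl.mofa()`. Only `svi_mode` is named above; `gpu_mode` and `svi_batch_size` are assumed here to be keyword arguments of `mu.tl.mofa()` and should be checked against the signature of the installed muon version. The batch fraction is an illustrative value, not a recommendation, and `run_mofa` is a hypothetical helper.

```python
# Assumed keyword arguments for mu.tl.mofa(); verify the names against
# the signature of your installed muon version before relying on them.
mofa_kwargs = {
    "gpu_mode": True,        # keep training on the GPU
    "svi_mode": True,        # stochastic variational inference
    "svi_batch_size": 0.25,  # illustrative: fraction of cells per batch
}


def run_mofa(mdata, kwargs):
    """Hypothetical wrapper mirroring the call shown in the traceback."""
    import muon as mu  # imported lazily so the settings can be inspected without muon

    mu.tl.mofa(mdata, **kwargs)


if __name__ == "__main__":
    # Print the settings that would be forwarded to mu.tl.mofa().
    for key, value in mofa_kwargs.items():
        print(f"{key} = {value!r}")
```

With SVI each update only touches a batch of cells, so the arrays moved to the GPU should be correspondingly smaller than the full matrix.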
That being said, there are definitely places in the training code where GPU memory could be allocated with more care.
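For a sense of scale, here is a back-of-the-envelope estimate of the dense footprint per view. It assumes the approximate sizes from the post (~100k cells, ~8k RNA and ~25k ATAC features) and dense storage, so the totals will not match the traceback exactly. The helper name `dense_bytes` is made up for illustration. The line that fails, `tauY_gpu = (tau_gpu * gpu_utils.array(Y)).T`, appears to keep three arrays of the same shape alive at once (`tau_gpu`, the GPU copy of `Y`, and their product), which would be consistent with the error reporting roughly two 16.6 GB blocks already allocated when a third is requested.

```python
import numpy as np


def dense_bytes(n_rows, n_cols, dtype=np.float64):
    """Bytes needed for one dense n_rows x n_cols array of the given dtype."""
    return n_rows * n_cols * np.dtype(dtype).itemsize


# Approximate sizes taken from the post above.
n_cells = 100_000
n_features = {"rna": 8_000, "atac": 25_000}

for view, d in n_features.items():
    for dtype in (np.float64, np.float32):
        gb = dense_bytes(n_cells, d, dtype) / 1e9
        # Three same-shaped arrays: tau, Y and their elementwise product.
        print(f"{view:>4} {np.dtype(dtype).name}: "
              f"{gb:.1f} GB per array, {3 * gb:.1f} GB for three")
```

With these inputs a single float64 array of the ATAC view is about 20 GB, and three of them about 60 GB. Processing a quarter of the cells at a time would bring each array down to about 5 GB.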