
Tensor size issue #1

Open
arunraja-hub opened this issue Nov 28, 2024 · 1 comment

@arunraja-hub

When I tried to run training with python train.py params_x1x3x4_diffusion_mosesaq_20240824 0, as suggested in the README, I got the following error:

RuntimeError: Trying to resize storage that is not resizable

According to lucidrains/denoising-diffusion-pytorch#248, the suggested fix is to change num_workers in the dataloader to 0, but that resulted in the following error:

RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 1176 but got size 595 for tensor number 1 in the list.
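For reference, the change I attempted looks roughly like this (a sketch only; the exact DataLoader construction and arguments in train.py differ, and the only thing I changed was num_workers):

```python
# Sketch of the attempted workaround -- the actual DataLoader call in train.py
# may use different arguments; only num_workers was changed, to 0.
train_loader = torch_geometric.loader.DataLoader(
    dataset,
    batch_size=params['batch_size'],  # whatever batch size train.py already uses
    shuffle=True,
    num_workers=0,  # previously a positive value
)
```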

Could you please provide some guidance on this?

@keiradams
Collaborator

Hi! I have not experienced this error, so I suspect it has something to do with our different training setups or package versions.

To help debug, can you try the following:

  • Make sure you can successfully run the inference code provided in the RUNME_{}.ipynb notebooks.
  • In train.py, make sure you can call dataset[0] after initializing dataset = HeteroDataset(...).
  • In train.py, make sure you can call next(iter(train_loader)) after initializing train_loader = torch_geometric.loader.DataLoader(...), with both batch_size = 1 and batch_size > 1 (see the sketch after this list).
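A minimal sketch of the second and third checks (this is only an illustration, meant to be pasted into train.py after the dataset is constructed there, not run standalone):

```python
# Sketch only: add these lines to train.py right after `dataset` is created
# there, rather than running this snippet on its own.

# Check 2: indexing a single example should work.
print(dataset[0])

# Check 3: building a loader and collating a batch should work,
# both for batch_size = 1 and for batch_size > 1.
import torch_geometric
for bs in (1, 4):
    loader = torch_geometric.loader.DataLoader(dataset, batch_size=bs)
    print(next(iter(loader)))
```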

If all of that works, then I would guess it is related to an issue with DDP (distributed training) in Pytorch-Lightning on your particular system setup. Are you trying to train with 1 GPU? On a CPU? On multiple GPUs? The parameters in parameters/params_x1x3x4_diffusion_mosesaq_20240824.py specify 'num_gpus': 2 and 'multiprocessing_spawn': True. Either of those could be causing issues with your specific setup.
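For example, a hypothetical edit to that parameters file to rule out multi-GPU / spawn-related problems (I'm assuming the file defines a params dictionary; leave every other entry unchanged):

```python
# In parameters/params_x1x3x4_diffusion_mosesaq_20240824.py (hypothetical edit;
# keep all other entries exactly as they are).
params = {
    # ... all other parameters unchanged ...
    'num_gpus': 1,                   # was 2
    'multiprocessing_spawn': False,  # was True
}
```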

Also, does this error occur at the start of the training epochs? Or mid-way through training?

Additionally, make sure that the versions of your packages match those listed in the README, particularly your PyTorch Lightning, PyTorch, and PyG versions.
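For example, you can print the installed versions and compare them against the README:

```python
# Print installed versions to compare against those listed in the README.
import torch
import torch_geometric
import pytorch_lightning

print('torch:', torch.__version__)
print('torch_geometric:', torch_geometric.__version__)
print('pytorch_lightning:', pytorch_lightning.__version__)
```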

It would also help if you could provide the complete error traceback.
