RuntimeError: CUDA out of memory. #22

onurbarut · 2023-06-09T17:09:05Z

Hi,

I've run the single-GPU test on two different machines, one with an 8 GB GPU and the other with an A10 (24 GB). Both gave me an OOM error even with num_input_frames = 2. What am I missing?

The first machine, with the RTX card, reported:

Tried to allocate 620.00 MiB (GPU 0; 7.80 GiB total capacity; 4.31 GiB already allocated; 412.31 MiB free; 6.33 GiB reserved in total by PyTorch)

while the machine with the A10 reported:

Tried to allocate 5.50 GiB (GPU 0; 22.02 GiB total capacity; 10.77 GiB already allocated; 4.78 GiB free; 15.63 GiB reserved in total by PyTorch)

I changed the REDS dataset to SRFolderMultipleGTDataset, and I have a 2-video subset of REDS4_val that looks like this:

-REDS4_short/
|-- val_sharp/
|--|-- 000/
|--|--|-- %08d.png
|--|-- 001/
|--|--|-- %08d.png
|-- val_sharp_bicubic/
|--|-- X4/
|--|--|-- 000/
|--|--|--|-- %08d.png
|--|--|-- 001/
|--|--|--|-- %08d.png

And I used the following command: tools/test.py <path/to/config> <path/to/redsModel> --crf 25 --startIdx 0 --test_frames 50

Let me know if you need any further info to help. Thanks!

onurbarut · 2023-06-14T13:29:57Z

After debugging for a while, I've noticed that the SPyNet compute_flow() call for the HR frames fails due to memory. My bet is that on the RTX 4000 it fails earlier, at the LR frames rather than the HR frames, which is why it reports a different allocation size.

It looks like SPyNet computes the flow over all the frames set by the --test_frames parameter at once, and that's where the OOM occurs. With --test_frames 10 it worked well on the A10 (24 GB). My question is: is this behaviour of SPyNet normal? How can I run a video with 1000+ frames? I believe some batching approach is needed, because I'm interested in all frames of the video, not just the first 10. Thanks
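To make the batching idea concrete, here is a rough sketch of what I have in mind: split the frame sequence into overlapping windows, run each window separately, and stitch the outputs back together. The names `chunk_windows`, `run_in_chunks` and `process_chunk` are hypothetical and not part of this repo or of tools/test.py; `process_chunk` stands in for whatever runs the model on a short list of frames and returns one output per input frame.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def chunk_windows(num_frames: int, chunk_size: int,
                  overlap: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) windows that together cover [0, num_frames)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    start = 0
    while start < num_frames:
        end = min(start + chunk_size, num_frames)
        yield start, end
        if end == num_frames:
            break
        start += step


def run_in_chunks(frames: Sequence, process_chunk: Callable[[Sequence], List],
                  chunk_size: int, overlap: int = 0) -> List:
    """Run process_chunk window by window and stitch one output per frame.

    process_chunk is a hypothetical stand-in for the model call; it is
    assumed to return exactly one output per input frame.
    """
    outputs: List = []
    for start, end in chunk_windows(len(frames), chunk_size, overlap):
        result = process_chunk(frames[start:end])
        # Frames before len(outputs) were already produced by the previous
        # window, so only the new tail of this window is appended.
        skip = len(outputs) - start
        outputs.extend(result[skip:])
    return outputs


if __name__ == "__main__":
    # Dummy "model" that doubles each frame index, just to show the stitching.
    frames = list(range(25))
    out = run_in_chunks(frames, lambda chunk: [f * 2 for f in chunk],
                        chunk_size=10, overlap=2)
    print(len(out), list(chunk_windows(25, 10, 2)))
```

With this approach peak memory should depend on chunk_size rather than on the full video length. The caveat is that each window is processed without temporal context from outside it, so results near window boundaries may differ from a full-sequence run; the overlap is only meant to soften that, and I haven't verified how much it helps with this model.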
