RuntimeError: CUDA out of memory. #22

Open
onurbarut opened this issue Jun 9, 2023 · 1 comment

Comments

@onurbarut

Hi,

I ran the single-GPU test on two different machines, one with an 8 GB RTX GPU and the other with a 24 GB A10. Both gave me an OOM error, even with num_input_frames = 2. What am I missing?

The first machine, with the RTX, reported:

Tried to allocate 620.00 MiB (GPU 0; 7.80 GiB total capacity; 4.31 GiB already allocated; 412.31 MiB free; 6.33 GiB reserved in total by PyTorch)

while the other machine, with the A10, reported:

Tried to allocate 5.50 GiB (GPU 0; 22.02 GiB total capacity; 10.77 GiB already allocated; 4.78 GiB free; 15.63 GiB reserved in total by PyTorch)
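To compare the two messages it can help to pull the numbers out and look at how much memory PyTorch has reserved but is not currently using. The snippet below is only a rough sketch: `parse_oom` is a hypothetical helper written for this message format (it is not part of PyTorch), and the message wording may differ between PyTorch versions.

```python
import re

# Conversion factors to MiB for the units that appear in the messages.
MIB_PER_UNIT = {"MiB": 1.0, "GiB": 1024.0}

A10_MESSAGE = (
    "Tried to allocate 5.50 GiB (GPU 0; 22.02 GiB total capacity; "
    "10.77 GiB already allocated; 4.78 GiB free; "
    "15.63 GiB reserved in total by PyTorch)"
)
RTX_MESSAGE = (
    "Tried to allocate 620.00 MiB (GPU 0; 7.80 GiB total capacity; "
    "4.31 GiB already allocated; 412.31 MiB free; "
    "6.33 GiB reserved in total by PyTorch)"
)


def parse_oom(message: str) -> dict:
    """Hypothetical helper: extract the sizes from an OOM message, in MiB."""
    match = re.search(r"Tried to allocate ([\d.]+) (MiB|GiB)", message)
    if match is None:
        raise ValueError("not a recognised OOM message")
    sizes = {"requested": float(match.group(1)) * MIB_PER_UNIT[match.group(2)]}
    pattern = r"([\d.]+) (MiB|GiB) (total capacity|already allocated|free|reserved)"
    for value, unit, label in re.findall(pattern, message):
        # Keys become "capacity", "allocated", "free" and "reserved".
        sizes[label.split()[-1]] = float(value) * MIB_PER_UNIT[unit]
    return sizes


def cached_unused(sizes: dict) -> float:
    """Memory reserved by PyTorch's allocator but not held by live tensors."""
    return sizes["reserved"] - sizes["allocated"]


for name, message in [("A10", A10_MESSAGE), ("RTX", RTX_MESSAGE)]:
    sizes = parse_oom(message)
    print(name, "requested:", sizes["requested"], "MiB,",
          "reserved but unused:", round(cached_unused(sizes), 2), "MiB")
```

On the A10 the single request is larger than both the free memory and the reserved-but-unused memory, so the allocation itself seems too big for the card. On the RTX the request is smaller than the reserved-but-unused amount, which may point to fragmentation in the caching allocator rather than a single oversized tensor.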

I changed the REDS dataset to SRFolderMultipleGTDataset, and I have a 2-video subset of REDS4_val which looks like this:

-REDS4_short/
|-- val_sharp/
|--|-- 000/
|--|--|-- %08d.png
|--|-- 001/
|--|--|-- %08d.png
|-- val_sharp_bicubic/
|--|-- X4/
|--|--|-- 000/
|--|--|--|-- %08d.png
|--|--|-- 001/
|--|--|--|-- %08d.png

I used the following command: tools/test.py <path/to/config> <path/to/redsModel> --crf 25 --startIdx 0 --test_frames 50

Let me know if you need any further info to help. Thanks!

@onurbarut
Author

After debugging for a while, I noticed that the SPyNet compute_flow() method fails on the HR frames due to memory. My bet is that on the RTX 4000 it fails earlier, at the LR frames rather than the HR frames, which is why it reports a different allocation size.

It looks like SPyNet computes the flow over all the frames set by the --test_frames parameter at once, and that is where the OOM occurs. I set --test_frames 10 and it worked well on the A10 (24 GB). My question is: is this behaviour of SPyNet normal? How can I run a video with 1000+ frames? I believe some batching approach is needed, because I'm interested in all frames of the video, not just the first 10. Thanks
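The kind of batching I have in mind could look roughly like the sketch below: split the sequence into overlapping windows, run the model on each window, and stitch the outputs back together. This is not code from this repository; `chunk_ranges`, `process_in_chunks` and the `run_model` callable are hypothetical names, and `run_model` is assumed to return one output per input frame.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def chunk_ranges(num_frames: int, chunk_size: int,
                 overlap: int = 0) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) windows that cover range(num_frames).

    Consecutive windows share `overlap` frames so that each chunk
    sees some context from the previous one.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    start = 0
    while start < num_frames:
        end = min(start + chunk_size, num_frames)
        yield start, end
        if end == num_frames:
            break
        start += step


def process_in_chunks(frames: Sequence, run_model: Callable[[Sequence], List],
                      chunk_size: int, overlap: int = 0) -> List:
    """Run `run_model` window by window and stitch the results.

    `run_model` is a placeholder for whatever wraps the real forward pass.
    """
    outputs: List = []
    for start, end in chunk_ranges(len(frames), chunk_size, overlap):
        chunk_out = run_model(frames[start:end])
        # Frames in the overlap were already produced by the previous chunk.
        already_done = len(outputs) - start
        outputs.extend(chunk_out[already_done:])
    return outputs


if __name__ == "__main__":
    # Stand-in "model" that doubles each frame index.
    frames = list(range(25))
    result = process_in_chunks(frames, lambda c: [2 * f for f in c],
                               chunk_size=10, overlap=2)
    print(len(result))
```

With a real model, `run_model` would presumably wrap the forward pass in `torch.no_grad()` and move each chunk's output to the CPU before the next chunk runs, so only one window is on the GPU at a time. If the model propagates information across frames, results near chunk boundaries may differ from a full-sequence run; the overlap is meant to reduce that effect, not remove it.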
