RuntimeError: CUDA out of memory #17

Open
nanpuhaha opened this issue Jan 19, 2023 · 0 comments

Comments

@nanpuhaha
Contributor

01/19 22:48:28 - mmengine - INFO - Checkpoints will be saved to /opt/ml/final-project-level3-cv-17/mmyolo/work_dirs/yolov7_l_syncbn_fast_8x16b-300e_coco.
Traceback (most recent call last):
  File "tools/train.py", line 116, in <module>
    main()
  File "tools/train.py", line 112, in main
    runner.train()
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1678, in train
    model = self.train_loop.run()  # type: ignore
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmengine/runner/loops.py", line 90, in run
    self.run_epoch()
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmengine/runner/loops.py", line 106, in run_epoch
    self.run_iter(idx, data_batch)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmengine/runner/loops.py", line 122, in run_iter
    outputs = self.runner.model.train_step(
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 114, in train_step
    losses = self._run_forward(data, mode='loss')  # type: ignore
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmengine/model/base_model/base_model.py", line 314, in _run_forward
    results = self(**data, mode=mode)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 92, in forward
    return self.loss(inputs, data_samples)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmdet/models/detectors/single_stage.py", line 77, in loss
    x = self.extract_feat(batch_inputs)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmdet/models/detectors/single_stage.py", line 143, in extract_feat
    x = self.backbone(batch_inputs)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/ml/final-project-level3-cv-17/mmyolo/mmyolo/models/backbones/base_backbone.py", line 221, in forward
    x = layer(x)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/ml/final-project-level3-cv-17/mmyolo/mmyolo/models/layers/yolo_bricks.py", line 714, in forward
    x_short = self.short_conv(x)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/mmcv/cnn/bricks/conv_module.py", line 207, in forward
    x = self.conv(x)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 446, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/opt/conda/envs/final2/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 442, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: CUDA out of memory. Tried to allocate 508.00 MiB (GPU 0; 31.75 GiB total capacity; 27.35 GiB already allocated; 50.50 MiB free; 27.74 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
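
The numbers in the error message can be checked against its own hint about fragmentation. The sketch below is only an illustration: it parses the message text with regular expressions and compares the requested allocation with the memory PyTorch has reserved but not allocated. The helper names `parse_oom_message` and `cached_but_unused_mib` are hypothetical and not part of PyTorch or MMEngine, and the regular expressions assume the message wording shown in this traceback.

```python
import re

# Units that appear in the OOM message, expressed in MiB.
_UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}

# One pattern per figure quoted in the message (wording assumed from the log above).
_PATTERNS = {
    "requested": r"Tried to allocate ([\d.]+) (MiB|GiB)",
    "total": r"([\d.]+) (MiB|GiB) total capacity",
    "allocated": r"([\d.]+) (MiB|GiB) already allocated",
    "free": r"([\d.]+) (MiB|GiB) free",
    "reserved": r"([\d.]+) (MiB|GiB) reserved",
}


def parse_oom_message(message):
    """Return the memory figures found in the message, all converted to MiB."""
    stats = {}
    for name, pattern in _PATTERNS.items():
        match = re.search(pattern, message)
        if match:
            stats[name] = float(match.group(1)) * _UNIT_TO_MIB[match.group(2)]
    return stats


def cached_but_unused_mib(stats):
    """Memory reserved by the caching allocator but not held by live tensors."""
    return stats["reserved"] - stats["allocated"]


message = (
    "CUDA out of memory. Tried to allocate 508.00 MiB (GPU 0; 31.75 GiB total "
    "capacity; 27.35 GiB already allocated; 50.50 MiB free; 27.74 GiB reserved "
    "in total by PyTorch)"
)

stats = parse_oom_message(message)
slack = cached_but_unused_mib(stats)

print(f"requested:          {stats['requested']:.2f} MiB")
print(f"reserved-allocated: {slack:.2f} MiB")
print(f"free on device:     {stats['free']:.2f} MiB")
print("fits in slack + free:", slack + stats["free"] >= stats["requested"])
```

Reserved memory exceeds allocated memory by only about 0.39 GiB here, and that slack plus the 50.50 MiB still free is below the 508.00 MiB request. So the `max_split_size_mb` hint in the message probably does not apply, and the training step likely just needs more memory than the GPU has. Reducing the per-GPU batch size is the usual first thing to try. The `8x16b` part of the config name suggests 16 images per GPU, though that is an inference from the name, not something the log states.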