creating data loader...
creating model and diffusion...
training...
Traceback (most recent call last):
  File "scripts/segmentation_train.py", line 118, in <module>
    main()
  File "scripts/segmentation_train.py", line 70, in main
    TrainLoop(
  File "D:\MedSegDiff-master.\guided_diffusion\train_util.py", line 83, in __init__
    self._load_and_sync_parameters()
  File "D:\MedSegDiff-master.\guided_diffusion\train_util.py", line 139, in _load_and_sync_parameters
    dist_util.sync_params(self.model.parameters())
  File "D:\MedSegDiff-master.\guided_diffusion\dist_util.py", line 111, in sync_params
    dist.broadcast(p, 0)
  File "D:\Anaconda\envs\sg\lib\site-packages\torch\distributed\distributed_c10d.py", line 1195, in broadcast
    work.wait()
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.
Please tell me where the problem lies and why. I hope someone can help me. Thanks!
dist.broadcast copies the result into p in place, and p is a leaf parameter with requires_grad=True, which this PyTorch build rejects. Try adding p = p + 0 in the sync_params function within dist_util.py as follows:
import torch as th
import torch.distributed as dist


def sync_params(params):
    """
    Synchronize a sequence of tensors across ranks from rank 0.
    """
    for p in params:
        with th.no_grad():
            # Rebind p to a fresh non-leaf copy so the broadcast's
            # in-place write no longer touches a leaf tensor that requires grad.
            p = p + 0
            dist.broadcast(p, 0)
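Note that p = p + 0 rebinds p to a temporary copy, so the broadcast writes into that copy rather than into the parameter itself; it silences the error but does not actually propagate rank 0's weights (which should be harmless in a single-process run, where there is nothing to sync). A minimal alternative sketch, untested here, is to broadcast the parameter's .data tensor, which shares storage with p but has requires_grad=False, so the in-place write is allowed and the parameter is still updated:

import torch as th
import torch.distributed as dist


def sync_params(params):
    """
    Synchronize a sequence of tensors across ranks from rank 0.
    """
    for p in params:
        with th.no_grad():
            # p.data shares storage with p but has requires_grad=False,
            # so the in-place broadcast is permitted and the parameter
            # still receives rank 0's values.
            dist.broadcast(p.data, 0)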