robogast changed the title from "Figure out where the NaNs in training are coming from:" to "Figure out where the NaNs in autoencoder training are coming from:" on Apr 5, 2022
TL;DR: the instability probably originates in the VQ layer, but it is still unclear what exactly goes wrong.
Increasing the commitment loss and making sure the cdist compute_mode is a non-mm one (no matrix-multiplication shortcut for the Euclidean distance) seem to at least mitigate the issue.
Forcing 32-bit with torch.cuda.amp.autocast(enabled=False) doesn't solve the issue.
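For reference, a minimal sketch of the two mitigations mentioned above, not the actual layer from this repo. The function name `vector_quantize`, the tensor shapes, and the `commitment_weight` parameter are hypothetical, and "increasing commitment loss" is read here as raising the weight on the commitment term. The `compute_mode` string is one of the values `torch.cdist` accepts; it skips the matrix-multiplication expansion of the squared distance, which can lose precision when vectors are large or nearly identical.

```python
import torch
import torch.nn.functional as F


def vector_quantize(z, codebook, commitment_weight=0.25):
    """Hypothetical VQ step: nearest-codebook lookup plus VQ-VAE style losses.

    z:        (N, D) encoder outputs
    codebook: (K, D) embedding vectors
    commitment_weight: illustrative default, raise it to test the mitigation
    """
    # Do the distance computation in float32 regardless of the input dtype.
    z32 = z.float()
    cb32 = codebook.float()

    # Non-mm mode: distances are computed directly rather than through the
    # ||a||^2 - 2ab + ||b||^2 matrix-multiplication expansion.
    dist = torch.cdist(
        z32, cb32, p=2.0, compute_mode="donot_use_mm_for_euclid_dist"
    )
    indices = dist.argmin(dim=1)   # (N,) index of the nearest code
    quantized = cb32[indices]      # (N, D) selected codebook vectors

    # Codebook term pulls codes toward encoder outputs; the commitment term
    # pulls encoder outputs toward their assigned codes.
    codebook_loss = F.mse_loss(quantized, z32.detach())
    commitment_loss = F.mse_loss(z32, quantized.detach())
    loss = codebook_loss + commitment_weight * commitment_loss

    # Straight-through estimator: forward pass uses the quantized values,
    # backward pass copies gradients to the encoder output.
    quantized_st = z32 + (quantized - z32).detach()
    return quantized_st, indices, loss


if __name__ == "__main__":
    torch.manual_seed(0)
    z = torch.randn(8, 4, requires_grad=True)
    codebook = torch.randn(16, 4, requires_grad=True)
    q, idx, loss = vector_quantize(z, codebook, commitment_weight=1.0)
    loss.backward()
    # Quick NaN/inf check on the loss and the encoder gradients.
    print(torch.isfinite(loss).item(), torch.isfinite(z.grad).all().item())
```

Checking `torch.isfinite` on the loss and on the gradients right after the VQ step, as in the usage snippet, is one cheap way to narrow down where the first NaN appears.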
Source:
2021-11-01/12-14-29/0/lightning_logs/version_8317429