Commit

best ckpt fix (NVIDIA#7564) (NVIDIA#7588)
Signed-off-by: dimapihtar <[email protected]>
Co-authored-by: Dmytro Pykhtar <[email protected]>
Signed-off-by: Sasha Meister <[email protected]>
2 people authored and ssh-meister committed Oct 10, 2023
1 parent 2f6fa29 commit 71f327f
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions nemo/utils/callbacks/nemo_model_checkpoint.py
@@ -209,6 +209,8 @@ def on_train_end(self, trainer, pl_module):
                     "were found. Saving latest model instead."
                 )
             else:
+                if os.path.isdir(self.best_model_path.split('.ckpt')[0]):
+                    self.best_model_path = self.best_model_path.split('.ckpt')[0]
                 self.best_model_path = trainer.strategy.broadcast(self.best_model_path)
                 trainer._checkpoint_connector.restore(self.best_model_path)
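
The two added lines can be read in isolation as a small path-resolution step. The sketch below restates that step as a standalone function so its behavior is easier to check. The function name `resolve_best_model_path` is hypothetical and is not part of NeMo. The motivation is also an assumption: the check presumably covers checkpoints that are stored on disk as a directory named without the `.ckpt` suffix, which the commit itself does not state.

```python
import os


def resolve_best_model_path(best_model_path: str) -> str:
    """Hypothetical helper mirroring the two lines added in this commit.

    If a directory exists at the path obtained by cutting the string at the
    first occurrence of '.ckpt', return that directory path. Otherwise return
    the original path unchanged.
    """
    # Same expression as in the diff: keep everything before the first '.ckpt'.
    candidate = best_model_path.split('.ckpt')[0]
    if os.path.isdir(candidate):
        return candidate
    return best_model_path
```

Note that `split('.ckpt')[0]` cuts at the first occurrence of `.ckpt` anywhere in the string, not only at the end, so a parent directory whose name contains `.ckpt` would also trigger the truncation.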
