forked from NVIDIA/NeMo
-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[NeMo-UX] checkpointing improvements (NVIDIA#10241)
* save model weights and artifacts to separate directories Signed-off-by: ashors1 <[email protected]> * add save_artifacts_on_train_end Signed-off-by: ashors1 <[email protected]> * Apply isort and black reformatting Signed-off-by: ashors1 <[email protected]> * do not save optimizer states in final checkpoint Signed-off-by: ashors1 <[email protected]> * WIP support for saving only last k optimizer states Signed-off-by: ashors1 <[email protected]> * Apply isort and black reformatting Signed-off-by: ashors1 <[email protected]> * minor cleanup Signed-off-by: ashors1 <[email protected]> * Revert support for saving last k optimizer states. This will be addressed in a subsequent PR. * use storage_options to determine when to skip saving optimizer states Signed-off-by: ashors1 <[email protected]> * Apply isort and black reformatting Signed-off-by: ashors1 <[email protected]> * fix variable names, make checkpoint load work when optimizer states don't exist in the checkpoint Signed-off-by: ashors1 <[email protected]> * Apply isort and black reformatting Signed-off-by: ashors1 <[email protected]> * FSDP updates, provide option to save optimizer states on_train_end Signed-off-by: ashors1 <[email protected]> * Apply isort and black reformatting Signed-off-by: ashors1 <[email protected]> * simplify implementation, remove save_best_model option Signed-off-by: ashors1 <[email protected]> * update default value of ckpt_include_optimizer for fsdp Signed-off-by: ashors1 <[email protected]> * remove unused imports Signed-off-by: ashors1 <[email protected]> * remove unused import Signed-off-by: ashors1 <[email protected]> * cleanup Signed-off-by: ashors1 <[email protected]> * make storage_options optional again Signed-off-by: ashors1 <[email protected]> * fix failing tests Signed-off-by: ashors1 <[email protected]> * address some comments Signed-off-by: ashors1 <[email protected]> * use save_weights_only to determine whether to save optimizer states Signed-off-by: ashors1 <[email protected]> * Apply isort and black reformatting Signed-off-by: ashors1 <[email protected]> * add some comments Signed-off-by: ashors1 <[email protected]> * fix tests Signed-off-by: ashors1 <[email protected]> * Apply isort and black reformatting Signed-off-by: ashors1 <[email protected]> * Apply isort and black reformatting Signed-off-by: ashors1 <[email protected]> * fixes Signed-off-by: ashors1 <[email protected]> * Apply isort and black reformatting Signed-off-by: ashors1 <[email protected]> * remove unnecessary line Signed-off-by: ashors1 <[email protected]> --------- Signed-off-by: ashors1 <[email protected]> Signed-off-by: ashors1 <[email protected]> Co-authored-by: ashors1 <[email protected]>
- Loading branch information
Showing
9 changed files
with
93 additions
and
60 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters