-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Load config from checkpoint_dir #755
Load config from checkpoint_dir #755
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I don't think any tutorials need an update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, why would any script other than prepare_redpajama
use this? In all the other scripts that were changed, we want to load a directory containing existing weights, so the new function is unnecessary and the scripts will still fail later if the weights do not exit. Shouldn't this only modify prepare_redpajama.py
?
My intent was to have a uniformity across the whole repo when it comes to loading P.S. Is there a chance that weights exist, but not |
No unless the user made changes or moved files. For the scripts that are meant to load weights, I prefer to have this fail with a missing file error instead of loading a config that could not match the weights file |
Hi there 👋
As was discussed in #721, this PR adds functionality to first try to load
lit_config.json
file fromcheckpoint_dir
and, if it doesn't exist, try to load a config fromlit_gpt/config.py
by matching the name of thecheckpoint_dir
.The only pivot from the discussion is that instead of adding such functionality into
utils.py
file, this PR extends Config class, so we don't have a problem with different versions of the class (e.g. LoRA version of Config).