Minor usability improvements to tinyllama pretraining script #749

Merged · 8 commits merged into main on Nov 21, 2023

Conversation

awaelchli (Contributor) commented on Nov 17, 2023

A few improvements:

  • Set max tokens instead of max steps / max iters as a more meaningful and interpretable target (see the first sketch below).
  • Break the training loop at the right place.
  • Update the weight initialization to follow recent changes in the upstream TinyLlama repo.
  • Experiment with hybrid sharding for now (see the second sketch below).
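To make the first two points concrete, here is a rough sketch of deriving the iteration budget from a token target and of where the loop check belongs; the hyperparameter values are illustrative, not the PR's actual settings:

```python
import itertools

# Illustrative hyperparameters, not the values used in the PR.
max_tokens = int(3e12)          # the interpretable target: total tokens seen
micro_batch_size = 8
block_size = 2048               # tokens per sequence
devices = 8

# Tokens consumed per iteration across all devices, then the derived budget.
tokens_per_iter = micro_batch_size * block_size * devices
max_iters = max_tokens // tokens_per_iter

for iter_num in itertools.count():
    # Check the budget *before* doing work on the next batch, so training
    # stops exactly at max_iters rather than one iteration late.
    if iter_num >= max_iters:
        break
    # fetch batch, forward, backward, optimizer step (omitted in this sketch)
```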

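For the sharding point, a minimal sketch of selecting hybrid sharding via Lightning Fabric's FSDPStrategy; the PR's full FSDP configuration (wrapping policy, checkpointing, etc.) may differ:

```python
from lightning.fabric import Fabric
from lightning.fabric.strategies import FSDPStrategy

# HYBRID_SHARD shards parameters within each node (as FULL_SHARD does) but
# only replicates across nodes, reducing inter-node communication at the
# cost of more memory per node.
strategy = FSDPStrategy(sharding_strategy="HYBRID_SHARD")
fabric = Fabric(devices=8, strategy=strategy)
```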
@awaelchli marked this pull request as ready for review on November 21, 2023, 10:54
awaelchli and others added 2 commits on November 21, 2023, 11:07
Co-authored-by: Carlos Mocholí <[email protected]>
@carmocca merged commit 21c1c59 into main on Nov 21, 2023
9 checks passed
@carmocca deleted the tinyllama-tokens branch on November 21, 2023, 23:50