Minor usability improvements to tinyllama pretraining script #749

Merged · 8 commits merged into main on Nov 21, 2023

Conversation

awaelchli (Contributor) commented on Nov 17, 2023

A few improvements:

  • Set max tokens instead of max steps / max iters as a more meaningful and interpretable target (see the first sketch below).
  • Break the training loop at the right place.
  • Update the weight initialization to follow recent changes in the upstream TinyLlama repo.
  • Experiment with hybrid sharding for now (see the second sketch below).
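To make the first two points concrete, here is a rough sketch of deriving the iteration budget from a token target and of where the loop check belongs; the hyperparameter values are illustrative, not the PR's actual settings:

```python
import itertools

# Illustrative hyperparameters, not the values used in the PR.
max_tokens = int(3e12)          # the interpretable target: total tokens seen
micro_batch_size = 8
block_size = 2048               # tokens per sequence
devices = 8

# Tokens consumed per iteration across all devices, then the derived budget.
tokens_per_iter = micro_batch_size * block_size * devices
max_iters = max_tokens // tokens_per_iter

for iter_num in itertools.count():
    # Check the budget *before* doing work on the next batch, so training
    # stops exactly at max_iters rather than one iteration late.
    if iter_num >= max_iters:
        break
    # fetch batch, forward, backward, optimizer step (omitted in this sketch)
```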

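For the sharding point, a minimal sketch of selecting hybrid sharding via Lightning Fabric's FSDPStrategy; the PR's full FSDP configuration (wrapping policy, checkpointing, etc.) may differ:

```python
from lightning.fabric import Fabric
from lightning.fabric.strategies import FSDPStrategy

# HYBRID_SHARD shards parameters within each node (as FULL_SHARD does) but
# only replicates across nodes, reducing inter-node communication at the
# cost of more memory per node.
strategy = FSDPStrategy(sharding_strategy="HYBRID_SHARD")
fabric = Fabric(devices=8, strategy=strategy)
```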
@awaelchli marked this pull request as ready for review on November 21, 2023, 10:54
awaelchli and others added 2 commits on November 21, 2023, 11:07
Co-authored-by: Carlos Mocholí <[email protected]>
@carmocca merged commit 21c1c59 into main on Nov 21, 2023
9 checks passed
@carmocca deleted the tinyllama-tokens branch on November 21, 2023, 23:50