Mistral tokenizer to avoid the HF token
carmocca committed Apr 13, 2024
1 parent f5c5c34 commit b985c4a
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion extensions/thunder/README.md
@@ -567,13 +567,15 @@ Config:
```yaml
out_dir: out/pretrain-thunder
data: TinyStories
-tokenizer_dir: checkpoints/meta-llama/Llama-2-7b-hf
+tokenizer_dir: checkpoints/mistralai/Mistral-7B-v0.1
logger_name: csv
```
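The `tokenizer_dir` in the config has to exist on disk before pretraining starts. As a minimal sketch, a pre-flight check could look like the following. The helper name `has_tokenizer` and the file names `tokenizer.model` and `tokenizer.json` are assumptions for illustration, not something this commit defines; adjust them to whatever the tokenizer download actually writes.

```python
from pathlib import Path

# Assumption: a tokenizer directory contains at least one of these files.
TOKENIZER_FILES = ("tokenizer.model", "tokenizer.json")


def has_tokenizer(tokenizer_dir) -> bool:
    """Return True if the directory exists and holds a known tokenizer file."""
    directory = Path(tokenizer_dir)
    return directory.is_dir() and any(
        (directory / name).is_file() for name in TOKENIZER_FILES
    )


if __name__ == "__main__":
    # Same value as tokenizer_dir in the config.
    path = "checkpoints/mistralai/Mistral-7B-v0.1"
    status = "found" if has_tokenizer(path) else "missing"
    print(f"tokenizer {status} in {path}")
```

Running this before `pretrain.py` reports whether the tokenizer files are in place, so a missing download shows up before the training run is launched.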
Commands:
```bash
+litgpt download --repo_id mistralai/Mistral-7B-v0.1 --tokenizer_only true
+
python extensions/thunder/pretrain.py --config config.yaml --compiler null --train.global_batch_size 32
python extensions/thunder/pretrain.py --config config.yaml --executors '[torchcompile_complete]' --train.global_batch_size 32
python extensions/thunder/pretrain.py --config config.yaml --executors '[sdpa, torchcompile, nvfuser, torch]' --train.global_batch_size 32