fix tokenizer
ysjprojects committed Dec 3, 2024
1 parent 5c726b0 commit 6dd3353
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion litgpt/tokenizer.py
@@ -94,7 +94,7 @@ def check_if_bos_token_used(self, checkpoint_dir: Path) -> bool:
             config = json.load(fp)
         # for LlaMA-3 tokenizer there is no `add_bos_token` at all and `tokenizer_class` is only
         # `PreTrainedTokenizerFast`
-        if checkpoint_dir.stem.startswith(("Meta-Llama-3", "Llama-3")):
+        if checkpoint_dir.stem.startswith(("Meta-Llama-3", "Llama-3", "SmolLM2")):
             return True
         if "add_bos_token" in config:
             return config["add_bos_token"]
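
The change relies on `str.startswith` accepting a tuple of prefixes, so adding "SmolLM2" extends the existing check without a new branch. The following is a minimal standalone sketch of that logic, not litgpt's actual method: the function name `bos_token_used`, the `BOS_PREFIXES` constant, the `config` argument, and the final `return False` are assumptions made for illustration, since the rest of the real method is not visible in this diff.

```python
from pathlib import Path

# Prefixes copied from the diff; str.startswith tests every entry of a tuple.
BOS_PREFIXES = ("Meta-Llama-3", "Llama-3", "SmolLM2")


def bos_token_used(checkpoint_dir: Path, config: dict) -> bool:
    """Hypothetical stand-in for the check shown in the diff."""
    # Path.stem drops everything after the last dot, so "SmolLM2-1.7B"
    # becomes "SmolLM2-1"; the prefix match still succeeds.
    if checkpoint_dir.stem.startswith(BOS_PREFIXES):
        return True
    if "add_bos_token" in config:
        return config["add_bos_token"]
    # Assumption: the remainder of the real method is not shown in the diff,
    # so this sketch simply falls back to False.
    return False
```

Because the directory-name check runs first, a matching checkpoint returns True even when its config sets `add_bos_token` to false. A call such as `bos_token_used(Path("checkpoints/HuggingFaceTB/SmolLM2-135M"), {})` (an illustrative path) takes the first branch.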