One more hint for what's going on.

allenai · Nov 28, 2024 · b41634f
1 parent d74e835
Showing 1 changed file with 4 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -94,6 +94,8 @@ For the 7B model, we train three times with different data order on 50B high qua
 | random seed 666        | [stage2-ingredient3-step11931-tokens50B](https://huggingface.co/allenai/OLMo-2-1124-7B/tree/stage2-ingredient3-step11931-tokens50B) | [OLMo2-7B-stage2-seed666.yaml](configs/official-1124/OLMo2-7B-stage2-seed666.yaml)     | link to come |
 | **final souped model** | [main](https://huggingface.co/allenai/OLMo-2-1124-7B/tree/main) | no config, we just averaged the weights in Python                                      | |
 
+The training configs linked here are set up to download the latest checkpoint after stage 1, and start training from there.
+
 ### Stage 2 for the 13B
 
 For the 13B model, we train three times with different data order on 100B high quality tokens, and one more time
@@ -107,6 +109,8 @@ on 300B high quality tokens. Then we average ("soup") the models.
 | random seed 2662, 300B | [stage2-ingredient4-step11931-tokens300B](https://huggingface.co/allenai/OLMo-2-1124-13B/tree/stage2-ingredient4-step35773-tokens300B) | [OLMo2-13B-stage2-seed2662-300B.yaml](configs/official-1124/OLMo2-13B-stage2-seed2662-300B.yaml) | link to come |
 | **final souped model** | [main](https://huggingface.co/allenai/OLMo-2-1124-13B/tree/main)                                                                       | no config, we just averaged the weights in Python                                                | |
 
+The training configs linked here are set up to download the latest checkpoint after stage 1, and start training from there.
+
 ## Instruction tuned variants
 
 For instruction tuned variants of these models, go to
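
Note on the "final souped model" rows in the tables above: the README says only that the weights were averaged in Python and gives no script. The following is a minimal sketch of what a uniform average over checkpoints could look like, not the authors' actual code. It assumes equal weighting of all ingredients, and it uses plain lists of floats as a stand-in for real tensors. The name `soup` and the `StateDict` alias are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a real checkpoint: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def soup(state_dicts: Sequence[StateDict]) -> StateDict:
    """Uniformly average parameters that share a name across checkpoints."""
    if not state_dicts:
        raise ValueError("need at least one checkpoint to average")

    # Every ingredient must expose exactly the same parameter names.
    names = set(state_dicts[0])
    for sd in state_dicts[1:]:
        if set(sd) != names:
            raise ValueError("checkpoints must have identical parameter names")

    count = len(state_dicts)
    averaged: StateDict = {}
    for name in state_dicts[0]:
        values = [sd[name] for sd in state_dicts]
        # Every ingredient must also agree on the size of each parameter.
        if len({len(v) for v in values}) != 1:
            raise ValueError(f"parameter {name!r} has mismatched sizes")
        # Element-wise mean across the ingredients.
        averaged[name] = [sum(column) / count for column in zip(*values)]
    return averaged


# Three toy "ingredients", standing in for the three random-seed runs.
ingredients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}, {"w": [5.0, 6.0]}]
print(soup(ingredients))  # {'w': [3.0, 4.0]}
```

With real checkpoints the same loop would run over tensor state dicts, and whether the ingredients were weighted equally is not stated in the README.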