
Question about the OLMo2 Stage 2 training procedures: was the optimizer state from Stage 1 used during the training of Stage 2? #758

Open
Taoer1996 opened this issue Nov 29, 2024 · 2 comments
Labels
type/question An issue that's a question

Comments

@Taoer1996

❓ The question

Thank you for fully open-sourcing OLMo 2; it's really amazing research work!

We have a small question regarding the Stage 2 training process: was the optimizer state from Stage 1 used during the training of Stage 2? If so, will the full checkpoint, including the optimizer state, be open-sourced?

@Taoer1996 Taoer1996 added the type/question An issue that's a question label Nov 29, 2024
@aman-17
Member

aman-17 commented Dec 3, 2024

Hey @Taoer1996, we loaded the final checkpoint from Stage 1 for Stage 2, and the optimizer state wasn't reset. As for open-sourcing the full checkpoints with the optimizer state, we are planning to release them; I'll keep you posted if there are any updates on this.
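For readers unfamiliar with what "the optimizer state wasn't reset" means in practice, here is a minimal PyTorch sketch (not the actual OLMo training code, and the model/filenames are hypothetical): resuming Stage 2 from a full checkpoint means restoring both the model weights and the optimizer's state dict (e.g., Adam's moment estimates and step counts), rather than constructing a fresh optimizer with zeroed state.

```python
import torch

# Hypothetical stand-in for the Stage-1 model and optimizer.
model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# A single step stands in for Stage-1 training, so the optimizer
# accumulates some state (Adam's exp_avg / exp_avg_sq moments).
loss = model(torch.randn(2, 4)).sum()
loss.backward()
optimizer.step()

# Save a *full* checkpoint: weights plus optimizer state.
checkpoint = {
    "model": model.state_dict(),
    "optimizer": optimizer.state_dict(),
}
torch.save(checkpoint, "stage1_final.pt")

# Stage 2: load both, so optimization resumes with the Stage-1
# moments instead of a freshly initialized (reset) optimizer.
model2 = torch.nn.Linear(4, 4)
optimizer2 = torch.optim.AdamW(model2.parameters(), lr=1e-4)
ckpt = torch.load("stage1_final.pt")
model2.load_state_dict(ckpt["model"])
optimizer2.load_state_dict(ckpt["optimizer"])
```

If only the model weights were released, one could still continue training, but the optimizer would warm up from empty state, which is what the question above is getting at.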

@Taoer1996
Author

> Hey @Taoer1996, we loaded the final checkpoint from Stage-1 for Stage-2, and the optimizer state wasn't reset. As for open-sourcing the full checkpoints with the optimizer state, we are planning to release them, I'll keep you posted if there are any updates on this.

Thanks for clarifying that! Really excited to see what comes next.

2 participants