feat: Modify Torchtune to train Ichigo Qwen #125
Sync Torchtune from Upstream
Possible solution: I think the implication behind this mismatch is that they want to optimize the training process by padding the vocab size to a multiple of 128, so we can learn from them. First, let's expand the vocab to the ground-truth vocab (the tokenizer vocab); then we can add the padding embeddings later. Reference from HF: https://github.com/huggingface/transformers/blob/40821a247823b35d7ff10ba490d0d930fe8f5afa/src/transformers/models/idefics2/modeling_idefics2.py#L1289
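A minimal sketch of that idea, assuming a plain PyTorch `nn.Embedding` (the helper `resize_embeddings_with_padding` and its arguments are hypothetical, not an existing torchtune API): expand the table to the tokenizer vocab size, then round the row count up to a multiple of 128.

```python
import torch
import torch.nn as nn

def resize_embeddings_with_padding(
    old_embed: nn.Embedding,
    new_vocab_size: int,
    pad_to_multiple_of: int = 128,
) -> nn.Embedding:
    """Expand an embedding table to `new_vocab_size` (the tokenizer vocab),
    then round the row count up to a multiple of `pad_to_multiple_of`.
    Hypothetical helper; torchtune may handle this differently."""
    # Round the target size up to the next multiple of `pad_to_multiple_of`.
    padded_size = (
        (new_vocab_size + pad_to_multiple_of - 1) // pad_to_multiple_of
    ) * pad_to_multiple_of

    new_embed = nn.Embedding(padded_size, old_embed.embedding_dim)
    nn.init.normal_(new_embed.weight, std=0.02)  # init for new rows (assumed scheme)

    # Copy over the rows that were already trained.
    n_copy = min(old_embed.num_embeddings, padded_size)
    with torch.no_grad():
        new_embed.weight[:n_copy] = old_embed.weight[:n_copy]
    return new_embed
```

The rows between `new_vocab_size` and `padded_size` are never produced by the tokenizer; they exist only so the embedding (and a tied output projection, if any) has a dimension divisible by 128, which is the same effect as HF's `resize_token_embeddings(..., pad_to_multiple_of=128)`.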
I added the training code for the Ichigo Qwen2.5 family (as the base model) to the torchtune codebase. It's in the dev branch and will soon be merged into main. I think we can close this issue. cc @tikikun @hahuyhoang411
Problem
Our training pipeline currently supports the standard Qwen model but requires two critical modifications:
Background
Goal: