Request: Fine-Tuning Support for Any GGUF Model #8598
5 comments · 3 replies
-
@Spider-netizen why not just use a base model in LLaMA-Factory and then convert it to GGUF? With LoRA, you can fine-tune a 7B model with as little as 16 GB of VRAM.
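A rough sketch of the convert step (my own assumption of the workflow, since the comment doesn't spell it out; LLaMA-Factory also has its own export path): merge the LoRA adapter back into the base model with PEFT, then run llama.cpp's HF-to-GGUF converter on the merged checkpoint. The model ID and paths below are placeholders.

```python
# Illustrative sketch only: merge a LoRA adapter into its base model, then convert
# the merged checkpoint to GGUF with llama.cpp's converter script.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"   # placeholder base model
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")

# Load the adapter on top of the base model and fold its weights in.
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()
merged.save_pretrained("merged-7b")
AutoTokenizer.from_pretrained(base_id).save_pretrained("merged-7b")

# Then, from the llama.cpp repo (script name may differ by version):
#   python convert_hf_to_gguf.py merged-7b --outfile merged-7b.gguf
```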
-
Hopefully we will support this in the future, but it will take time to get there (currently we don't even have enough resources to maintain the existing training/finetuning examples).
-
I think BitNet will allow for really resource-efficient fine-tuning, and most importantly won't degrade quality as much as training in 4-bit.
-
For local setups with limited computing resources, I think RAG combined with a GGUF model can produce good results.
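A minimal sketch of what that could look like with llama-cpp-python (my choice of bindings, not named in the comment). Retrieval here is a naive word-overlap score purely to keep the example self-contained; a real setup would use an embedding index. The model path and documents are placeholders.

```python
# Toy RAG loop over a local GGUF model: retrieve a couple of relevant snippets,
# stuff them into the prompt, and generate an answer locally.
from llama_cpp import Llama

docs = [
    "llama.cpp converts Hugging Face models to GGUF with a converter script.",
    "LoRA adapters can be applied to a GGUF model at load time.",
    "GGUF files bundle tensors and metadata in a single file.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    # Naive relevance score: number of shared words between query and document.
    q = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: -len(q & set(d.lower().split())))
    return ranked[:k]

llm = Llama(model_path="model-Q4_K_M.gguf", n_ctx=4096)  # path is illustrative
question = "How do I apply a LoRA adapter to a GGUF model?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"
out = llm.create_completion(prompt, max_tokens=128)
print(out["choices"][0]["text"])
```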
-
For now, you can do a QLoRA fine-tune via Hugging Face PEFT, and the output adapter can be converted to GGUF. Support for switching between different LoRA adapters will be added to the server soon.
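As a rough sketch of that workflow (not an official recipe): the calls below are from Transformers, bitsandbytes, and PEFT; the model ID, target modules, and hyperparameters are placeholder assumptions.

```python
# Minimal QLoRA sketch: load the base model in 4-bit, attach a LoRA adapter,
# train it, and save the adapter for later conversion to GGUF.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "google/gemma-2-9b"  # placeholder; any supported HF causal LM works
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

# ... run your training loop here (e.g. transformers Trainer or trl SFTTrainer) ...

model.save_pretrained("lora-adapter")  # only the adapter weights are saved
# The adapter directory can then be converted to GGUF with the converter shipped
# in llama.cpp (e.g. convert_lora_to_gguf.py; check the repo for the current name)
# and applied at load time.
```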
-
Hi @ggerganov,
Is there any chance of adding support for fine-tuning any GGUF model? I'm especially interested in fine-tuning Gemma 2.
Thank you for your work!