Unfortunately, 12GB is not enough to finetune the 3B model in the standard way (tuning all parameters). Beyond the weights themselves, you also need memory for the gradients and the optimizer state. This Hugging Face blog post briefly describes how much each of those parts contributes to VRAM usage.
For our model, we used a single A100 80GB GPU, and usage metrics showed that more than 70GB of GPU memory was allocated.
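As a rough back-of-envelope sketch of why 12GB falls short: with Adam in mixed precision, each parameter typically costs about 16 bytes (fp16 weight and gradient, fp32 master weight, and two fp32 optimizer moments). The byte counts below are an assumption about a typical mixed-precision setup, not a measurement of this repository's training script, and activation memory (which grows with batch size and sequence length) is excluded.

```python
# Per-parameter cost under mixed-precision Adam (assumed breakdown):
#   fp16 weight (2) + fp16 gradient (2)
#   + fp32 master weight (4) + Adam moment 1 (4) + Adam moment 2 (4)
BYTES_PER_PARAM = 2 + 2 + 4 + 4 + 4  # = 16 bytes

def full_finetune_vram_gb(n_params: float) -> float:
    """Lower-bound VRAM in GB for full finetuning, excluding activations."""
    return n_params * BYTES_PER_PARAM / 1e9

print(f"3B model: ~{full_finetune_vram_gb(3e9):.0f} GB before activations")
```

This already yields roughly 48GB for a 3B model before any activations or buffers, consistent with the >70GB observed on the A100 once activations are included, and well beyond a 12GB card.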
How much VRAM is needed to finetune the 3B model? Is 12GB enough?