In configs.md you mention the --batch-gpu parameter.
You also show an example of running with 1 GPU where you set --batch to 32 and --batch-gpu to 16. What effect does this have? If the batch size is 32 but you restrict the samples per batch per GPU to 16, and you only have one GPU, doesn't that mean your batch size is actually 16?
Hello, I know this is an old question, but what they did is gradient accumulation: say you want to do a backward pass on a batch size of 32, but you can only fit a batch of 16 on your current GPU. So you accumulate the gradients from two forward passes, then do the optimizer step. This is also why, if you specify only --batch and not --batch-gpu, the batch size is divided by the number of GPUs you are using. Hope this helps.
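
To make the idea concrete, here is a minimal PyTorch sketch of gradient accumulation in general, not the repo's actual training loop. The model, data, and learning rate are placeholders; the point is how `--batch=32` with `--batch-gpu=16` on 1 GPU turns into two micro-batch passes whose gradients add up before a single optimizer step.

```python
import torch
import torch.nn.functional as F

# Hypothetical values mirroring the example: total batch 32, 16 per GPU, 1 GPU.
batch_size = 32
batch_gpu  = 16
num_gpus   = 1
num_accum  = batch_size // (batch_gpu * num_gpus)   # -> 2 accumulation rounds

model = torch.nn.Linear(128, 1)                      # stand-in for the real network
opt   = torch.optim.Adam(model.parameters(), lr=1e-3)

data   = torch.randn(batch_size, 128)                # one "logical" batch of 32 samples
target = torch.randn(batch_size, 1)

opt.zero_grad()
for i in range(num_accum):
    # Slice out the micro-batch of 16 samples that actually fits on the GPU.
    lo, hi = i * batch_gpu, (i + 1) * batch_gpu
    loss = F.mse_loss(model(data[lo:hi]), target[lo:hi])
    # Scale by num_accum so the summed gradients match the gradient of the full batch of 32.
    (loss / num_accum).backward()                    # gradients accumulate across the two passes
opt.step()                                           # single optimizer step for the full batch
```

So the effective batch size for the optimizer step is still 32; only the amount of data held in GPU memory at once is capped at 16.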