
v0.28.0: DataLoaderConfig, XLA improvements, FSDP + QLORA foundations, Gradient Synchronization Tweaks, and Bug Fixes

Released by @muellerzr on 12 Mar 16:58

Core

  • Introduce a `DataLoaderConfiguration` and begin deprecation of arguments in the `Accelerator`

```diff
+from accelerate import DataLoaderConfiguration
+dl_config = DataLoaderConfiguration(split_batches=True, dispatch_batches=True)
-accelerator = Accelerator(split_batches=True, dispatch_batches=True)
+accelerator = Accelerator(dataloader_config=dl_config)
```
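The new config gathers all dataloader-related behavior in one object. A minimal sketch, assuming `even_batches` and `use_seedable_sampler` are also part of the migrated field set (the diff above shows only the first two):

```python
from accelerate import Accelerator, DataLoaderConfiguration

# Sketch: collect all dataloader-related options in one place.
# `even_batches` and `use_seedable_sampler` are assumed to belong to the
# migrated field set; the release diff above only shows the first two.
dl_config = DataLoaderConfiguration(
    split_batches=True,        # split each fetched batch across processes
    dispatch_batches=True,     # iterate on the main process, then dispatch
    even_batches=True,         # keep per-process batch sizes equal
    use_seedable_sampler=False,
)
accelerator = Accelerator(dataloader_config=dl_config)
```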
  • Allow gradients to be synced on each data batch while performing gradient accumulation, which is useful when training with FSDP, by @fabianlim in #2531

```python
from accelerate import Accelerator, GradientAccumulationPlugin

plugin = GradientAccumulationPlugin(
    num_steps=2,
    sync_each_batch=True,  # new flag: sync gradients every batch, not only every num_steps
)
accelerator = Accelerator(gradient_accumulation_plugin=plugin)
```
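In practice the flag plugs into the usual accumulation loop: gradients are reduced on every backward pass, while the optimizer still steps once per accumulation window. A minimal sketch (the toy model and data are illustrative):

```python
import torch
from accelerate import Accelerator, GradientAccumulationPlugin

plugin = GradientAccumulationPlugin(num_steps=2, sync_each_batch=True)
accelerator = Accelerator(gradient_accumulation_plugin=plugin)

model = torch.nn.Linear(16, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
model, optimizer = accelerator.prepare(model, optimizer)

for step in range(8):
    inputs = torch.randn(4, 16, device=accelerator.device)
    targets = torch.randint(0, 2, (4,), device=accelerator.device)
    # With sync_each_batch=True, gradients are all-reduced on every backward
    # pass; the prepared optimizer still only steps every `num_steps` batches.
    with accelerator.accumulate(model):
        loss = torch.nn.functional.cross_entropy(model(inputs), targets)
        accelerator.backward(loss)
        optimizer.step()
        optimizer.zero_grad()
```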

Torch XLA

  • Support for XLA on the GPU by @anw90 in #2176
  • Enable gradient accumulation on TPU in #2453 (see the sketch after this list)
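Both changes reuse the existing Accelerator API rather than introducing XLA-specific calls. A minimal sketch of gradient accumulation that now also runs on a TPU, assuming the script is started with `accelerate launch` in an XLA environment (the toy model is illustrative):

```python
import torch
from accelerate import Accelerator

# The same accumulation API now works when the backing device is a TPU.
accelerator = Accelerator(gradient_accumulation_steps=4)

model = torch.nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
model, optimizer = accelerator.prepare(model, optimizer)

for step in range(16):
    x = torch.randn(2, 8, device=accelerator.device)
    with accelerator.accumulate(model):
        loss = model(x).pow(2).mean()
        accelerator.backward(loss)  # scales the loss for accumulation
        optimizer.step()
        optimizer.zero_grad()
```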

FSDP

  • Lay the foundations for downstream FSDP + QLoRA support by allowing configuration of buffer precision, by @pacman100 in #2544
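The usual way to express buffer precision with FSDP is through torch's `MixedPrecision` policy. A minimal sketch, assuming the plugin's `mixed_precision_policy` field is the knob exposed here (the exact wiring lives in #2544):

```python
import torch
from torch.distributed.fsdp import MixedPrecision
from accelerate import Accelerator, FullyShardedDataParallelPlugin

# Sketch: keep FSDP buffers in full precision while parameters and gradient
# reduction use bf16, the kind of control QLoRA-style setups need.
# Assumes `mixed_precision_policy` is the field configured by this change.
fsdp_plugin = FullyShardedDataParallelPlugin(
    mixed_precision_policy=MixedPrecision(
        param_dtype=torch.bfloat16,
        reduce_dtype=torch.bfloat16,
        buffer_dtype=torch.float32,
    ),
)
accelerator = Accelerator(fsdp_plugin=fsdp_plugin)
```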

`launch` changes

What's Changed

New Contributors

Full Changelog: v0.27.2...v0.28.0