Release V0.10.0 DeepSpeed integration revamp and TPU speedup · huggingface/accelerate

This release adds two major new features: the DeepSpeed integration has been revamped to match the one in Transformers Trainer, with multiple new options unlocked, and the TPU integration has been sped up.

This version also officially drops support for Python 3.6 and requires Python 3.7+.

DeepSpeed integration revamp

Users can now specify a DeepSpeed config file when they want to use DeepSpeed, which unlocks many new options. More details in the new documentation.
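As a rough illustration of what such a config file might contain, here is a minimal sketch that writes one to disk with the Python standard library. The keys (`train_micro_batch_size_per_gpu`, `gradient_accumulation_steps`, `fp16`, `zero_optimization`) are standard DeepSpeed config options, but the values, the file name `ds_config.json`, and the helper `write_ds_config` are assumptions chosen for the example, not settings recommended by this release.

```python
import json
import tempfile
from pathlib import Path

# Illustrative values only; tune these for your own hardware and model.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "gradient_accumulation_steps": 1,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
}


def write_ds_config(config: dict, directory: str) -> Path:
    """Hypothetical helper: serialize a DeepSpeed config dict to a JSON file."""
    path = Path(directory) / "ds_config.json"
    path.write_text(json.dumps(config, indent=2))
    return path


# Write the config into a temporary directory and read it back to check it.
tmp_dir = tempfile.mkdtemp()
config_path = write_ds_config(ds_config, tmp_dir)
loaded = json.loads(config_path.read_text())
print(f"Wrote {config_path.name} with ZeRO stage {loaded['zero_optimization']['stage']}")
```

The resulting file is what you would point Accelerate at when enabling DeepSpeed; the documentation mentioned above describes exactly how to pass it.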

Migrate HFDeepSpeedConfig from transformers to accelerate by @pacman100 in #432
DeepSpeed Revamp by @pacman100 in #405

TPU speedup

If you're using TPUs, the dataloaders and models are now noticeably faster, and a few bugs have been fixed.

Revamp TPU internals to be more efficient + enable mixed precision types by @muellerzr in #441

What's new?

Fix docstring by @muellerzr in #447
Add psutil as dependency by @sgugger in #445
fix fsdp torch version dependency by @pacman100 in #437
Create Gradient Accumulation Example by @muellerzr in #431
init by @muellerzr in #429
Introduce no_sync context wrapper + clean up some more warnings for DDP by @muellerzr in #428
updating tests to resolve runner failures wrt deepspeed revamp by @pacman100 in #427
Fix secrets in Docker workflow by @muellerzr in #426
Introduce a Dependency Checker to trigger new Docker Builds on main by @muellerzr in #424
Enable slow tests nightly by @muellerzr in #421
Push out python 3.6 + fix all tests related to the upgrade by @muellerzr in #420
Speedup main CI by @muellerzr in #419
Switch to evaluate for metrics by @sgugger in #417
Create an issue template for Accelerate by @muellerzr in #415
Introduce post-merge runners by @muellerzr in #416
Fix debug_launcher issues by @muellerzr in #413
Use main egg by @muellerzr in #414
Introduce nightly runners by @muellerzr in #410
Update requirements to pin tensorboard and include psutil by @muellerzr in #408
Fix CUDA examples tests by @muellerzr in #407
Move datasets and transformers to under func by @muellerzr in #411
Fix CUDA Dockerfile by @muellerzr in #409
Hotfix all failing GPU tests by @muellerzr in #401
improve metrics logged in examples by @pacman100 in #399
Refactor offload_state_dict and fix in offload_weight by @sgugger in #398
Refactor version checking into a utility by @muellerzr in #395
Include fastai in frameworks by @muellerzr in #396
Add packaging to requirements by @muellerzr in #394
Better dispatch for submodules by @sgugger in #392
Build Docker Images nightly by @muellerzr in #391
Small bugfix for the stalebot workflow by @muellerzr in #390
Introduce stalebot by @muellerzr in #387
Create Dockerfiles for Accelerate by @muellerzr in #377
Mix precision -> Mixed precision by @muellerzr in #388
Fix OneCycle step length when in multiprocess by @muellerzr in #385
