Update dependency pytorch-lightning to v2.5.0 #18
This PR contains the following updates:
- pytorch-lightning: `2.4.0` -> `2.5.0`
Release Notes
Lightning-AI/lightning (pytorch-lightning)
v2.5.0: Lightning v2.5
Lightning AI ⚡ is excited to announce the release of Lightning 2.5.
Lightning 2.5 comes with improvements on several fronts, with zero API changes. Our users love it stable, we keep it stable 😄.
Talking about love ❤️, the `lightning`, `pytorch-lightning` and `lightning-fabric` packages are collectively getting more than 10M downloads per month 😮, for a total of over 180M downloads 🤯 since the early days. It's incredible to see PyTorch Lightning getting such strong adoption across industry and the sciences.
Release 2.5 embraces PyTorch 2.5, and it marks some of PyTorch's more recent directions as officially supported, namely tensor subclass-based APIs like Distributed Tensors and TorchAO, in combination with `torch.compile`.
Here are a couple of examples:
Distributed FP8 transformer with PyTorch Lightning
Full example here
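The full example linked above is not reproduced in this PR description, so here is a minimal, hedged sketch of the pattern it demonstrates: converting linear layers to FP8 with TorchAO, sharding with FSDP2, and compiling, all inside the `configure_model` hook under `ModelParallelStrategy`. The toy model, sizes, and the `device_mesh` key used here are illustrative assumptions rather than the code from the linked example, and running it needs CUDA GPUs with FP8 support plus the `torchao` package.

```python
import lightning as L
import torch
import torch.nn as nn
from lightning.pytorch.strategies import ModelParallelStrategy
from torch.distributed._composable.fsdp import fully_shard  # FSDP2 (PyTorch 2.5-era import path)
from torchao.float8 import convert_to_float8_training       # TorchAO FP8


class LitFeedForward(L.LightningModule):
    """Toy feed-forward stack standing in for the transformer in the real example."""

    def __init__(self, dim: int = 512, hidden: int = 4096):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def configure_model(self):
        # Swap eligible nn.Linear layers for their FP8-training variants (TorchAO).
        convert_to_float8_training(self.model)
        # FSDP2: shard parameters over the mesh that ModelParallelStrategy builds;
        # the "data_parallel" key is an assumption about how the mesh is exposed.
        fully_shard(self.model, mesh=self.device_mesh["data_parallel"])
        # Compile after parallelization, directly in the hook.
        self.model = torch.compile(self.model)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.mse_loss(self.model(x), y)
        self.log("loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=3e-4)


if __name__ == "__main__":
    dataset = torch.utils.data.TensorDataset(torch.randn(64, 512), torch.randn(64, 512))
    loader = torch.utils.data.DataLoader(dataset, batch_size=8)
    trainer = L.Trainer(
        accelerator="cuda",
        devices=2,
        strategy=ModelParallelStrategy(data_parallel_size=2, tensor_parallel_size=1),
        max_steps=10,
    )
    trainer.fit(LitFeedForward(), loader)
```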
Distributed FP8 transformer with Fabric
Full example here
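Likewise for Fabric, the full example sits behind the link; the sketch below only hints at its shape, under the assumption that Fabric's `ModelParallelStrategy` accepts a parallelize function of the form `(module, device_mesh) -> module`, as the prose below describes. The model, sizes, and mesh key are again placeholders.

```python
import lightning as L
import torch
import torch.nn as nn
from lightning.fabric.strategies import ModelParallelStrategy
from torch.distributed._composable.fsdp import fully_shard  # FSDP2 (PyTorch 2.5-era import path)
from torchao.float8 import convert_to_float8_training       # TorchAO FP8


def parallelize(model: nn.Module, device_mesh) -> nn.Module:
    # Runs on every rank: FP8 conversion, then FSDP2 sharding over the
    # (assumed) "data_parallel" sub-mesh, then compilation.
    convert_to_float8_training(model)
    fully_shard(model, mesh=device_mesh["data_parallel"])
    return torch.compile(model)


if __name__ == "__main__":
    fabric = L.Fabric(
        accelerator="cuda",
        devices=2,
        strategy=ModelParallelStrategy(parallelize),
    )
    fabric.launch()

    with fabric.init_module(empty_init=True):
        model = nn.Sequential(nn.Linear(512, 4096), nn.GELU(), nn.Linear(4096, 512))

    model = fabric.setup(model)  # the strategy applies parallelize() during setup
    optimizer = fabric.setup_optimizers(torch.optim.AdamW(model.parameters(), lr=3e-4))

    x = torch.randn(8, 512, device=fabric.device)
    loss = nn.functional.mse_loss(model(x), torch.zeros_like(x))
    fabric.backward(loss)
    optimizer.step()
```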
As these examples show, it's now easier than ever to take your PyTorch Lightning module and run it with FSDP2 and/or tensor parallelism in FP8 precision, using the `ModelParallelStrategy` we introduced in 2.4.
Also note the use of distributed tensor APIs, TorchAO APIs, and `torch.compile` directly in the `configure_model` hook (or in the parallelize function passed to Fabric's `ModelParallelStrategy`), as opposed to the `LightningModule` as a whole. The advantage of this approach is that you can copy-paste the parallelize functions that come with native PyTorch models directly into `configure_model` and get the same effect, no head-scratching involved 🤓.
Talking about head scratching, we also made a pass over the PyTorch Lightning internals and hardened the parts that track progress counters during training, validation, and testing, as well as learning rate scheduling, in relation to resuming from checkpoints. We made sure there are no (to the best of our knowledge) edge cases where stopping and resuming from checkpoints can change the sequence of loops or other internal state. Fault tolerance for the win 🥳!
Alright! Feel free to take a look at the full changelog below.
And of course: the best way to use PyTorch Lightning and Fabric is through Lightning Studio ⚡. Access GPUs, train models, deploy and more with zero setup. Focus on data and models - not infrastructure.
Changes
PyTorch Lightning
Added
- `step` parameter to `TensorBoardLogger.log_hyperparams` to visualize changes during training (#20176)
- `str` method to datamodule (#20301)
- `Trainer.save_checkpoint` (#20405)
Changed
- `resume_from_checkpoint` marked as deprecated (#20477)
- Use `np.random.SeedSequence()` in `pl_worker_init_function()` to robustly seed NumPy-dependent dataloader workers (#20369)
- `2.5` (#20351)
- `_` (#20221)
- `BytesIO` as path in `.to_onnx()` (#20172)
Removed
- `List[int]` as input type for Trainer when `accelerator="cpu"` (#20399)
Fixed
- `convert_module` in FSDP to avoid using more memory than necessary during initialization (#20323)
- `configure_optimizers` when running with `ReduceLROnPlateau` (#20471)
- `configure_optimizers` example (#20420)
- `_class_path` parameter (#20221)
Lightning Fabric
Added
- `step` parameter to `TensorBoardLogger.log_hyperparams` to visualize changes during training (#20176)
- `ddp_find_unused_parameters_true` alias in Fabric's DDPStrategy (#20125)
Changed
- Use `np.random.SeedSequence()` in `pl_worker_init_function()` to robustly seed NumPy-dependent dataloader workers (#20369) (see the sketch after this changelog)
- `2.5` (#20351)
Removed
Fixed
- `convert_module` in FSDP to avoid using more memory than necessary during initialization (#20323)
Full commit list: `2.4.0` -> `2.5.0`
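As a small aside on the seeding change mentioned in both changelogs: the point of `np.random.SeedSequence` is that it mixes its inputs well, so workers whose base seeds differ only slightly still get statistically independent NumPy states. The snippet below is a hypothetical `worker_init_fn` illustrating that idea, not Lightning's actual `pl_worker_init_function()`.

```python
import numpy as np
import torch


def seed_numpy_worker(worker_id: int) -> None:
    """Hypothetical DataLoader worker_init_fn illustrating the SeedSequence idea."""
    # Each worker already receives a distinct torch base seed from the DataLoader.
    base_seed = torch.initial_seed() % 2**32
    # SeedSequence spreads nearby inputs (base_seed, worker_id) into
    # well-separated entropy pools and expands them into seed words.
    ss = np.random.SeedSequence([base_seed, worker_id])
    np.random.seed(ss.generate_state(4))  # seed the global legacy NumPy RNG


# Usage sketch: pass it to a DataLoader so NumPy-based augmentations differ across workers.
# loader = torch.utils.data.DataLoader(dataset, num_workers=4, worker_init_fn=seed_numpy_worker)
```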
Contributors
We thank all the folks who submitted issues, features, fixes and doc changes. It's the only way we can collectively make Lightning ⚡ better for everyone. Nice job!
In particular, we would like to thank the authors of the pull-requests above, in no particular order:
@ringohoffman @MrWhatZitToYaa @jedyang97 @chualanagit @lantiga @AlessandroW @kazuar @t-vi @01AbhiSingh @WangYue0000 @amorehead @EricCousineau-TRI @mauvilsa @Borda @pete-mcelroy @ali-alshaar7 @GdoongMathew @farhadrgh @tshu-w @LukasSalchow @awindmann @dadwadw233 @qingquansong
Thank you ❤️ and we hope you'll keep them coming!
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
This PR was generated by Mend Renovate. View the repository job log.