torchzero

This is a work-in-progress optimizer library for PyTorch with composable zeroth-, first- and second-order and quasi-Newton methods, gradient approximation, line searches and a whole lot of other stuff.

Most optimizers are modular, meaning you can chain them like this:

optimizer = torchzero.optim.Modular(model.parameters(), [*list of modules*])

For example, you might use [ClipNorm(4), LR(1e-3), NesterovMomentum(0.9)] for standard SGD with gradient clipping and Nesterov momentum. Move ClipNorm to the end to clip the update instead of the gradients. If you don't have access to gradients, add a RandomizedFDM() at the beginning to approximate them via randomized finite differences. Add Cautious() to make the optimizer cautious.
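
As a minimal sketch of this composition, assuming the modules are importable directly from torchzero.modules and take the arguments shown above:

import torch
import torchzero
from torchzero.modules import ClipNorm, LR, NesterovMomentum

model = torch.nn.Linear(10, 1)

# clip the gradient norm, scale by the learning rate, then apply Nesterov momentum
optimizer = torchzero.optim.Modular(
    model.parameters(),
    [ClipNorm(4), LR(1e-3), NesterovMomentum(0.9)],
)

# moving ClipNorm to the end would clip the final update instead of the gradients:
# [LR(1e-3), NesterovMomentum(0.9), ClipNorm(4)]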

Each module takes the update produced by the previous module and transforms it further. That way there is no need to reimplement things like Laplacian smoothing for every optimizer, and it is easy to experiment with grafting, interpolation between different optimizers, and perhaps some weirder combinations like nested momentum.

How to use

All modules are defined in torchzero.modules. You can generally mix and match them however you want. Some pre-made optimizers are available in torchzero.optim.

Some optimizers require a closure, which should look like this:

def closure(backward = True):
  preds = model(inputs)
  loss = loss_fn(preds, targets)

  # if you can't call loss.backward() and use gradient-free methods instead,
  # note that they always call the closure with backward=False,
  # so you can remove the block below but keep the unused backward argument.
  if backward:
    optimizer.zero_grad()
    loss.backward()
  return loss

optimizer.step(closure)

This closure will also work with all built-in PyTorch optimizers, including LBFGS, with all optimizers in this library, and with most custom ones.
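
As an end-to-end sketch, here is the same closure driving a gradient-free setup built from the RandomizedFDM and LR modules mentioned earlier (the model, data and module arguments are placeholders, and the torchzero.modules import path is assumed):

import torch
import torchzero
from torchzero.modules import RandomizedFDM, LR

model = torch.nn.Linear(10, 1)
inputs = torch.randn(32, 10)
targets = torch.randn(32, 1)
loss_fn = torch.nn.MSELoss()

# approximate gradients via randomized finite differences, then apply a learning rate
optimizer = torchzero.optim.Modular(model.parameters(), [RandomizedFDM(), LR(1e-2)])

def closure(backward=True):
    preds = model(inputs)
    loss = loss_fn(preds, targets)
    if backward:  # RandomizedFDM calls the closure with backward=False, so this branch is skipped
        optimizer.zero_grad()
        loss.backward()
    return loss

for _ in range(100):
    optimizer.step(closure)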

Contents

Proper docs with a more exhaustive list and explanations are planned. A preliminary list of all modules is available at https://torchzero.readthedocs.io/en/latest/autoapi/torchzero/modules/index.html#classes. For now, I hope everything is reasonably straightforward to use.

All modules should be quite fast, especially on models with many separate parameter tensors, thanks to _foreach operations.

I am getting to the point where I can start focusing on good docs and tests. As of now, the code should be considered experimental, untested and subject to change, so feel free to use it, but be careful when relying on it for an actual project.

Wrappers

scipy.optimize.minimize wrapper

A scipy.optimize.minimize wrapper with support for both the gradient and the Hessian via batched autograd:

from torchzero.optim.wrappers.scipy import ScipyMinimize
opt = ScipyMinimize(model.parameters(), method = 'trust-krylov')

Use it as you would any other optimizer (make sure the closure accepts the backward argument, like the one from How to use). Note that it performs a full minimization on each step.
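
For illustration, a minimal step sketch under the same closure convention (model, inputs, targets and loss_fn are placeholders):

def closure(backward=True):
    preds = model(inputs)
    loss = loss_fn(preds, targets)
    if backward:
        opt.zero_grad()
        loss.backward()
    return loss

opt.step(closure)  # a single step runs a full scipy.optimize.minimize call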

Nevergrad wrapper

import nevergrad as ng
opt = NevergradOptimizer(model.parameters(), ng.optimizers.NGOptBase, budget = 1000)

Use it as you would any other optimizer (make sure the closure accepts the backward argument, like the one from How to use).