Merge pull request #77 from tfjgeorge/tfjgeorge-patch-1

Update Readme.md
tfjgeorge · Dec 5, 2023 · 304bffe
2 parents 7e0e88f + 542383b
Showing 1 changed file with 6 additions and 4 deletions.
diff --git a/Readme.md b/Readme.md
@@ -2,18 +2,18 @@
 
 ![Build Status](https://github.com/tfjgeorge/nngeometry/actions/workflows/nngeometry.yml/badge.svg) [![codecov](https://codecov.io/gh/tfjgeorge/nngeometry/branch/master/graph/badge.svg)](https://codecov.io/gh/tfjgeorge/nngeometry) [![DOI](https://zenodo.org/badge/208082966.svg)](https://zenodo.org/badge/latestdoi/208082966) [![PyPI version](https://badge.fury.io/py/nngeometry.svg)](https://badge.fury.io/py/nngeometry)
 
-
-
 NNGeometry allows you to:
- - compute **Fisher Information Matrices** (FIM) or derivates, using efficient approximations such as low-rank matrices, KFAC, diagonal and so on.
+ - compute Gauss-Newton or **Fisher Information Matrices** (FIM), as well as any matrix that can be written as the covariance of gradients w.r.t. parameters, using efficient approximations such as low-rank matrices, KFAC, EKFAC, diagonal and so on.
  - compute finite-width **Neural Tangent Kernels** (Gram matrices), even for multiple output functions.
  - compute **per-example Jacobians** of the loss w.r.t. network parameters, or of any function such as the network's output.
  - easily and efficiently compute linear algebra operations involving these matrices **regardless of their approximation**.
  - compute **implicit** operations on these matrices that do not require explicitly storing large matrices that would not fit in memory.
 
+It offers a high-level abstraction over the parameter and function spaces described by neural networks. As a simple example, a parameter space vector `PVector` actually contains the weight matrices, bias vectors, or convolution kernels of the whole neural network (a set of tensors). Using NNGeometry's API, performing a step in parameter space (e.g. an update of your favorite optimization algorithm) is abstracted as a Python addition: `w_next = w_previous + epsilon * delta_w`.
+
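To illustrate the idea of treating a set of tensors as one parameter-space vector, here is a minimal pure-Python sketch. The `ParamVector` class and its dict-of-lists storage are hypothetical stand-ins written for this note; they are not NNGeometry's `PVector` and make no claim about how the library implements it.

```python
class ParamVector:
    """Hypothetical stand-in for a parameter-space vector: a named set of flat arrays."""

    def __init__(self, tensors):
        # tensors: dict mapping a parameter name to a list of floats
        self.tensors = {name: list(values) for name, values in tensors.items()}

    def __add__(self, other):
        # Elementwise sum, parameter by parameter
        return ParamVector({
            name: [a + b for a, b in zip(values, other.tensors[name])]
            for name, values in self.tensors.items()
        })

    def __rmul__(self, scalar):
        # scalar * vector rescales every entry of every parameter
        return ParamVector({
            name: [scalar * v for v in values]
            for name, values in self.tensors.items()
        })


# Toy "network" with one weight array and one bias array (made-up values)
w_previous = ParamVector({"weight": [1.0, 2.0], "bias": [0.5]})
delta_w = ParamVector({"weight": [-1.0, 1.0], "bias": [2.0]})
epsilon = 0.5

# One step in parameter space, written as a single addition
w_next = w_previous + epsilon * delta_w
print(w_next.tensors)
```

The point of the sketch is only that the update touches every parameter group at once, so calling code never loops over individual tensors.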
 ## Example
 
-In the Elastic Weight Consolidation continual learning technique, you want to compute <img src="https://render.githubusercontent.com/render/math?math=\left(\mathbf{w}-\mathbf{w}_{A}\right)^{\top}F\left(\mathbf{w}-\mathbf{w}_{A}\right)">. It can be achieved with a diagonal approximation for the FIM using: 
+In the Elastic Weight Consolidation continual learning technique, you want to compute $`\left(\mathbf{w}-\mathbf{w}_{A}\right)^{\top}F\left(\mathbf{w}-\mathbf{w}_{A}\right)`$. It can be achieved with a diagonal approximation for the FIM using: 
 ```python
 F = FIM(model=model,
         loader=loader,
@@ -22,6 +22,8 @@ F = FIM(model=model,
 
 regularizer = F.vTMv(w - w_a)
 ```
+The first statement instantiates a diagonal matrix, and populates it with the diagonal coefficients of the FIM of the model `model` computed using the examples from the dataloader `loader`.
+
 If the diagonal approximation is not sufficiently accurate, you can instead choose a KFAC approximation by changing `PMatDiag` to `PMatKFAC` in the above. Note that the computation internally involves very different operations depending on the chosen representation (e.g. KFAC, EKFAC, ...).
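For intuition about what the quadratic form reduces to under a diagonal approximation, the following pure-Python sketch computes $`v^{\top} F v`$ when only the diagonal of $`F`$ is kept. The function name `diag_vTMv` and all numbers are assumptions made up for illustration; this is not NNGeometry's implementation of `vTMv`.

```python
def diag_vTMv(diag, v):
    """Quadratic form v^T diag(d) v, which reduces to sum_i d_i * v_i**2."""
    return sum(d * x * x for d, x in zip(diag, v))


# Made-up diagonal Fisher coefficients and parameter values
fisher_diag = [2.0, 0.5, 1.0]
w = [1.0, 3.0, -1.0]      # current parameters
w_a = [0.0, 1.0, 1.0]     # parameters after the previous task

diff = [wi - wai for wi, wai in zip(w, w_a)]
regularizer = diag_vTMv(fisher_diag, diff)
print(regularizer)
```

With a diagonal matrix the cost is linear in the number of parameters; a KFAC or EKFAC representation evaluates the same quadratic form through different internal operations.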
 
 ## Documentation