Post-merge file & documentation fixes
calpt committed Mar 23, 2022
1 parent fb2beba commit d8bcf37
Showing 18 changed files with 80 additions and 158 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -60,7 +60,7 @@ To get started with adapters, refer to these locations:
- **[Colab notebook tutorials](https://github.com/Adapter-Hub/adapter-transformers/tree/master/notebooks)**, a series of notebooks providing an introduction to all the main concepts of (adapter-)transformers and AdapterHub
- **https://docs.adapterhub.ml**, our documentation on training and using adapters with _adapter-transformers_
- **https://adapterhub.ml** to explore available pre-trained adapter modules and share your own adapters
- **[Examples folder](https://github.com/Adapter-Hub/adapter-transformers/tree/master/examples)** of this repository containing HuggingFace's example training scripts, many adapted for training adapters
- **[Examples folder](https://github.com/Adapter-Hub/adapter-transformers/tree/master/examples/pytorch)** of this repository containing HuggingFace's example training scripts, many adapted for training adapters

## Implemented Methods

14 changes: 7 additions & 7 deletions adapter_docs/adapter_composition.md
@@ -14,7 +14,7 @@ model.active_adapters = "adapter_name"

Note that we also could have used `model.set_active_adapters("adapter_name")` which does the same.

```eval_rst
```{eval-rst}
.. important::
``active_adapters`` defines which of the available adapters are used in each forward and backward pass through the model. This means:
@@ -39,7 +39,7 @@ They are presented in more detail in the following.

## `Stack`

```eval_rst
```{eval-rst}
.. figure:: img/stacking_adapters.png
:height: 300
:align: center
@@ -71,7 +71,7 @@ For backwards compatibility, you can still do this, although it is recommended t

## `Fuse`

```eval_rst
```{eval-rst}
.. figure:: img/Fusion.png
:height: 300
:align: center
@@ -98,7 +98,7 @@ model.add_adapter_fusion(["d", "e", "f"])
model.active_adapters = ac.Fuse("d", "e", "f")
```

```eval_rst
```{eval-rst}
.. important::
Fusing adapters with the ``Fuse`` block only works successfully if an adapter fusion layer combining all of the adapters listed in the ``Fuse`` has been added to the model.
This can be done either using ``add_adapter_fusion()`` or ``load_adapter_fusion()``.
@@ -111,7 +111,7 @@ For backwards compatibility, you can still do this, although it is recommended t

## `Split`

```eval_rst
```{eval-rst}
.. figure:: img/splitting_adapters.png
:height: 300
:align: center
@@ -159,7 +159,7 @@ model.active_adapters = ac.BatchSplit("i", "k", "l", batch_sizes=[2, 1, 2])

## `Parallel`

```eval_rst
```{eval-rst}
.. figure:: img/parallel.png
:height: 300
:align: center
@@ -206,7 +206,7 @@ model.active_adapters = ac.Stack("a", ac.Split("b", "c", split_index=60))

However, combinations of adapter composition blocks cannot be arbitrarily deep. All currently supported possibilities are visualized in the figure below.

```eval_rst
```{eval-rst}
.. figure:: img/adapter_blocks_nesting.png
:height: 300
:align: center
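
For context on the composition blocks referenced in this file's hunks, here is a minimal sketch of how the documented `ac.Stack` usage fits together; the base model and adapter names are illustrative assumptions, not part of this commit:

```python
# Illustrative sketch based on adapter_composition.md above; not part of this commit.
import transformers.adapters.composition as ac
from transformers import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")  # assumed base model
for name in ["a", "b", "c"]:
    model.add_adapter(name)

# Stack: the output of adapter "a" feeds into "b", which feeds into "c".
model.active_adapters = ac.Stack("a", "b", "c")
```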
6 changes: 6 additions & 0 deletions adapter_docs/classes/adapter_config.rst
@@ -28,6 +28,12 @@ Single (bottleneck) adapters
.. autoclass:: transformers.ParallelConfig
:members:

.. autoclass:: transformers.CompacterConfig
:members:

.. autoclass:: transformers.CompacterPlusPlusConfig
:members:

Prefix Tuning
~~~~~~~~~~~~~~~~~~~~~~~

3 changes: 0 additions & 3 deletions adapter_docs/conf.py
@@ -6,8 +6,6 @@
import os
import sys

from recommonmark.transform import AutoStructify


# -- Path setup --------------------------------------------------------------

@@ -90,5 +88,4 @@

def setup(app):
app.add_config_value("recommonmark_config", {"enable_eval_rst": True}, True)
app.add_transform(AutoStructify)
app.add_css_file("custom.css")
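
The dropped `AutoStructify` import, together with the switch from `eval_rst` to `{eval-rst}` fenced blocks throughout the docs, points to a move from recommonmark to a MyST-style Markdown parser. A hedged sketch of what the corresponding Sphinx configuration might look like; the `myst_parser` extension and its setup are assumptions, not shown in this diff:

```python
# Hypothetical Sphinx conf.py fragment after moving from recommonmark to MyST.
# The extension name below is an assumption based on the {eval-rst} fences
# introduced in this commit; it is not part of the diff itself.
extensions = [
    "myst_parser",         # parses Markdown and understands {eval-rst} fences
    "sphinx.ext.autodoc",  # needed for the .. autoclass:: directives in the docs
]

def setup(app):
    # No AutoStructify transform is required anymore; MyST handles eval-rst natively.
    app.add_css_file("custom.css")
```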
4 changes: 2 additions & 2 deletions adapter_docs/contributing.md
@@ -1,6 +1,6 @@
# Contributing to AdapterHub

```eval_rst
```{eval-rst}
.. note::
This document describes how to contribute adapters via the AdapterHub `Hub repository <https://github.com/adapter-hub/hub>`_. See `Integration with HuggingFace's Model Hub <huggingface_hub.html>`_ for uploading adapters via the HuggingFace Model Hub.
```
@@ -49,7 +49,7 @@ Let's go through the upload process step by step:
```
`adapter-hub-cli` will search for available adapters in the path you specify and interactively lead you through the packing process.

```eval_rst
```{eval-rst}
.. note::
The configuration of the adapter is specified by an identifier string in the YAML file. This string should refer to an adapter architecture available in the Hub. If you use a new or custom architecture, make sure to also `add an entry for your architecture <#add-a-new-adapter-architecture>`_ to the repo.
```
4 changes: 2 additions & 2 deletions adapter_docs/huggingface_hub.md
@@ -1,6 +1,6 @@
# Integration with HuggingFace's Model Hub

```eval_rst
```{eval-rst}
.. figure:: img/hfhub.svg
:align: center
:alt: HuggingFace Hub logo.
@@ -53,7 +53,7 @@ For more options and information, e.g. for managing models via the CLI and Git,
This will create a repository `my-awesome-adapter` under your username, generate a default adapter card as `README.md` and upload the adapter named `awesome_adapter` together with the adapter card to the new repository.
`adapterhub_tag` and `datasets_tag` provide additional information for categorization.

```eval_rst
```{eval-rst}
.. important::
All adapters uploaded to HuggingFace's Model Hub are automatically also listed on AdapterHub.ml. Thus, for better categorization, either ``adapterhub_tag`` or ``datasets_tag`` is required when uploading a new adapter to the Model Hub.

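
A hedged sketch of the upload call described in this hunk; the repository and adapter names come from the docs above, while the model setup and tag values are illustrative assumptions:

```python
# Illustrative only; assumes a logged-in Hugging Face account and adapter-transformers.
from transformers import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("awesome_adapter")
# ... train the adapter ...

model.push_adapter_to_hub(
    "my-awesome-adapter",                        # new Model Hub repository
    "awesome_adapter",                           # adapter to upload
    adapterhub_tag="sentiment/rotten_tomatoes",  # example categorization tags
    datasets_tag="rotten_tomatoes",
)
```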
2 changes: 1 addition & 1 deletion adapter_docs/installation.md
@@ -3,7 +3,7 @@
Our *adapter-transformers* package is a drop-in replacement for Huggingface's *transformers* library.
It currently supports Python 3.6+ and PyTorch 1.3.1+. You will have to [install PyTorch](https://pytorch.org/get-started/locally/) first.

```eval_rst
```{eval-rst}
.. important::
``adapter-transformers`` is a direct fork of ``transformers``.
This means our package includes all the awesome features of HuggingFace's original package plus the adapter implementation.
2 changes: 1 addition & 1 deletion adapter_docs/loading.md
@@ -117,7 +117,7 @@ The identifier string used to find a matching adapter follows a format consistin

An example of a full identifier following this format might look like `qa/squad1.1@example-org`.

```eval_rst
```{eval-rst}
.. important::
   In many cases, you don't have to give the full string identifier with all three components to successfully load an adapter from the Hub. You can drop the ``<username>`` if you don't care about the uploader of the adapter. Also, if the resulting identifier is still unique, you can drop the ``<task>`` or the ``<subtask>``. So, ``qa/squad1.1``, ``squad1.1`` or ``squad1.1@example-org`` all may be valid identifiers.
```
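
For reference, a hedged sketch of resolving such an identifier with `load_adapter()`; the base model choice and the actual Hub availability of the example identifier are assumptions:

```python
# Illustrative sketch; "qa/squad1.1" is the docs' example identifier above.
from transformers import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
adapter_name = model.load_adapter("qa/squad1.1")  # resolves <task>/<subtask> on the Hub
model.set_active_adapters(adapter_name)
```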
2 changes: 1 addition & 1 deletion adapter_docs/model_overview.md
@@ -3,7 +3,7 @@
This page gives an overview of the Transformer models currently supported by `adapter-transformers`.
The table below further shows which model architectures support which adaptation methods and which features of `adapter-transformers`.

```eval_rst
```{eval-rst}
.. note::
Each supported model architecture X typically provides a class ``XAdapterModel`` for usage with ``AutoAdapterModel``.
Additionally, it is possible to use adapters with the model classes already shipped with HuggingFace Transformers.
8 changes: 4 additions & 4 deletions adapter_docs/overview.md
@@ -35,7 +35,7 @@ config = ... # config class deriving from AdapterConfigBase
model.add_adapter("name", config=config)
```

```eval_rst
```{eval-rst}
.. important::
In literature, different terms are used to refer to efficient fine-tuning methods.
The term "adapter" is usually only applied to bottleneck adapter modules.
@@ -67,7 +67,7 @@ $$
A visualization of further configuration options related to the adapter structure is given in the figure below. For more details, refer to the documentation of [`AdapterConfig`](transformers.AdapterConfig).


```eval_rst
```{eval-rst}
.. figure:: img/architecture.png
:width: 350
:align: center
@@ -120,7 +120,7 @@ model.add_adapter("lang_adapter", config=config)
_Papers:_
- [MAD-X: An Adapter-based Framework for Multi-task Cross-lingual Transfer](https://arxiv.org/pdf/2005.00052.pdf) (Pfeiffer et al., 2020)

```eval_rst
```{eval-rst}
.. note::
V1.x of adapter-transformers made a distinction between task adapters (without invertible adapters) and language adapters (with invertible adapters) with the help of the ``AdapterType`` enumeration.
This distinction was dropped with v2.x.
@@ -171,7 +171,7 @@ for a PHM layer by specifying `use_phm=True` in the config.
The PHM layer has the following additional properties: `phm_dim`, `shared_phm_rule`, `factorized_phm_rule`, `learn_phm`,
`factorized_phm_W`, `shared_W_phm`, `phm_c_init`, `phm_init_range`, `hypercomplex_nonlinearity`

For more information check out the [AdapterConfig](classes/adapter_config.html#transformers.AdapterConfig) class.
For more information check out the [`AdapterConfig`](transformers.AdapterConfig) class.

To add a Compacter to your model you can use the predefined configs:
```python
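
Since the Compacter config snippet itself is collapsed in this hunk, here is a hedged sketch of how the predefined config classes added to `adapter_config.rst` above might be used; the base model name and adapter names are assumptions:

```python
# Illustrative sketch using the config classes documented in this commit.
from transformers import AutoAdapterModel, CompacterConfig, CompacterPlusPlusConfig

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("compacter_adapter", config=CompacterConfig())  # PHM-based bottleneck
model.add_adapter("compacterpp_adapter", config=CompacterPlusPlusConfig())
```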
6 changes: 3 additions & 3 deletions adapter_docs/prediction_heads.md
@@ -3,7 +3,7 @@
This section gives an overview of how different prediction heads can be used together with adapter modules and how pre-trained adapters can be distributed side-by-side with matching prediction heads in AdapterHub.
We will take a look at the `AdapterModel` classes (e.g. `BertAdapterModel`) introduced by adapter-transformers, which provide **flexible** support for prediction heads, as well as models with **static** heads provided out-of-the-box by HuggingFace Transformers (e.g. `BertForSequenceClassification`).

```eval_rst
```{eval-rst}
.. tip::
We recommend to use the `AdapterModel classes <#adaptermodel-classes>`_ whenever possible.
They have been created specifically for working with adapters and provide more flexibility.
@@ -37,7 +37,7 @@ Since we gave the task adapter the same name as our head, we can easily identify
The call to `set_active_adapters()` in the second line tells our model to use the adapter - head configuration we specified by default in a forward pass.
At this point, we can start to [train our setup](training.md).

```eval_rst
```{eval-rst}
.. note::
The ``set_active_adapters()`` will search for an adapter and a prediction head with the given name to be activated.
Alternatively, prediction heads can also be activated explicitly (i.e. without adapter modules).
@@ -87,7 +87,7 @@ In case the classes match, our prediction head weights will be automatically loa

## Automatic conversion

```eval_rst
```{eval-rst}
.. important::
Although the two prediction head implementations serve the same use case, their weights are *not* directly compatible, i.e. you cannot load a head created with ``AutoAdapterModel`` into a model of type ``AutoModelForSequenceClassification``.
There is however an automatic conversion to model classes with flexible heads.
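
A hedged sketch of the flexible-head workflow this hunk refers to; the head type and label count are illustrative assumptions:

```python
# Illustrative sketch of an AdapterModel class with a matching adapter and head.
from transformers import BertAdapterModel

model = BertAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("rotten_tomatoes")
model.add_classification_head("rotten_tomatoes", num_labels=2)
model.set_active_adapters("rotten_tomatoes")  # activates adapter and same-named head
```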
2 changes: 1 addition & 1 deletion adapter_docs/quickstart.md
@@ -6,7 +6,7 @@ Currently, *adapter-transformers* adds adapter components to the PyTorch impleme
For working with adapters, a couple of methods for creation (`add_adapter()`), loading (`load_adapter()`),
storing (`save_adapter()`) and deletion (`delete_adapter()`) are added to the model classes. In the following, we will briefly go through some examples.

```eval_rst
```{eval-rst}
.. note::
This document focuses on the adapter-related functionalities added by *adapter-transformers*.
For a more general overview of the *transformers* library, visit
10 changes: 5 additions & 5 deletions adapter_docs/training.md
@@ -47,7 +47,7 @@ if task_name not in model.config.adapters:
model.train_adapter(task_name)
```

```eval_rst
```{eval-rst}
.. important::
The most crucial step when training an adapter module is to freeze all weights in the model except for those of the
adapter. In the previous snippet, this is achieved by calling the ``train_adapter()`` method which disables training
@@ -90,12 +90,12 @@ python run_glue.py \

The important flag here is `--train_adapter` which switches from fine-tuning the full model to training an adapter module for the given GLUE task.

```eval_rst
```{eval-rst}
.. tip::
Adapter weights are usually initialized randomly. That is why we require a higher learning rate. We have found that a default adapter learning rate of ``1e-4`` works well for most settings.
```

```eval_rst
```{eval-rst}
.. tip::
    Depending on your data set size, you might also need to train longer than usual. To avoid overfitting, you can evaluate the adapters after each epoch on the development set and only save the best model.
```
@@ -129,7 +129,7 @@ python run_mlm.py \
We provide an example for training _AdapterFusion_ ([Pfeiffer et al., 2020](https://arxiv.org/pdf/2005.00247)) on the GLUE dataset: [run_fusion_glue.py](https://github.com/Adapter-Hub/adapter-transformers/blob/master/examples/adapterfusion/run_fusion_glue.py).
You can adapt this script to train AdapterFusion with different pre-trained adapters on your own dataset.

```eval_rst
```{eval-rst}
.. important::
AdapterFusion on a target task is trained in a second training stage, after independently training adapters on individual tasks.
When setting up a fusion architecture on your model, make sure to load the pre-trained adapter modules to be fused using ``model.load_adapter()`` before adding a fusion layer.
@@ -180,7 +180,7 @@ trainer = AdapterTrainer(
data_collator=data_collator,
)
```
```eval_rst
```{eval-rst}
.. tip::
When you migrate from the previous versions, which use the Trainer class for adapter training and fully fine-tuning, note that the
specialized AdapterTrainer class does not have the parameters `do_save_full_model`, `do_save_adapters` and `do_save_adapter_fusion`.
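
To make the freezing behaviour described in this hunk concrete, a hedged sketch; the model and adapter names are illustrative, and the parameter check is only a sanity check, not part of the docs:

```python
# Illustrative sketch: train_adapter() freezes everything except the adapter weights.
from transformers import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("bert-base-uncased")
model.add_adapter("sst-2")
model.train_adapter("sst-2")  # freezes base model weights, activates the adapter for training

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors, e.g. {trainable[0]}")
```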
2 changes: 1 addition & 1 deletion adapter_docs/v2_transition.md
@@ -106,7 +106,7 @@ model.active_adapters = "awesome_adapter"
model(**input_data)
```

```eval_rst
```{eval-rst}
.. note::
Version 2.0.0 temporarily removed the ``adapter_names`` parameter entirely.
Due to user feedback regarding limitations of the ``active_adapters`` property in multi-threaded contexts,
80 changes: 0 additions & 80 deletions examples/README.md

This file was deleted.
