
Merge branch 'xdev-prepare' into dev
nmichlo committed Mar 31, 2022
2 parents 5695747 + 6ed30c8 commit 392934d
Showing 148 changed files with 5,666 additions and 2,813 deletions.
205 changes: 123 additions & 82 deletions README.md
</p>

<p align="center">
<a href="https://choosealicense.com/licenses/mit/" target="_blank">
<img alt="license" src="https://img.shields.io/github/license/nmichlo/disent?style=flat-square&color=lightgrey"/>
</a>
<a href="https://pypi.org/project/disent" target="_blank">
<img alt="python versions" src="https://img.shields.io/pypi/pyversions/disent?style=flat-square"/>
</a>
<a href="https://pypi.org/project/disent" target="_blank">
<img alt="pypi version" src="https://img.shields.io/pypi/v/disent?style=flat-square&color=blue"/>
</a>
<a href="https://github.com/nmichlo/disent/actions?query=workflow%3Atest">

<p align="center">
<p align="center">
Visit the <a href="https://disent.dontpanic.sh/" target="_blank">docs</a> for more info, or browse the <a href="https://github.com/nmichlo/disent/releases">releases</a>.
</p>
<p align="center">
<a href="https://github.com/nmichlo/disent/issues/new/choose">Contributions</a> are welcome!

- [Overview](#overview)
- [Features](#features)
* [Frameworks](#frameworks)
* [Metrics](#metrics)
* [Datasets](#datasets)
* [Schedules & Annealing](#schedules--annealing)
- [Architecture](#architecture)
- [Examples](#examples)
* [Python Example](#python-example)
* [Hydra Config Example](#hydra-config-example)
Please use the following citation if you use Disent in your own research:

----------------------

## Features

Disent includes implementations of modules, metrics and
datasets from various papers.

_Note that "🧵" means that the dataset, framework or metric was introduced by disent!_

### Datasets

Various common datasets used in disentanglement research are included, with hash
verification and automatic chunk-size optimization of the underlying HDF5 format
for low-memory, disk-based access.
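To illustrate why chunking matters for low-memory access (a library-independent sketch of the idea, not disent's actual implementation): when frames are read one at a time, storing each frame as its own HDF5 chunk means a read touches only that frame's bytes on disk, rather than decompressing a large block of neighbouring frames.

```python
# Sketch of per-frame chunking: compute how large one chunk would be if the
# chunk shape is a single image. (Names here are illustrative only.)

def bytes_per_chunk(chunk_shape, itemsize):
    """Number of bytes a single chunk occupies (uncompressed)."""
    n = itemsize
    for dim in chunk_shape:
        n *= dim
    return n

# A hypothetical dataset of 480000 RGB 64x64 uint8 images, stored as (N, H, W, C).
data_shape = (480000, 64, 64, 3)

# Chunk on a per-image basis: 1 along the batch axis, full frames otherwise.
chunk_shape = (1,) + data_shape[1:]

print(bytes_per_chunk(chunk_shape, itemsize=1))  # 12288 bytes per image
```

With this layout, random access to a single observation costs one small chunk read instead of pulling in megabytes of unrelated data.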

Augmentations and transforms of both dataset inputs and targets are supported, and
augmentations can run on the CPU or GPU at different points in the pipeline.
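The input/target split can be illustrated with a minimal, library-independent sketch (the function names below are hypothetical, not disent's API): the same observation is transformed differently for the network input and the reconstruction target, e.g. a noisy input but a clean target.

```python
# Hypothetical sketch: separate input and target transforms for one observation.

def normalise(x):
    """Scale uint8-style values [0, 255] into floats [0, 1]."""
    return [v / 255.0 for v in x]

def add_noise(x, offset=0.1):
    """Stand-in 'augmentation' -- a deterministic brightness shift."""
    return [min(v + offset, 1.0) for v in x]

def prepare(obs, input_transform, target_transform):
    """Return an (input, target) pair from one raw observation."""
    return input_transform(obs), target_transform(obs)

obs = [0, 128, 255]
x, t = prepare(obs, lambda o: add_noise(normalise(o)), normalise)
print(x)  # noisy input, values shifted up by 0.1 and clipped at 1.0
print(t)  # clean target in [0, 1]
```

In a real pipeline the transforms would operate on image tensors, and the augmentation step could be moved onto the GPU after batching.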

- **Ground Truth**:
+ <details>
<summary>🚗 <a href="https://papers.nips.cc/paper/5845-deep-visual-analogy-making" target="_blank">Cars3D</a></summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__cars3d.jpg" alt="Cars3D Dataset Factor Traversals"></p>
</details>

+ <details>
<summary>◻️ <a href="https://github.com/deepmind/dsprites-dataset" target="_blank">dSprites</a></summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__dsprites.jpg" alt="dSprites Dataset Factor Traversals"></p>
</details>

+ <details>
<summary>🔺 <a href="https://arxiv.org/abs/1906.03292" target="_blank">MPI3D</a></summary>
<p align="center">🏗 Todo</p>
</details>

+ <details>
<summary>🐘 <a href="https://cs.nyu.edu/~ylclab/data/norb-v1.0-small/" target="_blank">SmallNORB</a></summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__smallnorb.jpg" alt="Small Norb Dataset Factor Traversals"></p>
</details>

+ <details>
<summary>🌈 <a href="https://github.com/deepmind/3d-shapes" target="_blank">Shapes3D</a></summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__shapes3d.jpg" alt="Shapes3D Dataset Factor Traversals"></p>
</details>

+ <details open>
<summary>
🧵 <u>dSpritesImagenet</u>:
<i>A version of dSprites with the foreground or background deterministically masked out using Tiny ImageNet data.</i>
</summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__dsprites-imagenet-bg-100.jpg" alt="dSpritesImagenet Dataset Factor Traversals"></p>
</details>

- **Ground Truth Synthetic**:
+ <details>
<summary>
🧵 <u>XYObject</u>:
<i>A simplistic version of dSprites with a single square.</i>
</summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__xy-object.jpg" alt="XYObject Dataset Factor Traversals"></p>
</details>

+ <details open>
<summary>
🧵 <u>XYObjectShaded</u>:
<i>The exact same dataset as XYObject, but the ground-truth factors have a different representation.</i>
</summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__xy-object-shaded.jpg" alt="XYObjectShaded Dataset Factor Traversals"></p>
</details>
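The defining property of the ground-truth datasets above is that every observation corresponds to a unique combination of known factor values, so factor tuples can be mapped to flat dataset indices and back. A minimal sketch of that mapping (illustrative only, not disent's actual classes):

```python
# Row-major conversion between ground-truth factor tuples and flat indices.

factor_sizes = (3, 4, 2)  # e.g. a hypothetical (shape, x-position, scale)

def pos_to_idx(pos, sizes):
    """Convert a factor tuple to a flat dataset index (row-major order)."""
    idx = 0
    for p, s in zip(pos, sizes):
        idx = idx * s + p
    return idx

def idx_to_pos(idx, sizes):
    """Inverse of pos_to_idx."""
    pos = []
    for s in reversed(sizes):
        pos.append(idx % s)
        idx //= s
    return tuple(reversed(pos))

assert pos_to_idx((2, 3, 1), factor_sizes) == 23  # last of 3*4*2 = 24 elements
assert idx_to_pos(23, factor_sizes) == (2, 3, 1)
```

This bijection is what makes factor traversals (like the images above) and supervised metrics possible: you can ask for "all observations where only one factor varies".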

### Frameworks

Disent provides the following Auto-Encoders and Variational Auto-Encoders!

- **Unsupervised**:
+ <u>AE</u>: _Auto-Encoder_
+ [VAE](https://arxiv.org/abs/1312.6114): Variational Auto-Encoder
+ [Beta-VAE](https://openreview.net/forum?id=Sy2fzU9gl): VAE with Scaled Loss
+ [DFC-VAE](https://arxiv.org/abs/1610.00291): Deep Feature Consistent VAE
+ [DIP-VAE](https://arxiv.org/abs/1711.00848): Disentangled Inferred Prior VAE
+ [InfoVAE](https://arxiv.org/abs/1706.02262): Information Maximizing VAE
+ [BetaTCVAE](https://arxiv.org/abs/1802.04942): Total Correlation VAE
- **Weakly Supervised**:
+ [Ada-GVAE](https://arxiv.org/abs/2002.02886): Adaptive GVAE, *`AdaVae.cfg(average_mode='gvae')`*, usually better than Ada-ML-VAE below!
+ [Ada-ML-VAE](https://arxiv.org/abs/2002.02886): Adaptive ML-VAE, *`AdaVae.cfg(average_mode='ml-vae')`*
- **Supervised**:

+ <u>TAE</u>: _Triplet Auto-Encoder_
+ [TVAE](https://arxiv.org/abs/1802.04403): Triplet Variational Auto-Encoder
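The adaptive trick behind Ada-GVAE can be sketched without any deep-learning machinery (a hedged illustration of the paper's idea, not disent's implementation): per-latent-dimension KL divergences between the posteriors of an observation pair are thresholded halfway between the smallest and largest value, and dimensions judged "shared" are averaged while the rest are left untouched.

```python
import math

def kl_normal(mu0, var0, mu1, var1):
    """KL( N(mu0, var0) || N(mu1, var1) ) for scalar Gaussians."""
    return 0.5 * (var0 / var1 + (mu1 - mu0) ** 2 / var1 - 1 + math.log(var1 / var0))

def ada_average(mu_a, var_a, mu_b, var_b):
    """Adaptively average the 'shared' latent dims of a posterior pair."""
    kls = [kl_normal(*p) for p in zip(mu_a, var_a, mu_b, var_b)]
    # threshold halfway between the smallest and largest per-dim KL
    thresh = 0.5 * (min(kls) + max(kls))
    shared = [kl < thresh for kl in kls]
    # 'gvae'-style averaging: arithmetic mean of means and of variances
    new_mu = [0.5 * (a + b) if s else a for a, b, s in zip(mu_a, mu_b, shared)]
    new_var = [0.5 * (a + b) if s else a for a, b, s in zip(var_a, var_b, shared)]
    return new_mu, new_var, shared

# dim 0 is nearly identical across the pair (shared), dim 1 differs strongly
new_mu, new_var, shared = ada_average([0.0, 2.0], [1.0, 1.0], [0.05, -2.0], [1.0, 1.0])
print(shared)  # [True, False]
```

The `'ml-vae'` averaging mode would instead combine the two Gaussians as a product of experts; the thresholding step is the same.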

<details><summary><b>🏗 Todo</b>: <i>Many popular disentanglement frameworks still need to be added, please
submit an issue if you have a request for an additional framework.</i></summary><p>

+ FactorVAE
+ GroupVAE
</p></details>

### Metrics
Various metrics are provided by disent that can be used to evaluate the
learnt representations of models that have been trained on ground-truth data.

- **Disentanglement**:
+ [FactorVAE Score](https://arxiv.org/abs/1802.05983)
+ [DCI](https://openreview.net/forum?id=By-7dz-AZ)
+ [MIG](https://arxiv.org/abs/1802.04942)
+ [SAP](https://arxiv.org/abs/1711.00848)
+ [Unsupervised Scores](https://github.com/google-research/disentanglement_lib)
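The intuition behind a gap-based metric like MIG can be shown with a toy sketch (a hedged illustration of the idea, not disent's implementation): given a precomputed matrix of mutual information between each latent and each factor, a factor is well disentangled when one latent captures far more information about it than any other.

```python
# Normalised-gap score over a (hypothetical, precomputed) mutual-information
# matrix: mi[i][j] = mutual information between latent i and factor j.

def mig_from_mi(mi, factor_entropies):
    """Average, over factors, of the gap between the two most informative
    latents, normalised by each factor's entropy."""
    scores = []
    for j, h in enumerate(factor_entropies):
        col = sorted((row[j] for row in mi), reverse=True)
        scores.append((col[0] - col[1]) / h)
    return sum(scores) / len(scores)

# toy example: latent 0 captures factor 0, latent 1 captures factor 1
mi = [
    [0.9, 0.1],
    [0.1, 0.8],
]
print(mig_from_mi(mi, factor_entropies=[1.0, 1.0]))  # 0.75
```

The hard part in practice is estimating the mutual-information matrix itself (usually via discretised latents); the gap computation above is the easy final step.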

<details><summary><b>🏗 Todo</b>: <i>Some popular metrics still need to be added, please submit an issue if you wish to
add your own, or you have a request.</i></summary><p>

+ [DCIMIG](https://arxiv.org/abs/1910.05587)
+ [Modularity and Explicitness](https://arxiv.org/abs/1802.05312)

</p></details>


### Schedules & Annealing

Hyper-parameter annealing is supported through the use of schedules.
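As a rough illustration of what a schedule does (illustrative functions only, not disent's actual API): a schedule maps the global training step to a multiplier for a hyper-parameter such as a VAE's beta.

```python
# Two common annealing shapes: a linear ramp and a repeating (cyclic) ramp.

def linear_schedule(step, start_step, end_step):
    """Ramp 0 -> 1 between start_step and end_step, then stay at 1."""
    if step <= start_step:
        return 0.0
    if step >= end_step:
        return 1.0
    return (step - start_step) / (end_step - start_step)

def cyclic_schedule(step, period):
    """Repeatedly ramp 0 -> 1 over each period (sawtooth annealing)."""
    return (step % period) / period

beta_max = 4.0
beta = beta_max * linear_schedule(step=500, start_step=0, end_step=1000)
print(beta)  # 2.0
```

Registering a schedule to a framework hyper-parameter then just means evaluating such a function each training step and scaling the target value.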

----------------------

## Architecture

The disent module structure:

- `disent.dataset`: dataset wrappers, datasets & sampling strategies
+ `disent.dataset.data`: raw datasets
+ `disent.dataset.sampling`: sampling strategies for `DisentDataset` when multiple elements are required by frameworks, e.g. for triplet loss
+ `disent.dataset.transform`: common data transforms and augmentations
+ `disent.dataset.wrapper`: wrapped datasets are no longer ground-truth datasets; these may have some elements masked out. We can still unwrap these classes to obtain the original datasets for benchmarking.
- `disent.frameworks`: frameworks, including Auto-Encoders and VAEs
+ `disent.frameworks.ae`: Auto-Encoder based frameworks
+ `disent.frameworks.vae`: Variational Auto-Encoder based frameworks
- `disent.metrics`: metrics for evaluating disentanglement using ground truth datasets
- `disent.model`: common encoder and decoder models used for VAE research
- `disent.nn`: torch components for building models including layers, transforms, losses and general maths
- `disent.schedule`: annealing schedules that can be registered to a framework
- `disent.util`: helper classes, functions, callbacks, anything unrelated to a pytorch system/model/framework.
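The sampling strategies mentioned under `disent.dataset.sampling` (e.g. for triplet loss) can be illustrated with a minimal, library-independent sketch; the class and function names below are hypothetical, not disent's actual ones. The key idea: use the ground-truth factors to decide which of two candidates is the positive.

```python
import random

def sample_triplet(factors, anchor_idx, rng=random):
    """Return (anchor, positive, negative) indices from a factor table,
    ordering the two candidates by how many factors they share with the anchor."""
    anchor = factors[anchor_idx]

    def n_shared(idx):
        return sum(a == b for a, b in zip(anchor, factors[idx]))

    others = [i for i in range(len(factors)) if i != anchor_idx]
    a, b = rng.sample(others, 2)
    # the more-similar candidate becomes the positive
    pos, neg = (a, b) if n_shared(a) >= n_shared(b) else (b, a)
    return anchor_idx, pos, neg

factors = [(0, 0), (0, 1), (1, 1), (2, 2)]
anchor, pos, neg = sample_triplet(factors, anchor_idx=0, rng=random.Random(0))
```

A framework like TVAE would then receive all three observations per step and apply a triplet loss over their latent representations.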

**⚠️ The API Is _Mostly_ Stable ⚠️**

Disent is still under development. Features and APIs are subject to change!
However, I will try to minimise the impact of any changes.

A small suite of tests currently exists, which will be expanded upon in time.

**Hydra Experiment Directories**

Easily run experiments with hydra config. Note that these config files
are not included when installing via `pip install`.

- `experiment/run.py`: entrypoint for running basic experiments with [hydra](https://github.com/facebookresearch/hydra) config
- `experiment/config/config.yaml`: the main configuration file; this is probably what you want to edit!
- `experiment/config`: root folder for [hydra](https://github.com/facebookresearch/hydra) config files
- `experiment/util`: various helper code for experiments
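For a sense of what the main config file contains, a hydra `defaults` list selects one option from each config group, roughly like the fragment below. This is only a hypothetical illustration; the real group and option names may differ, so check the files under `experiment/config`.

```yaml
# hypothetical fragment in the style of experiment/config/config.yaml
defaults:
  - framework: betavae
  - dataset: shapes3d
```

Any of these choices can also be overridden from the command line when launching `experiment/run.py`, following hydra's usual `group=option` override syntax.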

----------------------

## Examples

### Python Example

### Why?

- Created as part of my Computer Science MSc, which ended in early 2022.
- I needed custom high quality implementations of various VAE's.
- A pytorch version of [disentanglement_lib](https://github.com/google-research/disentanglement_lib).
- I didn't have time to wait for [Weakly-Supervised Disentanglement Without Compromises](https://arxiv.org/abs/2002.02886) to release
1 change: 1 addition & 0 deletions disent/dataset/__init__.py

# wrapper
from disent.dataset._base import DisentDataset
from disent.dataset._base import DisentIterDataset
