
Commit

Merge branch 'dev' into main
nmichlo committed Mar 31, 2022
2 parents 5695747 + 392934d commit 9c7fe40
Showing 148 changed files with 5,666 additions and 2,813 deletions.
205 changes: 123 additions & 82 deletions README.md
@@ -7,13 +7,13 @@
</p>

<p align="center">
<a href="https://choosealicense.com/licenses/mit/">
<a href="https://choosealicense.com/licenses/mit/" target="_blank">
<img alt="license" src="https://img.shields.io/github/license/nmichlo/disent?style=flat-square&color=lightgrey"/>
</a>
<a href="https://pypi.org/project/disent">
<a href="https://pypi.org/project/disent" target="_blank">
<img alt="python versions" src="https://img.shields.io/pypi/pyversions/disent?style=flat-square"/>
</a>
<a href="https://pypi.org/project/disent">
<a href="https://pypi.org/project/disent" target="_blank">
<img alt="pypi version" src="https://img.shields.io/pypi/v/disent?style=flat-square&color=blue"/>
</a>
<a href="https://github.com/nmichlo/disent/actions?query=workflow%3Atest">
@@ -29,7 +29,7 @@

<p align="center">
<p align="center">
Visit the <a href="https://disent.dontpanic.sh/">docs</a> for more info, or browse the <a href="https://github.com/nmichlo/disent/releases">releases</a>.
Visit the <a href="https://disent.dontpanic.sh/" target="_blank">docs</a> for more info, or browse the <a href="https://github.com/nmichlo/disent/releases">releases</a>.
</p>
<p align="center">
<a href="https://github.com/nmichlo/disent/issues/new/choose">Contributions</a> are welcome!
@@ -42,10 +42,11 @@

- [Overview](#overview)
- [Features](#features)
* [Datasets](#datasets)
* [Frameworks](#frameworks)
* [Metrics](#metrics)
* [Datasets](#datasets)
* [Schedules & Annealing](#schedules--annealing)
- [Architecture](#architecture)
- [Examples](#examples)
* [Python Example](#python-example)
* [Hydra Config Example](#hydra-config-example)
@@ -88,65 +89,94 @@ Please use the following citation if you use Disent in your own research:

----------------------

## Architecture

The disent module structure:
## Features

- `disent.dataset`: dataset wrappers, datasets & sampling strategies
+ `disent.dataset.data`: raw datasets
+ `disent.dataset.sampling`: sampling strategies for `DisentDataset` when multiple elements are required by frameworks, e.g. for triplet loss
+ `disent.dataset.transform`: common data transforms and augmentations
+ `disent.dataset.wrapper`: wrapped datasets are no longer ground-truth datasets; they may have some elements masked out. We can still unwrap these classes to obtain the original datasets for benchmarking.
- `disent.frameworks`: frameworks, including Auto-Encoders and VAEs
+ `disent.frameworks.ae`: Auto-Encoder based frameworks
+ `disent.frameworks.vae`: Variational Auto-Encoder based frameworks
- `disent.metrics`: metrics for evaluating disentanglement using ground truth datasets
- `disent.model`: common encoder and decoder models used for VAE research
- `disent.nn`: torch components for building models including layers, transforms, losses and general maths
- `disent.schedule`: annealing schedules that can be registered to a framework
- `disent.util`: helper classes, functions, callbacks, anything unrelated to a pytorch system/model/framework.
Disent includes implementations of modules, metrics and
datasets from various papers.

**Please Note The API Is Still Unstable ⚠️**
_Note that "🧵" means that the dataset, framework or metric was introduced by disent!_

Disent is still under active development. Features and APIs are mostly stable but may change! A limited
set of tests currently exists and will be expanded upon in time.
### Datasets

**Hydra Experiment Directories**
Various common datasets used in disentanglement research are included, with hash
verification and automatic chunk-size optimization of the underlying HDF5 files for
low-memory, disk-based access.

Easily run experiments with hydra config; these files
are not available from `pip install`.
Transforms and augmentations can be applied to both the input and target elements of the dataset,
on either the CPU or GPU, at different points in the pipeline. A minimal loading sketch is given
after the dataset list below.

- `experiment/run.py`: entrypoint for running basic experiments with [hydra](https://github.com/facebookresearch/hydra) config
- `experiment/config/config.yaml`: main configuration file; this is probably what you want to edit!
- `experiment/config`: root folder for [hydra](https://github.com/facebookresearch/hydra) config files
- `experiment/util`: various helper code for experiments
- **Ground Truth**:
+ <details>
<summary>🚗 <a href="https://papers.nips.cc/paper/5845-deep-visual-analogy-making" target="_blank">Cars3D</a></summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__cars3d.jpg" alt="Cars3D Dataset Factor Traversals"></p>
</details>

+ <details>
<summary>◻️ <a href="https://github.com/deepmind/dsprites-dataset" target="_blank">dSprites</a></summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__dsprites.jpg" alt="dSprites Dataset Factor Traversals"></p>
</details>

+ <details>
<summary>🔺 <a href="https://arxiv.org/abs/1906.03292" target="_blank">MPI3D</a></summary>
<p align="center">🏗 Todo</p>
</details>

+ <details>
<summary>🐘 <a href="https://cs.nyu.edu/~ylclab/data/norb-v1.0-small/" target="_blank">SmallNORB</a></summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__smallnorb.jpg" alt="Small Norb Dataset Factor Traversals"></p>
</details>

+ <details>
<summary>🌈 <a href="https://github.com/deepmind/3d-shapes" target="_blank">Shapes3D</a></summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__shapes3d.jpg" alt="Shapes3D Dataset Factor Traversals"></p>
</details>

+ <details open>
<summary>
🧵 <u>dSpritesImagenet</u>:
    <i>Version of dSprites with the foreground or background deterministically masked out using tiny-imagenet data.</i>
</summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__dsprites-imagenet-bg-100.jpg" alt="dSpritesImagenet Dataset Factor Traversals"></p>
</details>

----------------------
- **Ground Truth Synthetic**:
+ <details>
<summary>
🧵 <u>XYObject</u>:
<i>A simplistic version of dSprites with a single square.</i>
</summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__xy-object.jpg" alt="XYObject Dataset Factor Traversals"></p>
</details>

+ <details open>
<summary>
🧵 <u>XYObjectShaded</u>:
<i>Exact same dataset as XYObject, but ground truth factors have a different representation.</i>
</summary>
<p align="center"><img height="192" src="docs/img/traversals/traversal-transpose__xy-object-shaded.jpg" alt="XYObjectShaded Dataset Factor Traversals"></p>
</details>
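
As a rough, hedged sketch of how the dataset wrappers and transforms described above fit together (the class names `XYObjectData` and `ToImgTensorF32` are assumptions based on the module layout and may differ between versions):

```python
from torch.utils.data import DataLoader

from disent.dataset import DisentDataset
from disent.dataset.data import XYObjectData          # assumed class name for the XYObject dataset
from disent.dataset.transform import ToImgTensorF32   # assumed float32 image transform

# wrap the raw ground-truth dataset so that sampling strategies and transforms are handled for us
data = XYObjectData()
dataset = DisentDataset(dataset=data, transform=ToImgTensorF32())

# a standard pytorch DataLoader can then be used for training
dataloader = DataLoader(dataset=dataset, batch_size=128, shuffle=True)
```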

## Features
### Frameworks

Disent includes implementations of modules, metrics and
datasets from various papers. Please note that items marked
with a "🧵" are introduced in and are unique to disent!
Disent provides the following Auto-Encoder and Variational Auto-Encoder frameworks! A minimal construction sketch is given after the list below.

### Frameworks
- **Unsupervised**:
+ [VAE](https://arxiv.org/abs/1312.6114)
+ [Beta-VAE](https://openreview.net/forum?id=Sy2fzU9gl)
+ [DFC-VAE](https://arxiv.org/abs/1610.00291)
+ [DIP-VAE](https://arxiv.org/abs/1711.00848)
+ [InfoVAE](https://arxiv.org/abs/1706.02262)
+ [BetaTCVAE](https://arxiv.org/abs/1802.04942)
+ <u>AE</u>: _Auto-Encoder_
+ [VAE](https://arxiv.org/abs/1312.6114): Variational Auto-Encoder
+ [Beta-VAE](https://openreview.net/forum?id=Sy2fzU9gl): VAE with Scaled Loss
+ [DFC-VAE](https://arxiv.org/abs/1610.00291): Deep Feature Consistent VAE
+ [DIP-VAE](https://arxiv.org/abs/1711.00848): Disentangled Inferred Prior VAE
+ [InfoVAE](https://arxiv.org/abs/1706.02262): Information Maximizing VAE
+ [BetaTCVAE](https://arxiv.org/abs/1802.04942): Total Correlation VAE
- **Weakly Supervised**:
+ [Ada-GVAE](https://arxiv.org/abs/2002.02886) *`AdaVae(..., average_mode='gvae')`* Usually better than the Ada-ML-VAE
+ [Ada-ML-VAE](https://arxiv.org/abs/2002.02886) *`AdaVae(..., average_mode='ml-vae')`*
+ [Ada-GVAE](https://arxiv.org/abs/2002.02886): Adaptive GVAE, *`AdaVae.cfg(average_mode='gvae')`*, usually performs better than the ML-VAE variant below!
+ [Ada-ML-VAE](https://arxiv.org/abs/2002.02886): Adaptive ML-VAE, *`AdaVae.cfg(average_mode='ml-vae')`*
- **Supervised**:
+ [TVAE](https://arxiv.org/abs/1802.04403)

Many popular disentanglement frameworks still need to be added; please
submit an issue if you have a request for an additional framework.
+ <u>TAE</u>: _Triplet Auto-Encoder_
+ [TVAE](https://arxiv.org/abs/1802.04403): Triplet Variational Auto-Encoder

<details><summary><b>todo</b></summary><p>
<details><summary><b>🏗 Todo</b>: <i>Many popular disentanglement frameworks still need to be added; please
submit an issue if you have a request for an additional framework.</i></summary><p>

+ FactorVAE
+ GroupVAE
@@ -155,50 +185,24 @@ submit an issue if you have a request for an additional framework.
</p></details>
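
A minimal, hedged sketch of constructing one of the frameworks above around the convolutional encoder/decoder models from `disent.model`, and registering an annealing schedule (see the Schedules & Annealing section below). The exact `cfg` fields and the `EncoderConv64`, `DecoderConv64`, `ToImgTensorF32`, `XYObjectData` and `CyclicSchedule` names are assumptions based on the module layout and may differ between versions:

```python
from disent.dataset import DisentDataset
from disent.dataset.data import XYObjectData               # assumed class name
from disent.dataset.transform import ToImgTensorF32        # assumed transform name
from disent.frameworks.vae import BetaVae
from disent.model import AutoEncoder
from disent.model.ae import DecoderConv64, EncoderConv64   # assumed model names
from disent.schedule import CyclicSchedule                 # assumed schedule name

data = XYObjectData()
dataset = DisentDataset(dataset=data, transform=ToImgTensorF32())

# frameworks are configured through their nested `cfg` class,
# e.g. swap in AdaVae with AdaVae.cfg(average_mode='gvae') for the weakly-supervised setting
module = BetaVae(
    model=AutoEncoder(
        encoder=EncoderConv64(x_shape=data.x_shape, z_size=10, z_multiplier=2),
        decoder=DecoderConv64(x_shape=data.x_shape, z_size=10),
    ),
    cfg=BetaVae.cfg(optimizer='adam', loss_reduction='mean_sum', beta=4),
)

# anneal the `beta` hyper-parameter over training (see Schedules & Annealing below)
module.register_schedule('beta', CyclicSchedule(period=1024))
```

The frameworks are implemented as PyTorch Lightning modules, so the resulting `module` can typically be trained with a standard Lightning `Trainer` and the `DataLoader` from the dataset sketch above.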

### Metrics
Disent provides various metrics for evaluating the learnt representations of models
that have been trained on ground-truth data. A minimal evaluation sketch is given after the list below.

- **Disentanglement**:
+ [FactorVAE Score](https://arxiv.org/abs/1802.05983)
+ [DCI](https://openreview.net/forum?id=By-7dz-AZ)
+ [MIG](https://arxiv.org/abs/1802.04942)
+ [SAP](https://arxiv.org/abs/1711.00848)
+ [Unsupervised Scores](https://github.com/google-research/disentanglement_lib)

Some popular metrics still need to be added; please submit an issue if you wish to
add your own, or if you have a request.

<details><summary><b>todo</b></summary><p>
<details><summary><b>🏗 Todo</b>: <i>Some popular metrics still need to be added; please submit an issue if you wish to
add your own, or if you have a request.</i></summary><p>

+ [DCIMIG](https://arxiv.org/abs/1910.05587)
+ [Modularity and Explicitness](https://arxiv.org/abs/1802.05312)

</p></details>
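
A hedged sketch of evaluating a model with the metrics above over a ground-truth dataset. The metric signatures and helper class names below are assumptions and may differ between versions; in practice `module` would be trained before being evaluated:

```python
from disent.dataset import DisentDataset
from disent.dataset.data import XYObjectData                # assumed class name
from disent.dataset.transform import ToImgTensorF32         # assumed transform name
from disent.frameworks.vae import BetaVae
from disent.metrics import metric_dci, metric_mig           # assumed metric function names
from disent.model import AutoEncoder
from disent.model.ae import DecoderConv64, EncoderConv64    # assumed model names

data = XYObjectData()
dataset = DisentDataset(dataset=data, transform=ToImgTensorF32())

# an untrained framework, purely for illustration -- train it before evaluating for real
module = BetaVae(
    model=AutoEncoder(
        encoder=EncoderConv64(x_shape=data.x_shape, z_size=10, z_multiplier=2),
        decoder=DecoderConv64(x_shape=data.x_shape, z_size=10),
    ),
    cfg=BetaVae.cfg(beta=4),
)

# the metrics only need a function that maps a batch of observations to representations
get_repr = lambda x: module.encode(x)

scores = {
    **metric_dci(dataset, get_repr, num_train=1000, num_test=500),
    **metric_mig(dataset, get_repr, num_train=2000),
}
print(scores)
```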

### Datasets

Various common datasets used in disentanglement research are included, with hash
verification and automatic chunk-size optimization of underlying hdf5 formats for
low-memory disk-based access.

- **Ground Truth**:
+ Cars3D
+ dSprites
+ MPI3D
+ SmallNORB
+ Shapes3D

- **Ground Truth Synthetic**:
+ 🧵 XYObject: *A simplistic version of dSprites with a single square.*
+ 🧵 XYObjectShaded: *Exact same dataset as XYObject, but the ground-truth factors have a different representation.*
+ 🧵 DSpritesImagenet: *Version of dSprites with the foreground or background deterministically masked out using tiny-imagenet data.*

<p align="center">
<img width="384" src="docs/img/xy-object-traversal.png" alt="XYObject Dataset Factor Traversals">
</p>

#### Input Transforms + Input/Target Augmentations

- Input-based transforms are supported.
- Input and target CPU- and GPU-based augmentations are supported.

### Schedules & Annealing

Hyper-parameter annealing is supported through the use of schedules.
@@ -211,6 +215,43 @@ The currently implemented schedules include:

----------------------

## Architecture

The disent module structure:

- `disent.dataset`: dataset wrappers, datasets & sampling strategies
+ `disent.dataset.data`: raw datasets
+ `disent.dataset.sampling`: sampling strategies for `DisentDataset` when multiple elements are required by frameworks, e.g. for triplet loss (see the sketch after this list)
+ `disent.dataset.transform`: common data transforms and augmentations
+ `disent.dataset.wrapper`: wrapped datasets are no longer ground-truth datasets; they may have some elements masked out. We can still unwrap these classes to obtain the original datasets for benchmarking.
- `disent.frameworks`: frameworks, including Auto-Encoders and VAEs
+ `disent.frameworks.ae`: Auto-Encoder based frameworks
+ `disent.frameworks.vae`: Variational Auto-Encoder based frameworks
- `disent.metrics`: metrics for evaluating disentanglement using ground truth datasets
- `disent.model`: common encoder and decoder models used for VAE research
- `disent.nn`: torch components for building models including layers, transforms, losses and general maths
- `disent.schedule`: annealing schedules that can be registered to a framework
- `disent.util`: helper classes, functions, callbacks, anything unrelated to a pytorch system/model/framework.
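
For example, the `disent.dataset.sampling` strategies let a `DisentDataset` return the multiple observations that weakly-supervised or triplet-based frameworks need, as mentioned in the list above. A hedged sketch, where the sampler and helper class names are assumptions and may differ between versions:

```python
from disent.dataset import DisentDataset
from disent.dataset.data import XYObjectData                    # assumed class name
from disent.dataset.sampling import GroundTruthPairOrigSampler  # assumed sampler name
from disent.dataset.transform import ToImgTensorF32             # assumed transform name

# each item returned by the dataset is now a pair of observations,
# suitable for weakly-supervised frameworks such as AdaVae
dataset = DisentDataset(
    dataset=XYObjectData(),
    sampler=GroundTruthPairOrigSampler(),
    transform=ToImgTensorF32(),
)
```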

**⚠️ The API Is _Mostly_ Stable ⚠️**

Disent is still under development. Features and APIs are subject to change!
However, I will try to minimise the impact of any changes.

A small suite of tests currently exists and will be expanded upon in time.

**Hydra Experiment Directories**

Easily run experiments with hydra config; these files
are not available from `pip install`.

- `experiment/run.py`: entrypoint for running basic experiments with [hydra](https://github.com/facebookresearch/hydra) config
- `experiment/config/config.yaml`: main configuration file; this is probably what you want to edit!
- `experiment/config`: root folder for [hydra](https://github.com/facebookresearch/hydra) config files
- `experiment/util`: various helper code for experiments

----------------------

## Examples

### Python Example
@@ -357,7 +398,7 @@ visualisations of latent traversals.

### Why?

- Created as part of my Computer Science MSc scheduled for completion in 2021.
- Created as part of my Computer Science MSc, which ended in early 2022.
- I needed custom, high-quality implementations of various VAEs.
- A pytorch version of [disentanglement_lib](https://github.com/google-research/disentanglement_lib).
- I didn't have time to wait for [Weakly-Supervised Disentanglement Without Compromises](https://arxiv.org/abs/2002.02886) to release
1 change: 1 addition & 0 deletions disent/dataset/__init__.py
@@ -24,3 +24,4 @@

# wrapper
from disent.dataset._base import DisentDataset
from disent.dataset._base import DisentIterDataset