Recent studies indicate that the denoising process in deep generative diffusion models implicitly learns and memorizes semantic information from the data distribution. These findings suggest that capturing more complex data distributions requires larger neural networks, leading to a substantial increase in computational demands, which in turn become the primary bottleneck in both training and inference of diffusion models.
To this end, we introduce Generative Modeling with Explicit Memory (GMem), which leverages an external memory bank in both the training and sampling phases of diffusion models. This memory bank preserves semantic information from the data distribution, reducing the network's reliance on its own capacity to learn and generalize across diverse datasets. The results are significant: GMem improves training efficiency, sampling efficiency, and generation quality. For instance, on ImageNet at
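To give a feel for the idea, here is a toy illustration (not the paper's implementation, and the function names are made up for this sketch): an external "memory bank" stores semantic feature vectors extracted from the data, and at sampling time the nearest bank entry is retrieved to condition generation instead of requiring the network to memorize that information itself.

```python
import math

def build_bank(features):
    """The bank is simply a collection of semantic feature vectors
    captured from the data distribution (toy stand-in)."""
    return list(features)

def retrieve(bank, query):
    """Return the bank entry closest to the query vector (L2 lookup)."""
    return min(bank, key=lambda v: math.dist(v, query))

# Toy 2-D "semantic" vectors standing in for real features.
bank = build_bank([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
print(retrieve(bank, [0.9, 1.2]))  # → [1.0, 1.0]
```

In GMem the retrieved entry plays the role of externalized semantic knowledge, so the denoising network itself can be smaller and trained for fewer steps.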
- Python and PyTorch:
  - 64-bit Python 3.10 or later.
  - PyTorch 2.4.0 or later (earlier versions might work but are not guaranteed).
- Additional Python libraries:
  - A complete list of required libraries is provided in the `requirements.txt` file.
  - To install them, execute the following command:

```bash
pip install -r requirements.txt
```
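As a quick sanity check after installation, a small stdlib-only snippet like the following (not part of the repo; package names here are examples) can confirm which dependencies are actually present in the environment:

```python
from importlib import metadata

def check_requirements(names):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

# Example: report on a couple of likely dependencies.
print(check_requirements(["torch", "numpy"]))
```

Any entry reported as `None` indicates a package that still needs to be installed before running the scripts below.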
To reproduce the results from the paper, run the following script:

```bash
bash scripts/sample-gmem-xl.sh
```

Important: make sure to set `--ckpt` to the correct checkpoint path.
We offer the following pre-trained model and memory bank here:
| Backbone | Training Steps | Dataset | Bank Size | Training Epochs | Download |
|---|---|---|---|---|---|
| SiT-XL/2 | 2M | ImageNet | 640,000 | 5 | Huggingface |
- Up next: the training code and scripts for GMem.
If you find this repository helpful for your project, please consider citing our work:
```bibtex
@article{tang2024generative,
  title={Generative Modeling with Explicit Memory},
  author={Tang, Yi and Sun, Peng and Cheng, Zhenglin and Lin, Tao},
  journal={arXiv preprint arXiv:2412.08781},
  year={2024}
}
```
This code is mainly built upon the SiT, edm2, and REPA repositories.