Simplest training results on celeb and mnist
CIntellifusion committed May 9, 2024
1 parent f3a62a6 commit ce9a888
Showing 8 changed files with 51 additions and 14 deletions.
49 changes: 39 additions & 10 deletions README.md
@@ -3,33 +3,62 @@ A simple diffusion framework for comprehensive understanding.

Carry out a series of training runs and record the training results.

Compare the result with papers.
Compare the result with papers.



> we don’t understand what we can’t create.
# Task 1 Unconditional Diffusion

- [ ] Implement a DDPM and DDIM that can generate Celeb64Image
- [ ] Unet Scalable Unet
- [x] Implement a DDPM and DDIM

# Task 2 Conditional Diffusion
- [x] Training On Mnist

![mnist_200epoch](README/mnist_200epoch.gif)

- [x] Generate Celeb64Image samples (2024-05-09)

![celeb_200epoch_32pix](UnconditionalDiffusion/training_results/celeb_200epoch_32pix.gif)

- [ ] Scalable Unet to 128 - 256 - 512



For further details, please refer to `./UnconditionalDiffusion/experiment.pptx` and `./UnconditionalDiffusion/experiments.md`.

- [ ] pretrained text model for conditioning
- [ ] different condition types: vanilla, token, cross attention, etc.
- [ ] multiple conditions: zero-conv (ControlNet)


## Engineering Issues

# Task3 Latent Diffusion
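In the VAE and UnconditionalDiffusion code, the `DataModule` and `Trainer` classes are highly duplicated. During development, factoring them into shared project files keeps the code consistent, but it also increases code complexity.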



Second point: `importlib` + `omega_conf` can be used to simplify class initialization. See VideoCrafter's implementation for reference.
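
A minimal sketch of the config-driven initialization pattern mentioned above. This is an illustration, not the repository's or VideoCrafter's actual code: the helper names `get_obj_from_str` and `instantiate_from_config`, the `target`/`params` keys, and the `fractions.Fraction` stand-in class are all assumptions chosen for the example. A plain `dict` is used so the snippet runs with the standard library only; a config object loaded with OmegaConf should work similarly as long as it supports `in` and `.get`.

```python
import importlib


def get_obj_from_str(path):
    """Resolve a dotted path such as 'package.module.ClassName' to the object it names."""
    module_name, _, attr = path.rpartition(".")
    return getattr(importlib.import_module(module_name), attr)


def instantiate_from_config(config):
    """Build an object from a mapping with a 'target' path and optional 'params' kwargs."""
    if "target" not in config:
        raise KeyError("config needs a 'target' entry")
    cls = get_obj_from_str(config["target"])
    return cls(**config.get("params", {}))


# Hypothetical config; in a real project 'target' might point at a DataModule or Trainer class.
config = {
    "target": "fractions.Fraction",
    "params": {"numerator": 3, "denominator": 4},
}
obj = instantiate_from_config(config)
print(type(obj).__name__, obj)
```

With this pattern, swapping a `DataModule` or `Trainer` implementation becomes a config change rather than a code change, at the cost of losing static checks on the constructor arguments.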



# Task 2 Latent Diffusion

- [ ] vae for compression
- [x] train on mnist

- [ ] diffusion on other latent spaces: text, audio, mesh, etc.

# Task4 Diffusion Transformer
# Task 3 Conditional Diffusion

- [ ] pretrained text model for conditioning
- [ ] different condition types: vanilla, token, cross attention, etc.
- [ ] multiple conditions: zero-conv (ControlNet)

# Task 4 Diffusion Transformer

- [ ] replace Unet with a transformer



# Task5 Video Diffusion
# Task 5 Video Diffusion

- [ ] insert temporal layer into diffusion

Binary file added README/celeb_200epoch_32pix.gif
Binary file added README/mnist_200epoch.gif
Binary file modified UnconditionalDiffusion/experiment.pptx
Binary file not shown.
16 changes: 12 additions & 4 deletions UnconditionalDiffusion/experiments.md
@@ -1,9 +1,5 @@
# Unconditional Diffusion

## 1 Dataset

Celeb64 example:

# Debug


@@ -30,6 +26,12 @@ lambda: x: (x-0.5)/2



For now, using only ToTensor for preprocessing is sufficient.
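
To make the value ranges concrete, here is a small standard-library sketch. It only mimics the scaling that torchvision's `ToTensor` applies to 8-bit image data (0..255 mapped to 0.0..1.0) and compares it with the `(x-0.5)/2` lambda quoted in the hunk header. The function names are hypothetical, and the assumption that the lambda was applied after `ToTensor` is ours, not stated in the notes.

```python
def to_unit_range(pixel):
    """Mimic ToTensor's scaling of 8-bit pixel values: [0, 255] -> [0.0, 1.0]."""
    return pixel / 255.0


def extra_shift(x):
    """The additional step quoted in the hunk header: (x - 0.5) / 2."""
    return (x - 0.5) / 2


pixels = [0, 128, 255]
scaled = [to_unit_range(p) for p in pixels]   # what ToTensor-style scaling alone gives
shifted = [extra_shift(x) for x in scaled]    # what the extra lambda would give on top

print("scaled range: ", min(scaled), max(scaled))
print("shifted range:", min(shifted), max(shifted))
```

Under these assumptions, scaling alone keeps inputs in [0, 1], while the extra lambda would squeeze them into [-0.25, 0.25].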





## Learning Rate Adjustment

Training should first be run without any learning-rate adjustment.
@@ -40,5 +42,11 @@ lambda: x: (x-0.5)/2



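The absolute value of the loss depends heavily on the sampling method. In a diffusion model, the model is fitting a conditional probability distribution. If the forward distribution is wrong, the fitted loss is affected as well.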



The convergence speed of the loss also depends heavily on the sampling method.
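
As a reference point for the remarks on loss and sampling, the standard DDPM formulation can be written as follows. The notation ($\bar\alpha_t$, $\epsilon_\theta$, $L_{\text{simple}}$) follows the usual DDPM convention and is an assumption here; these notes do not state which exact parameterization the code uses.

$$
q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t)\,I\right),
\qquad
x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\quad \epsilon \sim \mathcal{N}(0, I)
$$

$$
L_{\text{simple}} = \mathbb{E}_{t,\,x_0,\,\epsilon}\left[\left\lVert \epsilon - \epsilon_\theta(x_t, t) \right\rVert^2\right]
$$

In this form, $x_t$ reaches the loss only through the forward sample. A mistake in the noise schedule or in how $x_t$ is drawn changes the inputs the network is regressed on, which is one plausible way both the loss value and its convergence could shift.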



