Merge branch 'CIntellifusion:main' into main

CIntellifusion · Jun 6, 2024 · 42fcb7c
2 parents 7103311 + 604357b
commit 42fcb7c

Showing 8 changed files with 47 additions and 14 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,9 +1,2 @@
-
-LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
-LatentDiffusion/models/__pycache__/ae_module.cpython-38.pyc
-LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
-LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
-LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
-LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
 /LatentDiffusion/data/__pycache__
 *.pyc
diff --git a/LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc b/LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
diff --git a/LatentDiffusion/models/__pycache__/unet.cpython-38.pyc b/LatentDiffusion/models/__pycache__/unet.cpython-38.pyc
diff --git a/README.md b/README.md
@@ -13,7 +13,7 @@ Compare the result with papers.
 
 - [x] Implement a DDPM and DDIM
 
-- [x] Training On Mnist 
+- [x] Training results on MNIST
 
 ![mnist_200epoch](README/mnist_200epoch.gif)
 
@@ -27,6 +27,31 @@ for further details please refer to `./UnoncditionalDiffusion/experiments.pptx`
 
 
 
+## 1.1 Train on MNIST
+
+```
+cd UnconditionalDiffusion
+python main.py --train --dataset mnist --batch_size=128 --imsize=32
+```
+
+## 1.2 Generation on MNIST
+
+```
+# modify the ckpt path near line 785 of main.py
+python main.py 
+```
+
+
+
+## 1.3 Train on CelebA
+
+```
+cd UnconditionalDiffusion
+python main.py --train --batch_size=128 --imsize=32
+```
+
+
+
 ## Engineer Issues
 
 ### Notes on the level of engineering abstraction and implementation design
@@ -64,6 +89,8 @@ for further details please refer to `./UnoncditionalDiffusion/experiments.pptx`
   - [x] train on mnist 
   - [x] large scalable ae module
   - [x] attention in ae 
+  - [ ] config for training 
+  - [ ] lpips and discriminator loss
 - [ ] diffusion on other latent space: text , audio , mesh , etc.
 
 ## VAE results 
@@ -76,12 +103,14 @@ for further details please refer to `./UnoncditionalDiffusion/experiments.pptx`
 
 
 
-![vae_half_tiny](README/vae_half_tiny.gif)![vae_half_tiny](README/vae_half_tiny.gif)![tiny_epoch40](README/tiny_epoch40.png)
+![vae_tiny](README/vae_tiny.gif)![vae_half_tiny](README/vae_half_tiny.gif)![tiny_epoch40](README/tiny_epoch40.png)
 
 ## Full results of ldm on celeb
 
 ![unetlarge_celeb](README/unetlarge_celeb.png)
 
+![unet_large](README/unet_large.gif)![unet_mid](README/unet_mid.gif)
+
 ## Implementation Plan
 
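 In the VAE and UnconditionalDiffusion code, the `DataModule` and `Trainer` classes are highly duplicated. During development, organizing the code as shared project files keeps it consistent, but it also increases code complexity.
@@ -94,28 +123,39 @@ The VAE part follows other implementations and is written as first_stage_condition.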
 
 
 
-
-
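 - The VAE results are not yet good enough: high-frequency features are reconstructed poorly, so LPIPS and discriminator losses need to be added.
 - Configuration handling (config files, etc.) needs to be standardized, and the code still needs cleanup.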
 
 
 
 # Task 3 Conditional Diffusion
 
+- [ ] classifier guidance and classifier-free guidance 
 - [ ] pretrained text model for condition
 - [ ] different condition type: vanilla , token , cross attention etc. 
 - [ ] multiple condition : zero-conv(controlnet)
 
+
+
+## Implementation Plan
+
+[classifier guidance simple tutorial](https://zhuanlan.zhihu.com/p/639548962)
+
+- [ ] The MNIST dataset can provide class labels; extend the dataset code accordingly
+- [ ] A text2image dataset
+- [ ] 
+
 # Task 4 Diffusion Transformer
 
 - [ ] replace Unet with a transformer 
+- [ ] 
 
 
 
-# Task 5 Video Diffusion
+# Task 5  Video Diffusion
 
 - [ ] insert temporal layer into diffusion 
+- [ ] temporal and spatial attention 
 
 
 

diff --git a/README/unet_large.gif b/README/unet_large.gif
diff --git a/README/unet_mid.gif b/README/unet_mid.gif
diff --git a/UnconditionalDiffusion/main.py b/UnconditionalDiffusion/main.py
@@ -737,7 +737,7 @@ def parse_args():
             data_module = CelebDataModule(batch_size=args.batch_size,
                                           num_workers=args.num_workers,imsize=args.imsize)
         elif dataset=="mnist":
-            data_module = MNISTDataModule(data_dir="/home/haoyu/research/simplemodels/data", 
+            data_module = MNISTDataModule(data_dir="path/to/mnist/data", 
                                           batch_size=args.batch_size,num_workers=args.num_workers)
         elif dataset=="image":
             data_module = ImageDataModule(data_dir=args.datapath, 
@@ -782,10 +782,10 @@ def parse_args():
 
         trainer.fit(model,data_module,ckpt_path = pretrain_path)
     else:
+        # you may modify your checkpoint path below 
         ckpt_folder = f"./checkpoints/{expname}"
         paths = os.listdir(ckpt_folder)
         paths = [os.path.join(ckpt_folder,i) for i in paths]
-        paths = ["/home/haoyu/research/simplemodels/SimpleDiffusion/UnconditionalDiffusion/checkpoints/linear_normal/model-epoch=1184-val_loss=0.00332.ckpt"]
         for path in paths:
             ckpt = os.path.basename(path).replace(".ckpt","")
             model = LightningImageDenoiser(

diff --git a/experiment.pptx b/experiment.pptx
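
Note on the generation branch of `UnconditionalDiffusion/main.py`: this commit removes the hard-coded checkpoint list, so generation now iterates over every entry in `./checkpoints/{expname}`. A minimal sketch of one way to keep a single-checkpoint override without editing the source is shown below. The `--ckpt_path` flag and the `resolve_checkpoints` helper are hypothetical names used for illustration and are not part of this commit; filtering on the `.ckpt` suffix is also an added assumption, since the committed code lists all files in the folder.

```python
import argparse
import os
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flag: the commit itself does not define --ckpt_path.
    parser = argparse.ArgumentParser()
    parser.add_argument("--ckpt_path", default=None,
                        help="load only this checkpoint instead of the whole folder")
    return parser


def resolve_checkpoints(ckpt_folder: str, ckpt_path: Optional[str] = None) -> List[str]:
    """Return the checkpoint files to load for generation."""
    if ckpt_path is not None:
        # An explicit path wins over folder discovery.
        return [ckpt_path]
    # Assumption: only *.ckpt files are checkpoints; sort for a stable order.
    names = sorted(n for n in os.listdir(ckpt_folder) if n.endswith(".ckpt"))
    return [os.path.join(ckpt_folder, n) for n in names]
```

Under these assumptions, the loop in the `else` branch could iterate over `resolve_checkpoints(ckpt_folder, args.ckpt_path)` in place of the unfiltered `os.listdir` result.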