Merge branch 'CIntellifusion:main' into main
timechess authored Jun 6, 2024
2 parents 7103311 + 604357b commit 42fcb7c
Showing 8 changed files with 47 additions and 14 deletions.
7 changes: 0 additions & 7 deletions .gitignore
@@ -1,9 +1,2 @@

LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
LatentDiffusion/models/__pycache__/ae_module.cpython-38.pyc
LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
/LatentDiffusion/data/__pycache__
*.pyc
Binary file modified LatentDiffusion/data/__pycache__/data_wrapper.cpython-38.pyc
Binary file not shown.
Binary file modified LatentDiffusion/models/__pycache__/unet.cpython-38.pyc
Binary file not shown.
50 changes: 45 additions & 5 deletions README.md
@@ -13,7 +13,7 @@ Compare the result with papers.

- [x] Implement a DDPM and DDIM

- [x] Training On Mnist
- [x] Training result On Mnist

![mnist_200epoch](README/mnist_200epoch.gif)

@@ -27,6 +27,31 @@ for further details please refer to `./UnoncditionalDiffusion/experiments.pptx`



## 1.1 Train on mnist

```
cd UnconditionalDiffusion
python main.py --train --dataset mnist --batch_size=128 --imsize=32
```
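
The flags in the command above could be parsed along the following lines. This is a minimal sketch, not the repository's actual `parse_args`: only the flag names `--train`, `--dataset`, `--batch_size`, and `--imsize` come from the command; the `build_parser` helper, the defaults, and the `store_true` behaviour of `--train` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags used in the README commands."""
    parser = argparse.ArgumentParser(description="Unconditional diffusion (sketch)")
    # Assumed: --train is a boolean switch between training and generation.
    parser.add_argument("--train", action="store_true")
    # Assumed default; main.py is only known to check for "mnist" and "image".
    parser.add_argument("--dataset", type=str, default="celeb")
    parser.add_argument("--batch_size", type=int, default=128)
    parser.add_argument("--imsize", type=int, default=32)
    return parser


# Parse the same arguments as the training command above.
args = build_parser().parse_args(
    ["--train", "--dataset", "mnist", "--batch_size=128", "--imsize=32"]
)
print(args.train, args.dataset, args.batch_size, args.imsize)
```

Both `--batch_size=128` and `--batch_size 128` are accepted by `argparse`, so either spelling works on the command line.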

## 1.2 Generation on mnist

```
# first edit the checkpoint path near line 785 of main.py
python main.py
```
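
Instead of editing a hard-coded checkpoint path, one option is to pick the checkpoint from its file name. The sketch below assumes checkpoints are named like `model-epoch=1184-val_loss=0.00332.ckpt`; `best_checkpoint` and the `./checkpoints/demo` folder are hypothetical and not part of `main.py`.

```python
import os
import re
from typing import List, Optional

# Matches file names such as "model-epoch=1184-val_loss=0.00332.ckpt".
_CKPT_PATTERN = re.compile(r"epoch=(\d+)-val_loss=([0-9.]+)\.ckpt$")


def best_checkpoint(paths: List[str]) -> Optional[str]:
    """Return the path whose file name encodes the lowest val_loss."""
    best_path, best_loss = None, float("inf")
    for path in paths:
        match = _CKPT_PATTERN.search(os.path.basename(path))
        if match is None:
            continue  # skip files that do not follow the naming scheme
        loss = float(match.group(2))
        if loss < best_loss:
            best_path, best_loss = path, loss
    return best_path


# Hypothetical folder contents, for illustration only.
candidates = [
    "./checkpoints/demo/model-epoch=10-val_loss=0.01200.ckpt",
    "./checkpoints/demo/model-epoch=1184-val_loss=0.00332.ckpt",
    "./checkpoints/demo/last.ckpt",
]
print(best_checkpoint(candidates))
```

The selected path can then be used wherever a single checkpoint is loaded; files that do not match the pattern are ignored, and `None` is returned when nothing matches.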



## 1.3 Train on Celeba

```
cd UnconditionalDiffusion
python main.py --train --batch_size=128 --imsize=32
```



## Engineering Issues

### Notes on the level of engineering abstraction and the implementation design
@@ -64,6 +89,8 @@ for further details please refer to `./UnoncditionalDiffusion/experiments.pptx`
- [x] train on mnist
- [x] large scalable ae module
- [x] attention in ae
- [ ] config for training
- [ ] lpips and discriminator loss
- [ ] diffusion on other latent space: text , audio , mesh , etc.

## VAE results
@@ -76,12 +103,14 @@ for further details please refer to `./UnoncditionalDiffusion/experiments.pptx`



![vae_half_tiny](README/vae_half_tiny.gif)![vae_half_tiny](README/vae_half_tiny.gif)![tiny_epoch40](README/tiny_epoch40.png)
![vae_half_tiny](README/vae_tiny.gif)![vae_half_tiny](README/vae_half_tiny.gif)![tiny_epoch40](README/tiny_epoch40.png)

## Full results of ldm on celeb

![unetlarge_celeb](README/unetlarge_celeb.png)

![unet_large](README/unet_large.gif)![unet_mid](README/unet_mid.gif)

## Implementation Plan

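In the VAE and UnconditionalDiffusion code, the `DataModule` and `Trainer` classes are highly duplicated. During development, organizing the code as a shared project structure keeps it consistent, but it also increases the code's complexity.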
@@ -94,28 +123,39 @@ The VAE part follows other implementations and is written as first_stage_condition.





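- The VAE results are not yet good enough: high-frequency features are reconstructed poorly, so LPIPS and discriminator losses need to be added.
- Configuration handling (config files and similar) needs to be more standardized, and the code still needs cleanup.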



# Task 3 Conditional Diffusion

- [ ] classifier guidance and classifier-free guidance
- [ ] pretrained text model for condition
- [ ] different condition type: vanilla , token , cross attention etc.
- [ ] multiple condition : zero-conv(controlnet)
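
For the guidance item in the list above, one common parameterization of classifier-free guidance combines an unconditional and a conditional noise prediction as `eps = eps_uncond + w * (eps_cond - eps_uncond)`. The sketch below uses plain Python lists in place of tensors; the `cfg_combine` name and the guidance scale `w` are illustrative assumptions, and nothing here is implemented in the repository yet.

```python
from typing import List


def cfg_combine(eps_uncond: List[float], eps_cond: List[float], w: float) -> List[float]:
    """Classifier-free guidance: eps_uncond + w * (eps_cond - eps_uncond)."""
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# w = 0 returns the unconditional prediction, w = 1 the conditional one,
# and w > 1 extrapolates past the conditional prediction.
print(cfg_combine([0.0, 1.0], [1.0, 3.0], w=2.0))
```

In a real model both predictions would come from the same network, evaluated once with the condition and once with it dropped.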



## Implementation Plan

[classifier guidance simple tutorial](https://zhuanlan.zhihu.com/p/639548962)

- [ ] The MNIST dataset can supply class labels; extend the dataset code accordingly
- [ ] A text2image dataset
- [ ]

# Task 4 Diffusion Transformer

- [ ] replace Unet with a transformer
- [ ]



# Task 5 Video Diffusion
# Task 5 Video Diffusion

- [ ] insert temporal layer into diffusion
- [ ] temporal and spatial attention



Binary file added README/unet_large.gif
Binary file added README/unet_mid.gif
4 changes: 2 additions & 2 deletions UnconditionalDiffusion/main.py
@@ -737,7 +737,7 @@ def parse_args():
data_module = CelebDataModule(batch_size=args.batch_size,
num_workers=args.num_workers,imsize=args.imsize)
elif dataset=="mnist":
data_module = MNISTDataModule(data_dir="/home/haoyu/research/simplemodels/data",
data_module = MNISTDataModule(data_dir="path/to/mnist/data",
batch_size=args.batch_size,num_workers=args.num_workers)
elif dataset=="image":
data_module = ImageDataModule(data_dir=args.datapath,
@@ -782,10 +782,10 @@ def parse_args():

trainer.fit(model,data_module,ckpt_path = pretrain_path)
else:
# you may modify your checkpoint path below
ckpt_folder = f"./checkpoints/{expname}"
paths = os.listdir(ckpt_folder)
paths = [os.path.join(ckpt_folder,i) for i in paths]
paths = ["/home/haoyu/research/simplemodels/SimpleDiffusion/UnconditionalDiffusion/checkpoints/linear_normal/model-epoch=1184-val_loss=0.00332.ckpt"]
for path in paths:
ckpt = os.path.basename(path).replace(".ckpt","")
model = LightningImageDenoiser(
Binary file modified experiment.pptx
Binary file not shown.
