SwinT-ChARM (TensorFlow 2)

This repository provides a TensorFlow implementation of SwinT-ChARM, based on the paper Transformer-Based Transform Coding (Zhu et al., ICLR 2022).

[Figure: SwinT-ChARM network architecture (source: Zhu et al.)]

Updates

10/06/2023

  1. LIC-TCM (TensorFlow 2) is now available: https://github.com/Nikolai10/LIC-TCM (Liu et al., CVPR 2023 Highlight).

09/06/2023

  1. The high quality of this reimplementation has been confirmed in EGIC (Section A.8).

10/09/2022

  1. The number of model parameters now corresponds exactly to the reported number (32.6 million). We thank the authors for providing us with the official DeepSpeed log files.
  2. SwinT-ChARM now supports compression at different input resolutions (multiples of 256); see the padding sketch after this list.
  3. We release a pre-trained model as proof of functional correctness.
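As a minimal sketch of what the multiple-of-256 constraint implies in practice (not part of this repo; `pad_to_multiple` is a hypothetical helper), an arbitrary image can be padded before compression and the reconstruction cropped back afterwards:

```python
import tensorflow as tf

def pad_to_multiple(image, multiple=256):
    """Pad an HxWxC image so height and width become multiples of `multiple`.

    Also returns the original size, so the reconstruction can be cropped
    back after decompression.
    """
    h, w = tf.shape(image)[0], tf.shape(image)[1]
    pad_h = (-h) % multiple  # floormod keeps the result non-negative
    pad_w = (-w) % multiple
    # Reflection padding is one common choice; it requires pad < dim.
    padded = tf.pad(image, [[0, pad_h], [0, pad_w], [0, 0]], mode="REFLECT")
    return padded, (h, w)

# Kodak images (768x512) are already multiples of 256, so no padding is added.
```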

08/17/2022

  1. Initial release of this project (see branch release_08/17/2022).

Acknowledgment

This project builds upon the official TensorFlow implementation of Minnen et al.; Zhu et al., by contrast, base their work on an unknown (possibly not publicly available) PyTorch reimplementation.

Examples

The samples below are taken from the Kodak dataset, external to the training set:

Reconstructions (kodim*_hat.png) were produced with SwinT-ChARM (β = 0.0003):

Image        | MSE     | PSNR (dB) | MS-SSIM | MS-SSIM (dB) | Bits per pixel
kodim22.png  | 13.7772 | 36.74     | 0.9871  | 18.88        | 0.9890
kodim23.png  |  7.1963 | 39.56     | 0.9903  | 20.13        | 0.3953
kodim15.png  | 10.1494 | 38.07     | 0.9888  | 19.49        | 0.6525

More examples can be found here.
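The metrics above follow standard definitions; here is a minimal sketch (independent of this repo, file names as placeholders) of how to reproduce them with TensorFlow, using the convention MS-SSIM (dB) = -10 · log10(1 - MS-SSIM):

```python
import tensorflow as tf

orig = tf.cast(tf.image.decode_png(tf.io.read_file("kodim22.png")), tf.float32)
recon = tf.cast(tf.image.decode_png(tf.io.read_file("kodim22_hat.png")), tf.float32)

mse = tf.reduce_mean(tf.math.squared_difference(orig, recon))
psnr = tf.image.psnr(orig[None], recon[None], max_val=255.)[0]               # 10*log10(255^2 / MSE)
msssim = tf.image.ssim_multiscale(orig[None], recon[None], max_val=255.)[0]  # [None] adds a batch dim
msssim_db = -10. * tf.math.log(1. - msssim) / tf.math.log(10.)               # dB convention
```

For kodim22.png, 0.9871 MS-SSIM maps to -10 · log10(1 - 0.9871) ≈ 18.88 dB, matching the table.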

Pretrained Models / Performance (TFC 2.8)

Our pre-trained model (β = 0.0003) achieves a PSNR of 37.59 dB at an average of 0.93 bpp on the Kodak dataset, which is very close to the reported numbers (see paper, Figure 3). Notably, we achieve this result despite training the model from scratch and using less than one-third of the computational resources (1M optimization steps).

Lagrangian multiplier (β) | SavedModel | Training Instructions
0.0003                    | download   | see command below

!python SwinT-ChARM/zyc2022.py -V --model_path <...> train --max_support_slices 10 --lambda 0.0003 --epochs 1000 --batchsize 16 --train_path <...>
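For inference with the pre-trained model, a hedged sketch is given below. It assumes the SavedModel export follows the tensorflow_compression model convention of exposing compress/decompress functions; the exact signatures and the compress output structure are assumptions, so verify against zyc2022.py before relying on them:

```python
import tensorflow as tf

model = tf.saved_model.load("res/zyc2022")               # path per File Structure below
x = tf.image.decode_png(tf.io.read_file("kodim22.png"))  # H, W must be multiples of 256
tensors = model.compress(x)                               # assumed signature
x_hat = model.decompress(*tensors)                        # assumed signature
```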

File Structure

res/
    ├── doc/                                       # additional resources
    ├── eval/                                      # sample images + reconstructions
    ├── train_zyc2022/                             # model checkpoints + tf.summaries
    ├── zyc2022/                                   # saved model
swin-transformers-tf/                              # extended swin-transformers-tf implementation
    ├── changelog.txt                              # summary of changes made to the original work
    ├── ...
config.py                                          # model-dependent configurations
zyc2022.py                                         # core of this repo

License

Apache License 2.0
