CompressAI (compress-ay) is a PyTorch library and evaluation platform for end-to-end compression research.
The latest AI intra-coding model (Elic2022Chandelier) can outperform the inter coding used in CH5. This model is defined in compressai/models/sensetime.py.
I also trained an AI inter-coding model (ScaleSpaceFlow), which follows a workflow similar to CH5. This model is defined in compressai/models/video/google.py. As its intra-coding module, it uses a VAE-architecture model (MeanScaleHyperprior), which is defined in compressai/models/google.py.
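As a quick illustration of this VAE-style workflow, the intra model encodes an image into a quantized latent and decodes it back, returning both the reconstruction and the likelihoods used for rate estimation. A minimal sketch using the upstream model zoo (mbt2018_mean is the zoo name for MeanScaleHyperprior; the input here is a dummy image):

```python
import torch
from compressai.zoo import mbt2018_mean  # MeanScaleHyperprior in the model zoo

net = mbt2018_mean(quality=3, pretrained=True).eval()  # downloads upstream weights
x = torch.rand(1, 3, 256, 256)  # dummy image batch, values in [0, 1]
with torch.no_grad():
    out = net(x)
print(out["x_hat"].shape)         # reconstructed image
print(out["likelihoods"].keys())  # latent ("y") and hyper-latent ("z") likelihoods
```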
This inter-coding model (ScaleSpaceFlow) could be improved further by plugging in stronger intra-coding modules, such as JointAutoregressiveHierarchicalPriors or Elic2022Chandelier. However, training those models takes much longer than MeanScaleHyperprior: both JointAutoregressiveHierarchicalPriors and Elic2022Chandelier apply sequential (autoregressive) encoding in their context models, which would be prohibitively slow on an RTX 3060 if used inside ScaleSpaceFlow. I therefore chose these two models, MeanScaleHyperprior and ScaleSpaceFlow, for comparison with CH4 (intra coding) and CH5 (inter coding). Even with the fastest VAE-architecture intra model (MeanScaleHyperprior), training took about two weeks to obtain 7 different RD points, which shows the practical difficulty of applying more sophisticated intra-coding models inside the inter-coding model.
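To see why the autoregressive context models are so much slower, compare a one-shot convolutional prediction of the entropy parameters (as in MeanScaleHyperprior) with an element-by-element autoregressive pass over the latent. The sketch below is illustrative only, not the actual CompressAI implementation; tensor sizes and layer shapes are assumptions:

```python
import torch
import torch.nn as nn

y = torch.randn(1, 192, 16, 16)  # assumed latent: 192 channels, 16x16 spatial

# Hyperprior-style models predict all entropy parameters in one forward pass.
entropy_params = nn.Conv2d(192, 384, kernel_size=1)
params = entropy_params(y)  # one call, fully parallel on the GPU

# Autoregressive context models must process the latent position by position,
# since each position's parameters depend on already-decoded neighbors
# (the real models use a masked convolution for this).
context = nn.Conv2d(192, 384, kernel_size=5, padding=2)
y_hat = torch.zeros_like(y)
with torch.no_grad():
    for h in range(y.size(2)):
        for w in range(y.size(3)):
            ctx = context(y_hat)[:, :, h, w]   # parameters for this position only
            y_hat[:, :, h, w] = y[:, :, h, w]  # stand-in for actual decoding
# 16 * 16 = 256 sequential convolutions per latent; real video resolutions make
# this far worse, hence the RTX 3060 training-time concern above.
```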
The model weights can be found at my google drive: https://drive.google.com/drive/folders/1tvloIxHN4Mt39ZSnDwufKbpv8H1z8QeH?usp=drive_link
CompressAI supports Python 3.6+ and PyTorch 1.7+.
From source:
A C++17 compiler, a recent version of pip (19.0+), and common Python packages are also required (see setup.py for the full list).
git clone https://github.com/CristinaZN/CompressAI.git
cd CompressAI
pip install -U pip && pip install -e .
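A quick sanity check that the editable install succeeded (assuming the fork keeps upstream CompressAI's package layout):

```python
import compressai
import torch

print(compressai.__version__)     # the package imports, so the install worked
print(torch.cuda.is_available())  # True if the --cuda flags below will work
```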
dataset (CLIC):
The training dataset can be obtained from https://clic.compression.cc/2021/tasks/index.html. After downloading it, arrange the images as follows for training:
--dataset
----train
----val
----test
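With this layout, each split can be loaded with CompressAI's ImageFolder dataset, which reads images from <root>/<split>. A minimal sketch, with the crop size as an assumption:

```python
from torchvision import transforms
from compressai.datasets import ImageFolder

train_transforms = transforms.Compose(
    [transforms.RandomCrop(256), transforms.ToTensor()]  # 256 is an assumed patch size
)
train_dataset = ImageFolder("../dataset", split="train", transform=train_transforms)
print(len(train_dataset), "training images")
```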
To train the model:
cd CompressAI
python3 examples/train.py -m [model_name] -d ../dataset/ -e 200 -lr 1e-4 -n 4 --lambda 0.0125 --batch-size 4 --test-batch-size 4 --cuda --checkpoint-name ./lambda_0.0125
Note: lambda sets the trade-off between bpp and MSE in the rate-distortion loss; it is chosen from [0.4, 0.2, 0.1, 0.05, 0.025, 0.0125, 0.00675].
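For reference, this lambda enters the training objective through the rate-distortion loss, sketched here following the RateDistortionLoss class in examples/train.py (the 255**2 factor assumes inputs scaled to [0, 1]):

```python
import math
import torch
import torch.nn as nn

class RateDistortionLoss(nn.Module):
    """L = lambda * 255^2 * MSE + bpp, as in examples/train.py."""

    def __init__(self, lmbda=0.0125):
        super().__init__()
        self.mse = nn.MSELoss()
        self.lmbda = lmbda

    def forward(self, output, target):
        N, _, H, W = target.size()
        num_pixels = N * H * W
        # Rate: total bits of all latents, normalized per pixel.
        bpp_loss = sum(
            torch.log(likelihoods).sum() / (-math.log(2) * num_pixels)
            for likelihoods in output["likelihoods"].values()
        )
        mse_loss = self.mse(output["x_hat"], target)
        return self.lmbda * 255**2 * mse_loss + bpp_loss
```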
To evaluate the model:
cd CompressAI
python3 -m compressai.utils.eval_model checkpoint ../Chapter6_Template/foreman20_40_RGB/ -a [model_name] --cuda -m mse -d ./ -o [any_name] -p [model_weight.pth]
dataset (vimeo-90K):
The training dataset (the Triplet version) can be obtained from http://toflow.csail.mit.edu/. After downloading it, arrange the files as follows for training:
--dataset
----sequences
------00001
--------img1
--------img2
--------img3
------00002
--------img1
--------img2
--------img3
...
----test.list (text file)
----train.list (text file)
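With this layout, the video training script loads the triplets through CompressAI's VideoFolder dataset, which uses the train.list / test.list files shown above. A minimal sketch (the transform and flag choices are assumptions based on examples/train_video.py):

```python
from torchvision import transforms
from compressai.datasets import VideoFolder

train_transforms = transforms.Compose(
    [transforms.ToTensor(), transforms.RandomCrop(256)]  # assumed patch size
)
# rnd_interval / rnd_temp_order randomize frame spacing and temporal order.
train_dataset = VideoFolder(
    "[vimeo_triplet]",
    rnd_interval=True,
    rnd_temp_order=True,
    split="train",
    transform=train_transforms,
)
print(len(train_dataset), "training triplets")
```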
To train the model:
python3 examples/train_video.py -m ssf2020 -d [vimeo_triplet] -e 200 -lr 1e-4 -n 4 --lambda 0.4 --batch-size 4 --test-batch-size 4 --cuda --save --checkpoint-name ./lambda_0.4
Note: lambda sets the trade-off between bpp and MSE in the rate-distortion loss; it is chosen from [0.4, 0.2, 0.1, 0.05, 0.025, 0.0125, 0.00675].
To evaluate the model:
cd CompressAI
python3 -m compressai.utils.video.eval_model checkpoint /Chapter6_Template/foreman20_40_RGB -p ./[model_weight.pth] --cuda -a ssf2020 ./
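To run inference with the released checkpoints from the Google Drive link above, the weights can also be loaded directly into the model class. A minimal sketch, assuming the checkpoints store a "state_dict" entry as saved by the training scripts (the import path follows the file location noted earlier):

```python
import torch
from compressai.models.video.google import ScaleSpaceFlow

net = ScaleSpaceFlow()
ckpt = torch.load("model_weight.pth", map_location="cpu")
net.load_state_dict(ckpt["state_dict"])  # the key name is an assumption
net.update(force=True)  # rebuild entropy-coder CDF tables before compress()
net.eval()
```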
CompressAI is licensed under the BSD 3-Clause Clear License.
We welcome feedback and contributions. Please open a GitHub issue to report bugs, request enhancements, or ask questions.
Before contributing, please read the CONTRIBUTING.md file.
- Jean Bégaint, Fabien Racapé, Simon Feltman and Hyomin Choi, InterDigital AI Lab.
If you use this project, please cite the relevant original publications for the models and datasets, and cite this project as:
@article{begaint2020compressai,
title={CompressAI: a PyTorch library and evaluation platform for end-to-end compression research},
author={B{\'e}gaint, Jean and Racap{\'e}, Fabien and Feltman, Simon and Pushparaja, Akshay},
year={2020},
journal={arXiv preprint arXiv:2011.03029},
}
- Tensorflow compression library by Ballé et al.: https://github.com/tensorflow/compression
- Range Asymmetric Numeral System code from Fabian 'ryg' Giesen: https://github.com/rygorous/ryg_rans
- BPG image format by Fabrice Bellard: https://bellard.org/bpg
- HEVC HM reference software: https://hevc.hhi.fraunhofer.de
- VVC VTM reference software: https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM
- AOM AV1 reference software: https://aomedia.googlesource.com/aom
- Z. Cheng et al. 2020: https://github.com/ZhengxueCheng/Learned-Image-Compression-with-GMM-and-Attention
- Kodak image dataset: http://r0k.us/graphics/kodak/