Delving into the Openness of CLIP
Shuhuai Ren, Lei Li, Xuancheng Ren, Guangxiang Zhao, Xu Sun
Official implementation of the paper "Delving into the Openness of CLIP".
- (Nov 25, 2022) Evaluation code for extensibility and stability, and code for Retrieval-Enhanced Prompt Engineering (REPE).
- The repository supports CLIP (ViT-B/16), CLIP (ViT-B/32), CLIP (RN101), CLIP (RN50), CoOp, DeCLIP (ViT-B/32), DeCLIP (RN50), SLIP (ViT-B/16), FILIP (ViT-B/32), and DeFILIP (ViT-B/32) architectures.
- Systematic investigation of the openness of CLIP: We design an evaluation protocol and two indicators, extensibility and stability.
- Dissecting the CLIP feature space: We define inter-modal alignment and intra-modal uniformity, two metrics that measure the quality of representations learned by contrastive learning in the vision-and-language domain.
- Retrieval-enhanced prompt engineering (REPE): A simple yet effective method to improve the extensibility and stability of CLIP without fine-tuning.
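To give a rough sense of what alignment and uniformity metrics measure, here is a minimal sketch in plain Python. It follows the widely used formulation of Wang and Isola (2020) for contrastive representations and is not necessarily the exact definition used in the paper or in this repository. The function names (`alignment`, `uniformity`), the temperature `t=2.0`, and the list-of-floats embedding format are all assumptions made for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(image_embs, text_embs):
    """Mean squared distance between matched image/text embeddings.

    Hypothetical helper: lower values mean matched pairs lie closer
    together across the two modalities.
    """
    pairs = list(zip(image_embs, text_embs))
    total = sum(sq_dist(normalize(i), normalize(t)) for i, t in pairs)
    return total / len(pairs)


def uniformity(embs, t=2.0):
    """Log of the mean Gaussian potential over distinct pairs in one modality.

    Hypothetical helper: lower values mean the embeddings are spread
    more evenly over the unit hypersphere. Needs at least two embeddings.
    """
    unit = [normalize(e) for e in embs]
    potentials = [
        math.exp(-t * sq_dist(unit[i], unit[j]))
        for i in range(len(unit))
        for j in range(i + 1, len(unit))
    ]
    return math.log(sum(potentials) / len(potentials))
```

In practice the inputs would be image and text features produced by one of the supported models; the sketch only assumes that matched image and text embeddings share the same index.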
For installation and other package requirements, please follow the instructions detailed in INSTALL.md.
Please follow the instructions in DATASETS.md to prepare all datasets.
Please follow the instructions in MODELS.md to prepare all pre-trained models.
Please refer to RUN.md for detailed instructions on training, evaluating, and reproducing the results.
If you use our work, please consider citing:
@article{Ren2022DelvingIT,
title={Delving into the Openness of {CLIP}},
author={Shuhuai Ren and Lei Li and Xuancheng Ren and Guangxiang Zhao and Xu Sun},
journal={ArXiv},
year={2022},
volume={abs/2206.01986}
}
If you have any questions, please create an issue on this repository or contact us at [email protected].
Our code is based on the CoOp, clip-retrieval, and DeCLIP repositories. We thank the authors for releasing their code. If you use our model and code, please consider citing these works as well.