Author: Li Heyuan (李贺元)
Email: [email protected]
Inspired by dkozlov/awesome-knowledge-distillation
All rights reserved
If you have any suggestions or want to recommend new papers, please feel free to let me know.
I have read all the papers listed here and am happy to discuss any of them with you.
I will keep updating this project frequently.
- Model Compression, Cristian Bucilă, Rich Caruana, Alexandru Niculescu-Mizil, 2006
- Distilling the Knowledge in a Neural Network, Geoffrey Hinton, Oriol Vinyals, Jeff Dean, 2015 (see the minimal loss sketch after this list)
- Knowledge Acquisition from Examples Via Multiple Models, Pedro Domingos, 1997
- Combining labeled and unlabeled data with co-training, A. Blum, T. Mitchell, 1998
- Using A Neural Network to Approximate An Ensemble of Classifiers, Xinchuan Zeng and Tony R. Martinez, 2000
- Do Deep Nets Really Need to be Deep?, Lei Jimmy Ba, Rich Caruana, 2014
- FitNets: Hints for Thin Deep Nets, Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio, 2015
- Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016
- A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, Junho Yim, Donggyu Joo, Jihoon Bae, Junmo Kim, 2017
- Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang
- Born Again Neural Networks, Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar, 2018
- Net2Net: Accelerating Learning Via Knowledge Transfer, Tianqi Chen, Ian Goodfellow, Jonathon Shlens, 2016
- Unifying distillation and privileged information, David Lopez-Paz, Léon Bottou, Bernhard Schölkopf, Vladimir Vapnik, 2015
- Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks, Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, Ananthram Swami, 2016
- Large scale distributed neural network training through online distillation, Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E. Dahl, Geoffrey E. Hinton, 2018
- Deep Mutual Learning, Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu, 2017
- Learning Loss for Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang, 2017
- Quantization Mimic: Towards Very Tiny CNN for Object Detection, Yi Wei, Xinyu Pan, Hongwei Qin, Wanli Ouyang, Junjie Yan, 2018
- Knowledge Projection for Deep Neural Networks, Zhi Zhang, Guanghan Ning, Zhihai He, 2017
- Moonshine: Distilling with Cheap Convolutions, Elliot J. Crowley, Gavin Gray, Amos Storkey, 2017
- Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving, Jiaolong Xu, Peng Wang, Heng Yang and Antonio M. López, 2018
- Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2017
- Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher, Seyed-Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Hassan Ghasemzadeh, 2019
- Learning Global Additive Explanations for Neural Nets Using Model Distillation, Sarah Tan, Rich Caruana, Giles Hooker, Paul Koch, Albert Gordo, 2018
- YASENN: Explaining Neural Networks via Partitioning Activation Sequences, Yaroslav Zharov, Denis Korzhenkov, Pavel Shvechikov, Alexander Tuzhilin, 2018
- Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, Antti Tarvainen, Harri Valpola, 2018
- Local Affine Approximators for Improving Knowledge Transfer, Suraj Srinivas & François Fleuret, 2018
- Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?, Shilin Zhu, Xin Dong, Hao Su, 2018
- Learning Efficient Detector with Semi-supervised Adaptive Distillation, Shitao Tang, Litong Feng, Zhanghui Kuang, Wenqi Shao, Quanquan Li, Wei Zhang, Yimin Chen, 2019
- Dataset Distillation, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, Alexei A. Efros, 2019
- Relational Knowledge Distillation, Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho, 2019
- Knowledge Adaptation for Efficient Semantic Segmentation, Tong He, Chunhua Shen, Zhi Tian, Dong Gong, Changming Sun, Youliang Yan, 2019
- A Comprehensive Overhaul of Feature Distillation, Byeongho Heo, Jeesoo Kim, Sangdoo Yun, Hyojin Park, Nojun Kwak, Jin Young Choi, 2019
- Learning Efficient Object Detection Models with Knowledge Distillation, Guobin Chen, Wongun Choi, Xiang Yu, Tony Han, Manmohan Chandraker, NIPS 2017
- Data Distillation: Towards Omni-Supervised Learning, Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He, CVPR 2018
- Cross Modal Distillation for Supervision Transfer, Saurabh Gupta, Judy Hoffman, Jitendra Malik, CVPR 2016
- Like What You Like: Knowledge Distill via Neuron Selectivity Transfer, Zehao Huang, Naiyan Wang, 2017
- Deep Model Compression: Distilling Knowledge from Noisy Teachers, Bharat Bhusan Sau, Vineeth N. Balasubramanian, 2016
- Knowledge Distillation for Small-footprint Highway Networks, Liang Lu, Michelle Guo, Steve Renals, 2016
- Sequence-Level Knowledge Distillation, Yoon Kim, Alexander M. Rush, 2016
- Recurrent Neural Network Training with Dark Knowledge Transfer, Zhiyuan Tang, Dong Wang, Zhiyong Zhang, 2016
- Data-Free Knowledge Distillation For Deep Neural Networks, Raphael Gontijo Lopes, Stefano Fenu, 2017
- DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, 2017
- Face Model Compression by Distilling Knowledge from Neurons, Ping Luo, Zhenyao Zhu, Ziwei Liu, Xiaogang Wang, and Xiaoou Tang, 2016
- Adapting Models to Signal Degradation using Distillation, Jong-Chyi Su, Subhransu Maji, BMVC 2017
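
For readers new to the topic, below is a minimal sketch of the classic distillation loss from Hinton, Vinyals, and Dean (2015): the student matches the teacher's temperature-softened outputs via a KL term (scaled by T²) blended with the usual cross-entropy on hard labels. This assumes PyTorch; the tensor names, temperature `T=4.0`, and weight `alpha=0.9` are illustrative choices, not values prescribed by the paper.

```python
# Minimal knowledge-distillation loss sketch (Hinton et al., 2015 style).
# Assumes PyTorch; all tensor names and hyperparameters are illustrative.
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Blend a soft-target KL term (scaled by T^2) with hard-label cross-entropy."""
    # Teacher and student distributions softened by temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence between teacher and student soft distributions;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    soft_loss = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    # Standard supervised loss on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Example usage with random tensors, just to show the expected shapes.
student_logits = torch.randn(8, 10)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = kd_loss(student_logits, teacher_logits, labels)
```

Many of the later papers in this list keep this soft-target term and add further supervision (intermediate features, attention maps, relations between samples, and so on) on top of it.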