This repository contains the PyTorch implementation of our ICLR 2021 paper: Learnable Embedding Sizes for Recommender Systems. Please check the paper for more details about our work if you are interested.
Follow the steps below to run the code:
pip install torchfm
For more information about torchfm, see:
https://github.com/rixwew/pytorch-fm
We provide the MovieLens-1M dataset in data/ml-1m. If you want to run PEP on the Criteo and Avazu datasets, you need to download them yourself.
Raw data should be stored in the following directory layout:
data/criteo/train.txt
data/avazu/train
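Before launching training, it can help to confirm the raw data sits where the scripts expect it. A minimal sketch (hypothetical helper, not part of the repo; paths taken from the layout above):

```python
from pathlib import Path

EXPECTED = ["data/ml-1m", "data/criteo/train.txt", "data/avazu/train"]

def missing_paths(paths, root="."):
    """Return the expected data paths that are not present under root."""
    return [p for p in paths if not (Path(root) / p).exists()]

# e.g. print(missing_paths(EXPECTED)) before running a training script
```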
For learning embedding sizes, the hyper-parameters are in train_[dataset].py.
For retraining with the learned embedding sizes, the hyper-parameters are in train_[dataset]_retrain.py.
Run train_[dataset].py to learn the embedding sizes. Learned embeddings will be saved in tmp/embedding/fm/[alias]/, with each checkpoint named by its number of parameters.
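To illustrate why checkpoints are named by their parameter count, here is a minimal sketch of threshold-based embedding pruning. This is a hypothetical NumPy illustration, not the repo's actual PyTorch code; `prune_embedding` and the threshold value are assumptions for exposition:

```python
import numpy as np

def prune_embedding(weights: np.ndarray, threshold: float):
    """Zero out entries whose magnitude falls below the learned threshold
    and count the surviving parameters (used to name the checkpoint)."""
    mask = np.abs(weights) > threshold
    pruned = weights * mask
    n_params = int(mask.sum())
    return pruned, n_params

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 16))          # 100 features x 16-dim embeddings
pruned, n_params = prune_embedding(emb, threshold=0.5)
# e.g. save under tmp/embedding/fm/[alias]/ with n_params in the filename
```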
Run train_[dataset]_retrain.py to retrain the pruned embedding table. You need to specify which embedding table to retrain via the hyper-parameter retrain_emb_param.
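During retraining, the learned sparsity pattern is kept fixed while the surviving weights are updated. A minimal sketch of one masked update step (again a hypothetical NumPy illustration; `masked_sgd_step` is not a function from the repo):

```python
import numpy as np

def masked_sgd_step(weights, grad, mask, lr=0.1):
    """Apply a gradient step, then re-apply the pruning mask so entries
    removed during size learning stay at zero throughout retraining."""
    return (weights - lr * grad) * mask

w = np.array([[0.5, 0.0], [0.0, -0.3]])   # pruned embedding table
mask = (w != 0).astype(w.dtype)           # fixed sparsity pattern
g = np.ones_like(w)                       # dummy gradient for illustration
w_new = masked_sgd_step(w, g, mask)
```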
Requirements:
- Python 3
- PyTorch 1.1.0
If you find this repo useful, please kindly cite our paper:
@inproceedings{liu2021learnable,
  title={Learnable Embedding Sizes for Recommender Systems},
  author={Siyi Liu and Chen Gao and Yihong Chen and Depeng Jin and Yong Li},
  booktitle={International Conference on Learning Representations},
  year={2021},
  url={https://openreview.net/forum?id=vQzcqQWIS0q}
}
The structure of this code is largely based on lambda-opt.