The implementation of Multilingual Neural Machine Translation with Knowledge Distillation (ICLR 2019), by Xu Tan*, Yi Ren*, Di He, Tao Qin, Zhou Zhao and Tie-Yan Liu.
This code is based on Fairseq.
Install the dependencies and the package:

```bash
pip install -r requirements.txt
python setup.py install
```

1. Prepare the IWSLT-14 data:

```bash
cd data/iwslt/raw; bash prepare-iwslt14.sh
```
2. Train an expert model for each language. Run

```bash
data_dir=iwslt exp_name=train_expert_LNG1 targets="LNG1" hparams=" --save-output --share-all-embeddings" bash runs/train.sh
```
Replace LNG1 with each of the other languages (LNG2, LNG3, ...) to train all the experts; a scripted version is sketched below.
Top-k output binary files will be produced in `$data/data-bin` after steps 1 and 2.
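To train all the experts in one go, the loop below is a minimal sketch. The language codes (de, es, it) are assumptions used for illustration, not values taken from this repository; substitute whichever languages were prepared in step 1.

```bash
# Minimal sketch: run the step-2 expert training once per language.
# The language codes below are assumptions; substitute the languages
# actually prepared in step 1.
for lng in de es it; do
    data_dir=iwslt exp_name=train_expert_${lng} targets="${lng}" \
        hparams=" --save-output --share-all-embeddings" bash runs/train.sh
done
```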
3. Train the KD multilingual model, filling `targets` with all the languages. Run

```bash
exp_name=train_kd_multilingual targets="LNG1,LNG2,..." hparams=" --share-all-embeddings" bash runs/train_distill.sh
```

BLEU scores will be printed to the console every 3 epochs.
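For example, with the same hypothetical language codes as in the expert-training sketch above, the invocation would look like this:

```bash
# Hypothetical filled-in call; the targets list must contain exactly
# the languages whose experts were trained in step 2.
exp_name=train_kd_multilingual targets="de,es,it" \
    hparams=" --share-all-embeddings" bash runs/train_distill.sh
```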