The implementation of Multilingual Neural Machine Translation with Knowledge Distillation (ICLR 2019), by Xu Tan*, Yi Ren*, Di He, Tao Qin, Zhou Zhao and Tie-Yan Liu.
This code is based on Fairseq.
Install the dependencies and the package:

```bash
pip install -r requirements.txt
python setup.py install
```

1. Prepare the IWSLT-14 data:

```bash
cd data/iwslt/raw; bash prepare-iwslt14.sh
```
2. Train an expert model for each language. Run

```bash
data_dir=iwslt exp_name=train_expert_LNG1 targets="LNG1" hparams=" --save-output --share-all-embeddings" bash runs/train.sh
```
Replace LNG1 with each of the other languages (LNG2, LNG3, ...) to train all the experts; a scripted version is sketched below.
Top-k output binary files will be produced in `$data/data-bin` after steps 1 and 2.
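To train all the experts in one go, the loop below is a minimal sketch. The language codes (de, es, it) are assumptions used for illustration, not values taken from this repository; substitute whichever languages were prepared in step 1.

```bash
# Minimal sketch: run the step-2 expert training once per language.
# The language codes below are assumptions; substitute the languages
# actually prepared in step 1.
for lng in de es it; do
    data_dir=iwslt exp_name=train_expert_${lng} targets="${lng}" \
        hparams=" --save-output --share-all-embeddings" bash runs/train.sh
done
```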
3. Train the KD multilingual model, filling `targets` with all the languages. Run

```bash
exp_name=train_kd_multilingual targets="LNG1,LNG2,..." hparams=" --share-all-embeddings" bash runs/train_distill.sh
```

BLEU scores will be printed to the console every 3 epochs.
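For example, with the same hypothetical language codes as in the expert-training sketch above, the invocation would look like this:

```bash
# Hypothetical filled-in call; the targets list must contain exactly
# the languages whose experts were trained in step 2.
exp_name=train_kd_multilingual targets="de,es,it" \
    hparams=" --share-all-embeddings" bash runs/train_distill.sh
```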