The master branch contains sources for reproducing our results reported in the WMT21 Metrics workshop. See ablation-study for evaluating the impact of each of the ensembled metrics on the result, xling for zero-shot cross-lingual metric evaluation, multiling for evaluation of the fit on multiple languages, test_judgements for re-generating the submission, and docker-build for building a Docker image.
To reproduce our results, you can run our miratmu/regemt Docker image with the NVIDIA Container Toolkit:
mkdir submit_dir
chmod 777 submit_dir
# test the installation on a data subsample before running the full evaluation process:
docker run --rm --gpus all -v "$PWD"/submit_dir:/submit_dir miratmu/regemt --fast
# simply run the evaluation on the full data sets:
# this takes ~10hrs on Tesla T4, might take longer on CPU
docker run --rm --gpus all -v "$PWD"/submit_dir:/submit_dir miratmu/regemt
The evaluation process will generate the correlation reports in .png and .pdf format for each of the evaluated configurations into the submit_dir/ directory.
Alternatively, you can install our package using Python:
git clone https://github.com/MIR-MU/regemt.git
cd regemt
chmod 777 submit_dir
# install the dependencies
conda create --name wmt_eval python=3.8
conda activate wmt_eval
pip install -r requirements.txt
# test the installation on a data subsample before running the full evaluation process:
python -m main --fast
# simply run the evaluation on the full data sets:
# this takes ~10hrs on Tesla T4, might take longer on CPU
python -m main
The evaluation process will generate the correlation reports in .png and .pdf format for each of the evaluated configurations into the regemt/ directory.
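To confirm that a run actually produced reports, a quick file count can help. The snippet below is a minimal sketch: `count_reports` is a hypothetical helper, not part of regemt, and it only assumes that reports end in .png or .pdf as described above. It makes no assumption about the report file names.

```shell
# Hypothetical helper (not part of regemt): count .png and .pdf files under a directory.
count_reports() {
    # Default to submit_dir when no argument is given.
    dir="${1:-submit_dir}"
    find "$dir" -type f \( -name '*.png' -o -name '*.pdf' \) | wc -l | tr -d ' '
}

# Only run the check if the Docker output directory exists.
if [ -d submit_dir ]; then
    echo "reports found: $(count_reports submit_dir)"
fi
```

For the Python install, pass the repository checkout instead, for example `count_reports .` from inside regemt/. Note that this also counts any .png or .pdf files that were already in the directory before the run.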
We're trying to keep it simple, but if you get into any trouble, or have a question, don't hesitate to create an issue and we'll take a look!
ŠTEFÁNIK, Michal, Vít NOVOTNÝ and Petr SOJKA. Regressive Ensemble for Machine Translation Quality Evaluation. In Markus Freitag. Proceedings of EMNLP 2021 Sixth Conference on Machine Translation (WMT 21). ACL, 2021. 8 pp.
@inproceedings{stefanik2021regressive,
author = {\v{S}tef\'{a}nik, Michal and Novotn\'{y}, V\'{i}t and Sojka, Petr},
title = {Regressive Ensemble for Machine Translation Quality Evaluation},
booktitle = {Proceedings of {EMNLP} 2021 Sixth Conference on Machine Translation ({WMT} 21)},
editor = {Markus Freitag},
publisher = {ACL},
numpages = {8},
url = {https://arxiv.org/abs/2109.07242v1},
}