Skip to content
/ regemt Public

Regressive ensemble for machine translation evaluation

Notifications You must be signed in to change notification settings

MIR-MU/regemt

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

RegEMT: Regressive ensemble for machine translation evaluation

Test and publish

The master branch contains sources for reproducing our results reported in the WMT21 Metrics workshop.

See ablation-study for evaluating an impact of each of the ensembled metrics to the result, xling for zero-shot cross-lingual metric evaluation, multiling for evaluation of the fit on multiple languages, test_judgements for re-generating the submission, and docker-build for building a Docker image.

How to reproduce our results

Docker

To reproduce our results, you can use our miratmu/regemt Docker image using the NVIDIA Container Toolkit:

mkdir submit_dir
chmod 777 submit_dir

# test the installation on a data subsample before running the full evaluation process:
docker run --rm --gpus all -v "$PWD"/submit_dir:/submit_dir miratmu/regemt --fast

# simply run the evaluation on the full data sets:
# this takes ~10hrs on Tesla T4, might take longer on CPU
docker run --rm --gpus all -v "$PWD"/submit_dir:/submit_dir miratmu/regemt

The evaluation process will generate the correlation reports in .png and .pdf format for each of the evaluated configurations into the submit_dir/ directory.

Python

Alternatively, you can install our package using Python:

git clone https://github.com/MIR-MU/regemt.git
cd regemt
chmod 777 submit_dir

# install the dependencies
conda create --name wmt_eval python=3.8
conda activate wmt_eval
pip install -r requirements.txt

# test the installation on a data subsample before running the full evaluation process:
python -m main --fast

# simply run the evaluation on the full data sets:
# this takes ~10hrs on Tesla T4, might take longer on CPU
python -m main

The evaluation process will generate the correlation reports in .png and .pdf format for each of the evaluated configurations into the regemt/ directory.


We're trying to keep it simple, but if you get into any trouble, or have a question, don't hesitate to create an issue and we'll take a look!

Citing RegEmt

Text

ŠTEFÁNIK, Michal, Vít NOVOTNÝ and Petr SOJKA. Regressive Ensemble for Machine Translation Quality Evaluation. In Markus Freitag. Proceedings of EMNLP 2021 Sixth Conference on Machine Translation (WMT 21). ACL, 2021. 8 pp.

BibTeX

@inproceedings{stefanik2021regressive,
  author = {\v{S}tef\'{a}nik, Michal and Novotn\'{y}, V\'{i}t and Sojka, Petr},
  title = {Regressive Ensemble for Machine Translation Quality Evaluation},
  booktitle = {Proceedings of {EMNLP} 2021 Sixth Conference on Machine Translation ({WMT} 21)},
  editor = {Markus Freitag},
  publisher = {ACL},
  numpages = {8},
  url = {https://arxiv.org/abs/2109.07242v1},
}

About

Regressive ensemble for machine translation evaluation

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published