This repository contains the code for the experiments in "Model-Based Minimum Bayes Risk Decoding for Text Generation" (ICML 2024).
The code is tested on Ubuntu 20.04 using Python 3.8 and CUDA 11.0 (Docker image nvidia/cuda:11.0.3-cudnn8-devel-ubuntu20.04).
git clone https://github.com/CyberAgentAILab/model-based-mbr.git
cd model-based-mbr
pip install -r requirements.txt
The code runs in two steps:
- sample.sh samples candidates.
- run_mbr.sh computes the MBR and MBMBR outputs from the sampled candidates.
./experiments/sample.sh -d [DATASET] -s [NUMBER OF SAMPLES]
./experiments/run_mbr.sh -d [DATASET] -s [NUMBER OF SAMPLES]
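To make the second step more concrete, here is a minimal sketch of how an MBR output and a model-based MBR output could be selected from a set of sampled candidates. It is not the code in run_mbr.sh. The function names (`unigram_f1`, `monte_carlo_weights`, `model_based_weights`, `mbr_decode`) and the toy utility are assumptions made for illustration, and the model-based weighting follows the paper's description as I understand it: model probabilities normalized over the distinct samples, in place of Monte Carlo sample frequencies.

```python
import math
from collections import Counter
from typing import Callable, Dict, Sequence


def unigram_f1(hyp: str, ref: str) -> float:
    """Toy utility (assumption): F1 overlap of whitespace tokens.

    A stand-in for a real utility function such as BLEU or BERTScore.
    """
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def monte_carlo_weights(samples: Sequence[str]) -> Dict[str, float]:
    """Estimate each distinct sample's probability by its relative frequency."""
    counts = Counter(samples)
    return {y: c / len(samples) for y, c in counts.items()}


def model_based_weights(log_probs: Dict[str, float]) -> Dict[str, float]:
    """Normalize model probabilities over the distinct samples.

    `log_probs` maps each distinct sample to its log-probability under the
    model. Subtracting the maximum keeps the exponentials numerically stable.
    """
    m = max(log_probs.values())
    unnorm = {y: math.exp(lp - m) for y, lp in log_probs.items()}
    z = sum(unnorm.values())
    return {y: v / z for y, v in unnorm.items()}


def mbr_decode(weights: Dict[str, float],
               utility: Callable[[str, str], float]) -> str:
    """Return the candidate with the highest expected utility under `weights`.

    The distinct samples serve as both candidates and pseudo-references.
    """
    def expected_utility(h: str) -> float:
        return sum(p * utility(h, y) for y, p in weights.items())

    return max(weights, key=expected_utility)


# Hypothetical samples and log-probabilities, for illustration only.
samples = ["the cat sat", "the cat sat", "a dog ran", "the cat sat down"]
log_probs = {"the cat sat": -2.0, "a dog ran": -3.0, "the cat sat down": -0.5}

mbr_output = mbr_decode(monte_carlo_weights(samples), unigram_f1)
mbmbr_output = mbr_decode(model_based_weights(log_probs), unigram_f1)
```

The two estimators differ only in the weights passed to `mbr_decode`, so the same selection loop covers both variants. In this toy input the outputs differ because the model assigns most of its probability mass to a sequence that was sampled only once.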
- Use sacrebleu to prepare the benchmark dataset.
mkdir -p ./dataset/wmt19-text
sacrebleu -t wmt19 -l en-de --echo src > ./dataset/wmt19-text/wmt19.en-de.en
sacrebleu -t wmt19 -l en-de --echo ref > ./dataset/wmt19-text/wmt19.en-de.de
- Sampling sequences on WMT'19 En-De
./experiments/sample.sh -d wmt19.en-de -s 32
- Computing the MBR output on WMT'19 En-De
./experiments/run_mbr.sh -d wmt19.en-de -s 32
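If either script fails on a prepared dataset, one thing worth checking is that the source and reference files written by sacrebleu are line-aligned. The following is a small sketch of such a check. The helper names `count_lines` and `is_line_aligned` are hypothetical and not part of this repository.

```python
def count_lines(path: str) -> int:
    """Count the lines in a UTF-8 text file without loading it into memory."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def is_line_aligned(src_path: str, ref_path: str) -> bool:
    """Return True if both files are non-empty and have the same line count."""
    n_src = count_lines(src_path)
    n_ref = count_lines(ref_path)
    return n_src > 0 and n_src == n_ref
```

For the files created above, this could be called as `is_line_aligned("./dataset/wmt19-text/wmt19.en-de.en", "./dataset/wmt19-text/wmt19.en-de.de")`.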
MBMBR is also implemented in the mbrs library, which is available on PyPI:
pip install mbrs
The mbrs library is maintained for running various MBR decoding algorithms. It is compatible with both Hugging Face's transformers and fairseq.
BibTeX:
@InProceedings{pmlr-v235-jinnai24a,
title = {Model-Based Minimum {B}ayes Risk Decoding for Text Generation},
author = {Jinnai, Yuu and Morimura, Tetsuro and Honda, Ukyo and Ariu, Kaito and Abe, Kenshi},
booktitle = {Proceedings of the 41st International Conference on Machine Learning},
pages = {22326--22347},
year = {2024},
editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
volume = {235},
series = {Proceedings of Machine Learning Research},
month = {21--27 Jul},
publisher = {PMLR},
pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/jinnai24a/jinnai24a.pdf},
url = {https://proceedings.mlr.press/v235/jinnai24a.html},
}
For any questions, feel free to raise an issue or contact me at [email protected].
The MS COCO dataset is licensed under a Creative Commons BY 4.0 license.