Releases: hplt-project/bitextor-mt-models
HPLT Bitextor Models v1
This release contains the first version (v1) of fast Machine Translation (MT) models designed for integration with the Bitextor pipeline. The models were developed in 2023 with a focus on translation speed and efficiency for large-scale parallel corpus generation.
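As a rough illustration of how one of these models could be plugged into a translation step, the sketch below pipes a batch of sentences through marian-decoder. It assumes the release ships Marian-format models with a decoder config file and that marian-decoder is available on the PATH; the directory and file names are placeholders, not something guaranteed by this release or the official Bitextor integration.

```python
import subprocess
from pathlib import Path

# Hypothetical layout -- adjust to wherever a model from this release is
# unpacked; the directory and config file names are assumptions.
MODEL_DIR = Path("models/en-xx")
DECODER_CONFIG = MODEL_DIR / "config.yml"


def translate_lines(lines):
    """Translate a batch of source sentences with marian-decoder.

    A minimal sketch assuming a Marian-format model and a decoder
    config shipped alongside it; not the official Bitextor pipeline code.
    """
    result = subprocess.run(
        ["marian-decoder", "-c", str(DECODER_CONFIG)],
        input="\n".join(lines),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n").split("\n")


if __name__ == "__main__":
    print(translate_lines(["Hello world.", "How are you?"]))
```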
For more details on the underlying dataset and technologies used in this work, please refer to our paper:
Citation: de Gibert, O., Nail, G., Arefyev, N., Bañón, M., van der Linde, J., Ji, S., Zaragoza-Bernabeu, J., Aulamo, M., Ramírez-Sánchez, G., Kutuzov, A., Pyysalo, S., Oepen, S., & Tiedemann, J. (2024). "A New Massive Multilingual Dataset for High-Performance Language Technologies". In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Torino, Italia: ELRA and ICCL.