A very comprehensive Classical Chinese (ancient Chinese) to modern Chinese parallel corpus
Data resources for Indonesian NLP
This repository contains the code and data of the paper titled "Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation" published in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020), November 16 - November 20, 2020.
OpusFilter - Parallel corpus processing toolkit
Multilingual sentence alignment using sentence embeddings
The Business Scene Dialogue corpus
A multilingual, multi-style and multi-granularity dataset for cross-language textual similarity detection
Leeds University and King Saud University (LK) Hadith Corpus
TUFS Asian Language Parallel Corpus
Neural Machine Translation on the Nepali-English language pair
Machine translation (MT) benchmark dataset for languages in the Horn of Africa.
Multilingual and monolingual corpora focused on Caucasus languages, for Natural Language Processing (NLP)
Curated list of publicly available parallel corpora for Indian languages
An easy-to-use library for linguistically comparing one sentence and its words to another, in the same language or a different one. It is useful, for instance, for comparing a translation with the original text, finding differences and similarities between two translations, or seeing how a machine translation differs from a reference translation.
The IIT Bombay English-Hindi Parallel Corpus
Machine Translation from Sanskrit to Hindi using Unsupervised and Supervised Learning
OPUS (opus.nlpl.eu) Python3 API
A Python application that generates parallel corpora for any language pair; the output can be used to train NMT (Neural Machine Translation) systems
A corpus that can be used to train English-to-Italian End-to-End Speech-to-Text Machine Translation models
🪱 PARASITE || A parallel sentence data preprocessing toolkit. Originally developed as a part of the `en-ru` winner submission of WMT20 Biomedical Translation Task.