Statistical Machine Translation
Findings of WMT 2024 Shared Task on Low-Resource Indic Languages Translation
Pages
15
Time to read
39 mins
Publication
Language
English
This technical report presents the results of the low-resource Indic language translation shared task held at the Ninth Conference on Machine Translation (WMT 2024). Participants developed machine translation systems for four language pairs: English–Assamese, English–Mizo, English–Khasi, and English–Manipuri, using the IndicNE-Corp1.0 dataset, which comprises parallel and monolingual corpora for northeastern Indic languages. The systems were evaluated with automatic metrics (BLEU, TER, RIBES, METEOR, and ChrF) and with human assessment. The task aims to advance machine translation for low-resource Indic languages, which often lack sufficient resources and institutional support. The report also emphasizes the importance of continued efforts to document and revitalize these languages through technological solutions.
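To illustrate the kind of character-level scoring that a metric such as ChrF performs, the following is a minimal sketch, not the evaluation code used in the shared task. It assumes a simplified variant: whitespace is dropped, character n-gram precision and recall are averaged over n = 1 to 6, and the two are combined into an F-score with beta = 2. The function names `char_ngrams` and `simple_chrf` are hypothetical, and scores from this sketch will not necessarily match those of standard tools.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n, ignoring spaces (a simplification)."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str,
                max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified character n-gram F-score in the range [0, 1].

    Assumption: orders for which either string has no n-grams are skipped.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue
        # Clipped overlap: each n-gram counts at most as often as in the reference.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta = 2 weights recall more heavily than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Character-level matching of this kind gives partial credit when a hypothesis differs from the reference only in an affix, which is one reason such metrics are often reported alongside word-level BLEU for morphologically rich languages.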