Statistical Machine Translation
Testing Machine Translation of Idioms and Proper Names
Pages
8
Time to read
25 mins
Publication
Language
English
This research article documents the Árni Magnússon Institute team's submission to the WMT24 test suite subtask, which focuses on the challenges that idiomatic expressions and proper names pose in the English-to-Icelandic translation direction. The study describes the creation of two test suites: one for evaluating the translation of common English idioms and their literal counterparts, and another for assessing the translation of place names into their Icelandic exonyms. The low scores achieved indicate that current machine translation systems struggle with both categories. The paper argues that these challenges expose the limitations of modern translation models, particularly those based on Large Language Models (LLMs). By releasing their test suites and evaluation code, the authors aim to provide a foundation for further research and model comparison in this area.