EACL 2026

Machine Translation for Low Resource Turkic Languages: English-Tatar

Abstract

This paper outlines our winning submission to the English-to-Tatar translation task. We evaluated three strategies: few-shot prompting with Gemini 3 Pro Preview, specialized trans-tokenized Tweeties models, and the RL-distilled TranslateGemma family. Results demonstrate that large commercial models significantly outperform smaller specialized ones in this low-resource setting. Gemini secured first place with a chrF++ score of 56.71, surpassing the open-source baseline of 25.23.


Authors