2025 ACL ACL 2025

Zero at SemEval-2025 Task 2: Entity-Aware Machine Translation: Fine-Tuning NLLB for Improved Named Entity Translation

Abstract

AbstractMachine Translation (MT) is an essential tool for communication amongst people across different cultures, yet Named Entity (NE) translation remains a major challenge due to its rarity in occurrence and ambiguity. Traditional approaches, like using lexicons or parallel corpora, often fail to generalize to unseen entities, and hence do not perform well. To address this, we create a silver dataset using the Google Translate API and fine-tune the facebook/nllb200-distilled-600M model with LoRA (LowRank Adaptation) to enhance translation accuracy while also maintaining efficient memory use. Evaluated with metrics such as BLEU, COMET, and M-ETA, our results show that fine-tuning a specialized MT model improves NE translation without having to rely on largescale general-purpose models.

🐝 Cross-Pollinator — Artificial Intelligence, Computer Vision, Deep Learning, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Speech & Audio