Domain adapted machine translation: What does catastrophic forgetting forget and why?

Danielle Saunders; Steve DeNeefe

EMNLP 2024

Abstract

Neural Machine Translation (NMT) models can be specialized by domain adaptation, often involving fine-tuning on a dataset of interest. This process risks catastrophic forgetting: rapid loss of generic translation quality. Forgetting has been widely observed, with many mitigation methods proposed. However, the causes of forgetting and the relationship between forgetting and adaptation data are underexplored. This paper takes a novel approach to understanding catastrophic forgetting during NMT adaptation by investigating the impact of the data. We provide a first investigation of what is forgotten, and why. We examine the relationship between forgetting and the in-domain data, and show that the amount and type of forgetting are linked to that data’s target vocabulary coverage. Our findings pave the way toward better informed NMT domain adaptation.

Topics

Machine Learning > Learning Types > Continual Learning
Natural Language Processing > Applications > Machine Translation
Artificial Intelligence > Learning Paradigms > Domain Adaptation
Artificial Intelligence > Core AI > Machine Translation

Keywords

catastrophic forgetting, domain adaptation, neural machine translation, target vocabulary
