Secoco: Self-Correcting Encoding for Neural Machine Translation

Tao Wang; Chengqi Zhao; Mingxuan Wang; Lei Li; Hang Li; Deyi Xiong

2021 EMNLP EMNLP 2021

Secoco: Self-Correcting Encoding for Neural Machine Translation

Abstract

AbstractThis paper presents Self-correcting Encoding (Secoco), a framework that effectively deals with noisy input for robust neural machine translation by introducing self-correcting predictors. Different from previous robust approaches, Secoco enables NMT to explicitly correct noisy inputs and delete specific errors simultaneously with the translation decoding process. Secoco is able to achieve significant improvements over strong baselines on two real-world test sets and a benchmark WMT dataset with good interpretability. We will make our code and dataset publicly available soon.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — self-correcting encoding

🐣 Hot Topic Early Bird — error correction

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio

Authors

Tao Wang , Chengqi Zhao , Mingxuan Wang , Lei Li , Hang Li , Deyi Xiong

Topics

Machine Learning > Application Areas > Domain Adaptation Natural Language Processing > Applications > Machine Translation Natural Language Processing > Generation > Machine Translation Deep Learning > Learning Types > Representation Learning Machine Learning > Learning Types > Robustness

Keywords

neural machine translation error correction noisy input robust translation self-correcting encoding noisy input handling

Download PDF

Related papers

Continual Learning in Multilingual NMT via Language-Specific Embeddings 2021

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents 2021

Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity 2021

Neural Machine Translation with Heterogeneous Topic Knowledge Embeddings 2021

Semantics-Preserved Data Augmentation for Aspect-Based Sentiment Analysis 2021