Two-Headed Monster and Crossed Co-Attention Networks

Yaoyiran Li; Jing Jiang

AACL 2020

Abstract

This paper investigates a new co-attention mechanism in neural transduction models for machine translation. We propose a paradigm, termed Two-Headed Monster (THM), that consists of two symmetric encoder modules and one decoder module connected by co-attention. As a concrete implementation of THM, we design Crossed Co-Attention Networks (CCNs) based on the Transformer model. We test CCNs on the WMT 2014 EN-DE and WMT 2016 EN-FI translation tasks and report both advantages and disadvantages of the proposed method. Our model outperforms the strong Transformer baseline by 0.51 (big) and 0.74 (base) BLEU points on EN-DE and by 0.17 (big) and 0.47 (base) BLEU points on EN-FI, but the epoch time increases by approximately 75%.
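To make the idea of two symmetric encoder branches connected by co-attention more concrete, here is a minimal sketch in plain Python. The abstract does not specify how the branches are wired, so the sketch assumes that each branch draws its queries from itself and its keys and values from the other branch. The function names `attention` and `crossed_co_attention` are hypothetical, and the sketch omits learned projections, multiple heads, residual connections, and the decoder. It should not be read as the authors' CCN implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-width vectors."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def crossed_co_attention(left, right):
    """One illustrative crossed step (assumed wiring, not from the paper):
    each branch queries the other branch's representations."""
    new_left = attention(left, right, right)
    new_right = attention(right, left, left)
    return new_left, new_right
```

Under this assumed wiring, each output row of one branch is a convex combination of the other branch's rows, which is the sense in which the two "heads" exchange information.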

Authors

Yaoyiran Li, Jing Jiang

Topics

Deep Learning > Architectures > Transformers
Natural Language Processing > Generation > Machine Translation

Keywords

machine translation, co-attention mechanism, neural transduction

Related papers

Can Monolingual Pretrained Models Help Cross-Lingual Classification? 2020

Text Simplification with Reinforcement Learning Using Supervised Rewards on Grammaticality, Meaning Preservation, and Simplicity 2020

ISA: An Intelligent Shopping Assistant 2020

Social Media Medical Concept Normalization using RoBERTa in Ontology Enriched Text Similarity Framework 2020

Overcoming Resistance: The Normalization of an Amazonian Tribal Language 2020