2019
ACL
ACL 2019
Dynamically Composing Domain-Data Selection with Clean-Data Selection by “Co-Curricular Learning” for Neural Machine Translation
Abstract
AbstractNoise and domain are important aspects of data quality for neural machine translation. Existing research focus separately on domain-data selection, clean-data selection, or their static combination, leaving the dynamic interaction across them not explicitly examined. This paper introduces a “co-curricular learning” method to compose dynamic domain-data selection with dynamic clean-data selection, for transfer learning across both capabilities. We apply an EM-style optimization procedure to further refine the “co-curriculum”. Experiment results and analysis with two domains demonstrate the effectiveness of the method and the properties of data scheduled by the co-curriculum.
🌉
Interdisciplinary Bridge
— Machine Learning and Natural Language Processing
🧭
Keyword Pioneer
— co-curricular learning
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Speech & Audio
📈
Trend Setter
— Curriculum Learning
🐣
Hot Topic Early Bird
— data selection