ACL 2019
Creating a Corpus for Russian Data-to-Text Generation Using Neural Machine Translation and Post-Editing
Abstract
In this paper, we propose an approach for semi-automatically creating a data-to-text (D2T) corpus for Russian that can be used to learn a D2T natural language generation model. An error analysis of the output of an English-to-Russian neural machine translation system shows that 80% of the automatically translated sentences contain an error and that 53% of all translation errors bear on named entities (NE). We therefore focus on named entities and introduce two post-editing techniques for correcting wrongly translated NEs.
Topics
Natural Language Processing > Generation > Text Generation
Natural Language Processing > Applications > Machine Translation
Natural Language Processing > Resources & Methods > Text Representation
Natural Language Processing > Generation > Machine Translation
Deep Learning > Learning Types > Generative Models