EMNLP 2018
Using Wikipedia Edits in Low Resource Grammatical Error Correction
Abstract
We develop a grammatical error correction (GEC) system for German using a small gold GEC corpus augmented with edits extracted from Wikipedia revision history. We extend the automatic error annotation tool ERRANT (Bryant et al., 2017) for German and use it to analyze both gold GEC corrections and Wikipedia edits (Grundkiewicz and Junczys-Dowmunt, 2014) in order to select as additional training data Wikipedia edits containing grammatical corrections similar to those in the gold corpus. Using a multilayer convolutional encoder-decoder neural network GEC approach (Chollampatt and Ng, 2018), we evaluate the contribution of Wikipedia edits and find that carefully selected Wikipedia edits increase performance by over 5%.
Topics
Deep Learning > Architectures > Transformers
Natural Language Processing > Generation > Text Generation
Natural Language Processing > Applications > Text Classification
Interdisciplinary > Linguistics > Computational Linguistics
Machine Learning > Learning Types > Transfer Learning
Deep Learning > Learning Types > Deep Learning
Artificial Intelligence > Core AI > Natural Language Processing