ACL 2019
Noisy Channel for Low Resource Grammatical Error Correction
Abstract
This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model and a language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google’s BERT model, which we fine-tune for specific error types, and 2) OpenAI’s GPT-2 model, exploiting its ability to use previous sentences as context. Furthermore, we search for the optimal combinations of corrections using beam search.
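The noisy channel combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the confusion sets and probabilities in `TOY_CHANNEL` are invented for illustration (the paper estimates them from Wikipedia edit frequencies), and `toy_lm_logprob` is a hypothetical bigram stand-in for the BERT and GPT-2 language models. Each candidate correction is scored by adding the channel log-probability to the language model log-probability, and a beam search keeps only the highest-scoring partial sentences.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical confusion sets: observed token -> {candidate: P(observed | candidate)}.
# The values are made up; the paper derives them from Wikipedia edit frequencies.
TOY_CHANNEL: Dict[str, Dict[str, float]] = {
    "their": {"their": 0.8, "there": 0.2},
    "a": {"a": 0.8, "an": 0.2},
}


def toy_lm_logprob(prev: str, word: str) -> float:
    """Hypothetical bigram language model standing in for BERT / GPT-2."""
    likely = {("<s>", "there"), ("there", "is"), ("is", "an"), ("an", "apple")}
    return math.log(0.5) if (prev, word) in likely else math.log(0.01)


def beam_search_correct(
    tokens: List[str],
    channel: Dict[str, Dict[str, float]],
    lm_logprob: Callable[[str, str], float],
    beam_size: int = 4,
) -> List[str]:
    """Return the best-scoring correction under channel + language model."""
    # Each hypothesis is (cumulative log score, corrected tokens so far).
    beam: List[Tuple[float, List[str]]] = [(0.0, [])]
    for observed in tokens:
        # Tokens without a confusion set can only map to themselves.
        candidates = channel.get(observed, {observed: 1.0})
        expanded: List[Tuple[float, List[str]]] = []
        for score, prefix in beam:
            prev = prefix[-1] if prefix else "<s>"
            for cand, p_channel in candidates.items():
                # Noisy channel score: log P(observed | cand) + log P(cand | context).
                new_score = score + math.log(p_channel) + lm_logprob(prev, cand)
                expanded.append((new_score, prefix + [cand]))
        # Keep only the top `beam_size` partial corrections.
        expanded.sort(key=lambda h: h[0], reverse=True)
        beam = expanded[:beam_size]
    return beam[0][1]


corrected = beam_search_correct("their is a apple".split(), TOY_CHANNEL, toy_lm_logprob)
print(" ".join(corrected))
```

With these toy numbers, the language model outweighs the channel model's preference for leaving tokens unchanged, so both substitutions are selected. In a real system the balance between the two models would typically need tuning.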
Topics
Machine Learning > Core Methods > Representation Learning
Natural Language Processing > Generation > Language Modeling
Natural Language Processing > Generation > Text Generation
Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling
Natural Language Processing > Resources & Methods > Language Modeling
Natural Language Processing > Applications > Text Processing
Deep Learning > Learning Types > Sequence Modeling