NAACL 2022
Generation of Synthetic Error Data of Verb Order Errors for Swedish
Abstract
We report on our work in progress to generate a synthetic error dataset for Swedish by replicating errors observed in an authentic error-annotated dataset. We analyze a small subset of authentic errors, capture regular patterns based on parts of speech, and design a set of rules to corrupt new data. We explore the approach and identify its capabilities, advantages, and limitations as a way to enrich the existing collection of error-annotated data. This work focuses on word order errors, specifically those involving the placement of finite verbs in a sentence.
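To make the idea of a part-of-speech-based corruption rule concrete, the following is a minimal sketch of what one such rule might look like. It is not the authors' rule set: the function name `corrupt_verb_second`, the tag `VERB_FIN`, the choice of subject tags, and the example sentence are all assumptions made for illustration. The rule takes a correct Swedish sentence in which a fronted adverb is followed by the finite verb and then the subject, and moves the verb after the subject, producing a verb placement error.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (word form, part-of-speech tag)

# Hypothetical tag names for this sketch; not the tagset used in the paper.
FINITE_VERB = "VERB_FIN"
SUBJECT_TAGS = {"PRON", "NOUN", "PROPN"}


def corrupt_verb_second(tokens: List[Token]) -> Optional[List[Token]]:
    """Swap a finite verb in second position with the subject that follows it.

    Returns a corrupted copy of the sentence, or None if the pattern
    (adverb, finite verb, subject) does not match at the sentence start.
    """
    if len(tokens) < 3:
        return None
    first, second, third = tokens[0], tokens[1], tokens[2]
    # Pattern: fronted adverb, finite verb, then a subject-like token.
    if first[1] == "ADV" and second[1] == FINITE_VERB and third[1] in SUBJECT_TAGS:
        return [first, third, second] + tokens[3:]
    return None


# Correct sentence: "Idag går jag till skolan" ("Today I go to school").
sentence = [("Idag", "ADV"), ("går", "VERB_FIN"), ("jag", "PRON"),
            ("till", "ADP"), ("skolan", "NOUN")]
corrupted = corrupt_verb_second(sentence)
print(" ".join(word for word, _ in corrupted))  # prints "Idag jag går till skolan"
```

Returning `None` when the pattern does not match keeps the rule conservative: only sentences that fit the pattern are corrupted, so each synthetic error can be traced back to the rule that produced it.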