Detection of Chinese Word Usage Errors for Non-Native Chinese Learners with Bidirectional LSTM

Yow-Ting Shiue; Hen-Hsen Huang; Hsin-Hsi Chen

ACL 2017

Abstract

Selecting appropriate words to compose a sentence is a common problem faced by non-native Chinese learners. In this paper, we propose (bidirectional) LSTM sequence labeling models and explore various features to detect word usage errors in Chinese sentences. By combining CWINDOW word embedding features and POS information, the best bidirectional LSTM model achieves an accuracy of 0.5138 and an MRR of 0.6789 on the HSK dataset. For 80.79% of the test data, the model ranks the ground truth within the top two at the position level.
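The three figures quoted in the abstract (accuracy, MRR, and the top-two rate) can all be derived from the rank a model assigns to the ground-truth position. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes each test sentence has exactly one gold error position and that the model emits one score per token position; the names `rank_of_gold` and `evaluate` are hypothetical, and the tie-breaking rule (lower index wins) is an assumption.

```python
from typing import Dict, List, Sequence, Tuple


def rank_of_gold(scores: Sequence[float], gold: int) -> int:
    """Return the 1-based rank of the gold position.

    Positions are ordered by descending score. Python's sort is stable,
    so tied positions keep their original left-to-right order
    (an assumption made for this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return order.index(gold) + 1


def evaluate(examples: List[Tuple[Sequence[float], int]], k: int = 2) -> Dict[str, float]:
    """Compute accuracy, MRR, and hit@k over (scores, gold_position) pairs."""
    ranks = [rank_of_gold(scores, gold) for scores, gold in examples]
    n = len(ranks)
    return {
        # Fraction of sentences where the gold position is ranked first.
        "accuracy": sum(r == 1 for r in ranks) / n,
        # Mean reciprocal rank of the gold position.
        "mrr": sum(1.0 / r for r in ranks) / n,
        # Fraction of sentences where the gold position is in the top k.
        "hit_at_k": sum(r <= k for r in ranks) / n,
    }


if __name__ == "__main__":
    # Toy data (invented for illustration): per-position error scores
    # and the index of the gold error position.
    toy = [
        ([0.1, 0.7, 0.2], 1),  # gold ranked 1st
        ([0.5, 0.3, 0.2], 1),  # gold ranked 2nd
        ([0.2, 0.3, 0.5], 0),  # gold ranked 3rd
    ]
    print(evaluate(toy, k=2))
```

Under this reading, accuracy is the share of sentences with rank 1, and the 80.79% figure corresponds to `hit_at_k` with `k=2`.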

Topics

Natural Language Processing > Applications > Text Classification

Keywords

sequence labeling, part-of-speech tagging, error correction, bidirectional LSTM, word usage error detection

Related papers

A* CCG Parsing with a Supertag and Dependency Factored Model 2017

Detecting annotation noise in automatically labelled data 2017

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2017

Annotating tense, mood and voice for English, French and German 2017

Word Embedding for Response-To-Text Assessment of Evidence 2017