Towards Explainable Chinese Native Learner Essay Fluency Assessment: Dataset, Tasks, and Method

Xinshu Shen; Hongyi Wu; Yadong Zhang; Man Lan; Xiaopeng Bai; Shaoguang Mao; Yuanbin Wu; Xinlin Zhuang; Li Cai

EMNLP 2024

Abstract

Grammatical Error Correction (GEC) is a crucial technique in Automated Essay Scoring (AES) for evaluating the fluency of essays. However, existing Chinese GEC datasets often fail to consider the importance of specific grammatical error types within compositional scenarios, lack data collected from native Chinese speakers, and largely overlook cross-sentence grammatical errors. Furthermore, the measurement of an essay's overall fluency is often neglected. To address these issues, we present CEFA (Chinese Essay Fluency Assessment), an extensive corpus derived from essays authored by native Chinese-speaking primary and secondary students and annotated with essay fluency scores along with both coarse- and fine-grained grammatical error types and corrections. Experiments with various benchmark models on CEFA confirm that the dataset is challenging. Our findings further highlight the significance of fine-grained annotations in fluency assessment and the mutually beneficial relationship between error types and corrections.

Topics

Artificial Intelligence > Core AI > Interpretability
Artificial Intelligence > Core AI > Natural Language Processing
Machine Learning > Application Areas > Domain Adaptation
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Applications > Grammatical Error Correction
Interdisciplinary > Linguistics > Computational Linguistics
Interdisciplinary > Social > Education
Interdisciplinary > Education

Keywords

natural language processing; text classification; grammatical error correction; error detection; automated essay scoring; Chinese language processing; fluency assessment; Chinese language learning; essay fluency; error type; essay fluency assessment; fine-grained error annotation
