Improving Proficiency and Grammar Accuracy for Chinese Language Learners with Large Language Models

Yuqi Liang; Wenjing Xu; Hongzhi Xu

AACL 2025

Abstract

In this study, we evaluate the performance of large language models (LLMs) in detecting and correcting grammatical errors made by Chinese language learners. We find that incorporating various linguistic features (such as dependency structures, parts of speech, and pinyin transliteration) into the prompts can potentially enhance model performance. Among these features, parts of speech and pinyin prove to be the most effective across all tested models. Additionally, our findings show that the success of error correction also depends on the severity of the errors. When the intended meaning is preserved, LLMs tend to provide accurate revisions following the principle of minimal editing. However, when the meaning is obscured, LLMs are more likely to produce divergent outputs, both in comparison to reference corrections and to the responses of other models.

Topics

Machine Learning > Application Areas > Domain Adaptation; Natural Language Processing > Understanding > Syntax; Natural Language Processing > Resources & Methods > Large Language Models

Keywords

grammatical error correction, dependency parsing, part-of-speech tagging, large language model, pinyin transliteration
