A Chain-of-Task Framework for Instruction Tuning of LLMs Based on Chinese Grammatical Error Correction

Xinpeng Liu; Bing Xu; Muyun Yang; Hailong Cao; Conghui Zhu; Tiejun Zhao; Wenpeng Lu

COLING 2025

Abstract

Over-correction is a critical issue when large language models (LLMs) are applied to the Grammatical Error Correction (GEC) task, especially for Chinese. This paper proposes a Chain-of-Task (CoTask) framework to reduce over-correction. The framework is applied as multi-task instruction tuning of LLMs: it decomposes the process of grammatical error analysis to design auxiliary tasks, and it adjusts the types and combinations of training tasks. A supervised fine-tuning (SFT) strategy is also presented to enhance the performance of LLMs, together with an algorithm for automatic dataset annotation that avoids additional manual annotation cost. Experimental results show that the method achieves new state-of-the-art results on both the FCGEC (in-domain) and NaCGEC (out-of-domain) test sets.
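As a rough illustration of how auxiliary tasks might be derived automatically from existing GEC pairs, the sketch below aligns an erroneous sentence with its correction and emits a small chain of training samples. It is a minimal sketch under stated assumptions, not the paper's algorithm: the abstract does not specify the auxiliary tasks or the annotation procedure, so the task names (`detect`, `locate`, `correct`), the character-level alignment via Python's standard `difflib`, and the function names `annotate_edits` and `build_task_chain` are all hypothetical.

```python
from difflib import SequenceMatcher


def annotate_edits(source: str, target: str) -> list[dict]:
    """Derive character-level edit operations from a (source, target) pair.

    Assumption: a plain difflib alignment stands in for the paper's
    automatic annotation algorithm, which the abstract does not describe.
    """
    matcher = SequenceMatcher(a=source, b=target, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged spans carry no error annotation
        edits.append({
            "op": tag,              # "replace", "delete", or "insert"
            "start": i1,            # span in the source sentence
            "end": i2,
            "old": source[i1:i2],
            "new": target[j1:j2],
        })
    return edits


def build_task_chain(source: str, target: str) -> list[dict]:
    """Build a hypothetical chain of instruction-tuning samples.

    The three task names are illustrative placeholders, ordered from
    coarse analysis (is there an error?) to the final correction.
    """
    edits = annotate_edits(source, target)
    return [
        {"task": "detect", "input": source,
         "output": "incorrect" if edits else "correct"},
        {"task": "locate", "input": source, "output": edits},
        {"task": "correct", "input": source, "output": target},
    ]


if __name__ == "__main__":
    for sample in build_task_chain("abXcd", "abcd"):
        print(sample)
```

In this sketch a sentence that is already correct yields a `detect` sample labelled `correct` and a `correct` sample whose output equals its input, which is the kind of signal that could discourage over-correction during tuning.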

Topics

Artificial Intelligence > Core AI > Multi-Agent Systems
Artificial Intelligence > Learning Paradigms > Transfer Learning
Natural Language Processing > Understanding > Syntax

Keywords

multi-task learning, grammatical error correction, instruction tuning, Chinese language, large language model

Related papers

Navigating Dialectal Bias and Ethical Complexities in Levantine Arabic Hate Speech Detection 2025

TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution 2025

Positive Text Reframing under Multi-strategy Optimization 2025

RAM2C: A Liberal Arts Educational Chatbot based on Retrieval-augmented Multi-role Multi-expert Collaboration 2025

Two-stage Incomplete Utterance Rewriting on Editing Operation 2025