Correcting Challenging Finnish Learner Texts With Claude, GPT-3.5 and GPT-4 Large Language Models

Mathias Creutz

EACL 2024

Abstract

This paper studies the correction of challenging authentic Finnish learner texts at beginner level (CEFR A1). Three state-of-the-art large language models are compared, and it is shown that GPT-4 outperforms GPT-3.5, which in turn outperforms Claude v1 on this task. Additionally, ensemble models based on classifiers combining outputs of multiple single models are evaluated. The highest accuracy for an ensemble model is 84.3%, whereas the best single model, which is a GPT-4 model, produces sentences that are fully correct 83.3% of the time. In general, the different models perform on a continuum, where grammatical correctness, fluency and coherence go hand in hand.
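As a rough illustration of the two ideas in the abstract, the sketch below computes a sentence-level accuracy (the share of output sentences judged fully correct) and picks one correction per sentence from several models' outputs. It is only a sketch under stated assumptions: the function names, the toy data, and the `score` callback are hypothetical, and the abstract does not describe the classifiers actually used in the paper's ensembles.

```python
from typing import Callable, Dict, List


def sentence_accuracy(judgments: List[bool]) -> float:
    """Fraction of output sentences judged fully correct."""
    if not judgments:
        raise ValueError("no judgments given")
    return sum(judgments) / len(judgments)


def ensemble_pick(candidates: Dict[str, str],
                  score: Callable[[str], float]) -> str:
    """Return the candidate correction with the highest score.

    `candidates` maps a model name to that model's corrected sentence.
    `score` is a hypothetical stand-in for a trained classifier; the
    paper's actual ensemble features are not given in the abstract.
    """
    best_model = max(candidates, key=lambda name: score(candidates[name]))
    return candidates[best_model]


# Toy, made-up judgments: three of four sentences are fully correct.
toy_judgments = [True, True, False, True]
toy_accuracy = sentence_accuracy(toy_judgments)

# Toy, made-up candidates with a placeholder scorer (string length).
toy_candidates = {"model_a": "x", "model_b": "xyz"}
toy_choice = ensemble_pick(toy_candidates, score=len)
```

In practice the placeholder scorer would be replaced by whatever classifier is trained to predict which model's output is correct for a given sentence.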

Topics

Natural Language Processing > Generation > Text Generation
Natural Language Processing > Resources & Methods > Large Language Models
Interdisciplinary > Linguistics > Computational Linguistics
Natural Language Processing > Applications > Text Generation

Keywords

language model, ensemble model, text correction, learner text, CEFR A1, grammatical correctness, large language model
