Can Large Language Models Differentiate Harmful from Argumentative Essays? Steps Toward Ethical Essay Scoring

Hongjin Kim; Jeonghyun Kang; Harksoo Kim

2025 COLING COLING 2025

Can Large Language Models Differentiate Harmful from Argumentative Essays? Steps Toward Ethical Essay Scoring

Abstract

AbstractThis study addresses critical gaps in Automatic Essay Scoring (AES) systems and Large Language Models (LLMs) with regard to their ability to effectively identify and score harmful essays. Despite advancements in AES technology, current models often overlook ethically and morally problematic elements within essays, erroneously assigning high scores to essays that may propagate harmful opinions. In this study, we introduce the Harmful Essay Detection (HED) benchmark, which includes essays integrating sensitive topics such as racism and gender bias, to test the efficacy of various LLMs in recognizing and scoring harmful content. Our findings reveal that: (1) LLMs require further enhancement to accurately distinguish between harmful and argumentative essays, and (2) both current AES models and LLMs fail to consider the ethical dimensions of content during scoring. The study underscores the need for developing more robust AES systems that are sensitive to the ethical implications of the content they are scoring.

❓ The Questioner

🌉 Interdisciplinary Bridge — Artificial Intelligence and Natural Language Processing

🧭 Keyword Pioneer — harmful essay detection

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio

Authors

Hongjin Kim , Jeonghyun Kang , Harksoo Kim

Topics

Artificial Intelligence > Core AI > Responsible AI Natural Language Processing > Applications > Text Classification

Keywords

bias detection automatic essay scoring harmful essay detection ethical essay scoring content harmfulness

Download PDF

Related papers

Navigating Dialectal Bias and Ethical Complexities in Levantine Arabic Hate Speech Detection 2025

TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution 2025

Positive Text Reframing under Multi-strategy Optimization 2025

RAM2C: A Liberal Arts Educational Chatbot based on Retrieval-augmented Multi-role Multi-expert Collaboration 2025

Two-stage Incomplete Utterance Rewriting on Editing Operation 2025