Advances in Auto-Grading with Large Language Models: A Cross-Disciplinary Survey

Tania Amanda Nkoyo Frederick Eneye; Chukwuebuka Fortunate Ijezue; Ahmad Imam Amjad; Maaz Amjad; Sabur Butt; Gerardo Castañeda-Garza

ACL 2025

Abstract

With the rise and widespread adoption of Large Language Models (LLMs) in recent years, extensive research has been conducted on their applications across various domains. One such domain is education, where a key area of interest is the implementation and reliability of LLMs in grading student responses. This review paper examines studies on the use of LLMs in grading across six academic sub-fields: educational assessment, essay grading, natural sciences and technology, social sciences and humanities, computer science and engineering, and mathematics. It explores how different LLMs are applied in automated grading, the prompting techniques employed, the effectiveness of LLM-based grading for both structured and open-ended responses, and the patterns observed in grading performance. The paper also discusses the challenges associated with LLM-based grading systems, such as inconsistencies and the need for human oversight. By synthesizing existing research, it provides insights into the current capabilities of LLMs in academic assessment and serves as a foundation for future exploration in this area.
