Predicting Initial Essay Quality Scores to Increase the Efficiency of Comparative Judgment Assessments

Michiel De Vrindt; Anaïs Tack; Renske Bouwer; Wim Van Den Noortgate; Marije Lesterhuis

NAACL 2024

Abstract

Comparative judgment (CJ) is a method that can be used to assess the writing quality of student essays based on repeated pairwise comparisons by multiple assessors. Although the assessment method is known to have high validity and reliability, it can be particularly inefficient, as assessors must make many judgments before the scores become reliable. Prior research has investigated methods to improve the efficiency of CJ, yet these methods introduce additional challenges, notably stemming from the initial lack of information at the start of the assessment, which is known as a cold-start problem. This paper reports on a study in which we predict the initial quality scores of essays to establish a warm start for CJ. To achieve this, we construct informative prior distributions for the quality scores based on the predicted initial quality scores. Through simulation studies, we demonstrate that our approach increases the efficiency of CJ: On average, assessors need to make 30% fewer judgments for each essay to reach an overall reliability level of 0.70.
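The warm-start idea can be illustrated with a minimal sketch. The abstract does not state which model the authors use, so the sketch assumes the Bradley-Terry model that is commonly used for CJ, together with independent normal priors centered on the predicted scores. The function name `fit_scores`, its parameters, and the gradient-ascent fitting procedure are all hypothetical choices made for illustration, not the paper's implementation.

```python
import math


def fit_scores(comparisons, prior_means, prior_sd=1.0, lr=0.1, steps=2000):
    """MAP estimate of essay quality scores (illustrative sketch only).

    Assumes a Bradley-Terry likelihood and independent normal priors;
    neither assumption is confirmed by the abstract.

    comparisons: list of (winner, loser) essay index pairs.
    prior_means: predicted initial score per essay (hypothetical model output).
    """
    theta = list(prior_means)  # warm start: begin at the predicted scores
    for _ in range(steps):
        # Gradient of the log-prior pulls each score toward its prior mean.
        grad = [-(t - m) / prior_sd ** 2 for t, m in zip(theta, prior_means)]
        for winner, loser in comparisons:
            # Probability that the winner beats the loser under Bradley-Terry.
            p = 1.0 / (1.0 + math.exp(-(theta[winner] - theta[loser])))
            grad[winner] += 1.0 - p
            grad[loser] -= 1.0 - p
        # Plain gradient ascent on the log-posterior.
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


# Before any judgments, the estimates equal the predicted scores.
start = fit_scores([], [1.0, -1.0])

# One judgment that contradicts the prediction moves both scores toward each other.
updated = fit_scores([(1, 0)], [1.0, -1.0])
```

With no judgments the scores stay at their predicted values, which is the warm start; each new judgment then shifts the scores away from the prior, and a smaller `prior_sd` makes the predictions harder to override.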

Topics

Machine Learning > Learning Types > Semi-Supervised Learning
Machine Learning > Optimization & Theory > Bayesian Inference

Keywords

Bayesian inference, prior distribution, cold-start problem, essay scoring, comparative judgment
