NLIP at BEA 2025 Shared Task: Evaluation of Pedagogical Ability of AI Tutors

Trishita Saha; Shrenik Ganguli; Maunendra Sankar Desarkar

ACL 2025

Abstract

This paper describes the system created for the BEA 2025 Shared Task on Pedagogical Ability Assessment of AI-powered Tutors. The task assesses how well AI tutors identify and locate errors made by students, provide guidance, and ensure actionability, among other features of their responses in educational dialogues. Transformer-based models, especially DeBERTa and RoBERTa, are enhanced with multitask learning, threshold tuning, ordinal regression, and oversampling. The strong performance of the best systems across all evaluation tracks demonstrates the effectiveness of pedagogically driven training methods and bespoke transformer models for evaluating AI tutor quality.
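
To make two of the techniques named above concrete, the following is a minimal sketch of ordinal regression with tunable decision thresholds, written in plain Python. It is an illustration under stated assumptions, not the system described in the paper: the three-level label order in `LABELS`, the cumulative binary encoding, and the helper names `encode_ordinal` and `decode_ordinal` are all hypothetical and are not taken from the abstract.

```python
# Hypothetical label order; the abstract does not list the label set.
LABELS = ["No", "To some extent", "Yes"]


def encode_ordinal(label_index, num_classes=len(LABELS)):
    """Turn class k into K-1 cumulative binary targets: [k > 0, k > 1, ...]."""
    return [1 if label_index > j else 0 for j in range(num_classes - 1)]


def decode_ordinal(probs, thresholds=None):
    """Predict a class index from K-1 cumulative probabilities.

    The prediction is the number of leading probabilities that reach
    their threshold. Thresholds default to 0.5 and could instead be
    tuned per boundary on a validation set (an assumption here).
    """
    if thresholds is None:
        thresholds = [0.5] * len(probs)
    rank = 0
    for p, t in zip(probs, thresholds):
        if p < t:
            break  # stop at the first boundary the model does not clear
        rank += 1
    return rank


if __name__ == "__main__":
    print(encode_ordinal(1))                         # targets for the middle class
    print(LABELS[decode_ordinal([0.9, 0.4])])        # default thresholds
    print(LABELS[decode_ordinal([0.9, 0.4], [0.5, 0.3])])  # lowered second threshold
```

In this sketch, lowering the second threshold moves a borderline prediction from the middle class to the top class, which is the kind of adjustment threshold tuning is meant to capture for under-predicted labels.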
