WACV 2026

Ordinal-Aware Multimodal Engagement Recognition for Collaborative Learning

Abstract

Assessing student engagement is critical for collaborative learning but remains a challenging task. Existing approaches often rely on controlled laboratory or online settings, which fail to capture the complexity of real-world classrooms. Furthermore, current datasets are scarce and rarely provide both individual- and group-level annotations, limiting the development of robust and generalizable models. To address these gaps, we propose CORE-Net, a multimodal architecture that integrates context modeling to capture group-level dynamics and ordinal supervision to account for the ordinal nature of engagement levels. We also present COLER, a large-scale dataset collected in authentic classroom environments with rich annotations at both the individual and group levels. Experiments demonstrate that CORE-Net achieves 89.63% accuracy and a quadratic weighted kappa (QWK) of 94.80, significantly outperforming strong baselines such as BlockGCN and MoViNet. Ablation studies further highlight the critical role of both context modeling and ordinal supervision. Our work establishes a robust and scalable foundation for automated engagement assessment, supporting timely feedback and enhancing the effectiveness of collaborative learning.
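
Since QWK is reported alongside accuracy, a minimal sketch of how quadratic weighted kappa is commonly computed may help readers interpret the figure. This is the standard textbook definition, not code from the paper. The function name `quadratic_weighted_kappa`, the integer label encoding `0..num_classes-1`, and the toy labels are illustrative assumptions. The sketch returns kappa on the [-1, 1] scale, so the reported 94.80 would correspond to a value scaled by 100 (an assumption; the abstract does not state the scale).

```python
def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic weights for ordinal labels.

    Labels are assumed to be integers in 0..num_classes-1 (hypothetical
    encoding; the abstract does not specify how engagement levels are coded).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")

    k, n = num_classes, len(y_true)

    # Observed confusion counts: rows are true labels, columns are predictions.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1

    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            # Quadratic penalty: grows with the squared distance between levels.
            weight = (i - j) ** 2 / (k - 1) ** 2
            # Count expected under chance agreement (independent marginals).
            expected = row_totals[i] * col_totals[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Only happens when every true and predicted label is the same class.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    truth = [0, 0, 2, 2]
    near_miss = [0, 1, 2, 1]  # errors are off by one level
    far_miss = [0, 2, 2, 0]   # errors are off by two levels
    print(quadratic_weighted_kappa(truth, near_miss, 3))
    print(quadratic_weighted_kappa(truth, far_miss, 3))
```

Both toy predictions have the same plain accuracy (two of four correct), yet the near-miss prediction scores higher under QWK. This is the property that makes the metric suitable for ordered engagement levels.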
