EduVidQA: Generating and Evaluating Long-form Answers to Student Questions based on Lecture Videos

Sourjyadip Ray; shubham sharma; Somak Aditya; Pawan Goyal

2025 EMNLP EMNLP 2025

EduVidQA: Generating and Evaluating Long-form Answers to Student Questions based on Lecture Videos

Abstract

AbstractAs digital platforms redefine educational paradigms, ensuring interactivity remains vital for effective learning. This paper explores using Multimodal Large Language Models (MLLMs) to automatically respond to student questions from online lectures - a novel question answering task of real world significance. We introduce the EduVidQA Dataset with 5252 question-answer pairs (both synthetic and real-world) from 296 computer science videos covering diverse topics and difficulty levels. To understand the needs of the dataset and task evaluation, we empirically study the qualitative preferences of students, which we provide as an important contribution to this line of work. Our benchmarking experiments consist of 6 state-of-the-art MLLMs, through which we study the effectiveness of our synthetic data for finetuning, as well as showing the challenging nature of the task. We evaluate the models using both text-based and qualitative metrics, thus showing a nuanced perspective of the models’ performance, which is paramount to future work. This work not only sets a benchmark for this important problem, but also opens exciting avenues for future research in the field of Natural Language Processing for Education.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Interdisciplinary and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Sourjyadip Ray , shubham sharma , Somak Aditya , Pawan Goyal

Topics

Artificial Intelligence > Core AI > Multimodal Learning Natural Language Processing > Applications > Question Answering Natural Language Processing > Resources & Methods > Large Language Models Artificial Intelligence > Core AI > Large Language Models Deep Learning > Models > Large Language Models Interdisciplinary > Education Deep Learning > Models > Multi-Modal Learning

Keywords

question answering video understanding educational technology synthetic data generation synthetic datum multimodal large language model video question answering lecture video long-form answer

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025