INTERSPEECH 2024

Leveraging Large Language Models to Refine Automatic Feedback Generation at Articulatory Level in Computer Aided Pronunciation Training

Abstract

This study explores the potential of leveraging Large Language Models (LLMs) to refine automatic feedback generation in Computer-Aided Pronunciation Training (CAPT). Specifically, it evaluates the impact of two factors on the effectiveness of automatically generated pronunciation feedback: (1) the use of mispronunciation detection results at different levels of granularity as prompts for GPT-4 models to generate automatic feedback, and (2) the fine-tuning of GPT-4 models on specific prompt-feedback pairs aimed at optimizing feedback generation. Feedback generated through each approach is rated by second language (L2) learners in terms of comprehensibility and helpfulness. The results highlight both the potential of using LLMs for automatic feedback generation and the effectiveness of articulatory-level representations. Our accessible demonstrations invite further exploration.
