Dysarthric Speech Recognition Using Curriculum Learning and Articulatory Feature Embedding

I-Ting Hsieh; Chung-Hsien Wu

INTERSPEECH 2024

Abstract

Recognizing speech from individuals with articulation disorders is challenging because of limited resources and diverse speaker characteristics. Domain adaptation is commonly employed to address these issues. In this paper, we apply curriculum learning, a domain adaptation method, to Automatic Speech Recognition (ASR). To make curriculum learning more efficient, we reorganize the dataset. Additionally, we incorporate speaker and articulatory features to capture patients' pronunciation characteristics. Experimental results show that the proposed method achieves an 11.37% improvement over the baseline.
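The abstract does not say how the dataset is reorganized or how difficulty is measured, so the following is only a generic sketch of curriculum ordering, not the authors' procedure. It assumes each utterance carries a hypothetical `difficulty` score (lower means easier) and that training proceeds over cumulative pools that grow from easy to hard. The names `Utterance` and `curriculum_stages` are illustrative.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence


@dataclass(frozen=True)
class Utterance:
    utt_id: str
    difficulty: float  # hypothetical score (assumption); lower means easier


def curriculum_stages(
    utterances: Sequence[Utterance], num_stages: int
) -> Iterator[List[Utterance]]:
    """Yield cumulative training pools, easiest utterances first."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ordered = sorted(utterances, key=lambda u: u.difficulty)
    for stage in range(1, num_stages + 1):
        # Integer arithmetic guarantees the final stage covers the full set.
        cutoff = len(ordered) * stage // num_stages
        yield ordered[:cutoff]


if __name__ == "__main__":
    data = [
        Utterance("a", 0.9),
        Utterance("b", 0.1),
        Utterance("c", 0.5),
        Utterance("d", 0.3),
    ]
    for i, pool in enumerate(curriculum_stages(data, num_stages=2), start=1):
        print(i, [u.utt_id for u in pool])
```

In a real system the difficulty score might come from something like a speaker's intelligibility rating, and each pool would be passed to a fine-tuning step before moving on; both of those choices are assumptions here rather than details taken from the paper.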

Authors

I-Ting Hsieh, Chung-Hsien Wu

Topics

Machine Learning > Learning Types > Continual Learning

Speech & Audio > Recognition > Automatic Speech Recognition

Speech & Audio > Analysis > Clinical Speech Analysis

Keywords

domain adaptation, curriculum learning, automatic speech recognition, dysarthric speech recognition, articulatory feature embedding
