Child Speech Disorder Detection with Siamese Recurrent Network Using Speech Attribute Features

Jiarui Wang; Ying Qin; Zhiyuan Peng; Tan Lee

INTERSPEECH 2019

Abstract

Acoustics-based automatic assessment is a highly desirable approach to detecting speech sound disorder (SSD) in children. The performance of an automatic speech assessment system depends greatly on the availability of a sufficient amount of properly annotated disordered speech, and obtaining such data is a critical problem, particularly for child speech. This paper presents a novel design for a child speech disorder detection system that requires only normal speech for model training. The system is based on a Siamese recurrent network, which is trained to learn the similarity and discrepancy of pronunciations between a pair of phones in the embedding space. To detect speech sound disorder, the trained network measures a distance that contrasts the test phone with the desired phone, and this distance is used to train a binary classifier. Speech attribute features are incorporated to measure pronunciation quality and provide diagnostic feedback. Experimental results show that a Siamese recurrent network with a combination of speech attribute features and phone posterior features attains the best detection accuracy of 0.941.
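The pair-distance idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the tiny Elman-style encoder with fixed random weights, the Euclidean distance, the contrastive loss with a margin, and the fixed threshold that stands in for the trained binary classifier. All function names are hypothetical. Each phone segment is taken to be a sequence of per-frame feature vectors (for example, speech attribute or phone posterior features).

```python
import math
import random


def make_encoder(input_dim, hidden_dim, seed=0):
    """Build a toy recurrent encoder (hypothetical; weights are random, not trained)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)]
            for _ in range(hidden_dim)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
             for _ in range(hidden_dim)]

    def encode(frames):
        # Run the recurrence over all frames of one phone segment.
        h = [0.0] * hidden_dim
        for x in frames:
            h = [
                math.tanh(
                    sum(w_in[i][j] * x[j] for j in range(input_dim))
                    + sum(w_rec[i][k] * h[k] for k in range(hidden_dim))
                )
                for i in range(hidden_dim)
            ]
        return h  # final hidden state is used as the phone embedding

    return encode


def embedding_distance(encode, frames_a, frames_b):
    """Euclidean distance between two segments passed through the SAME encoder.

    Sharing one encoder for both inputs is what makes the setup Siamese.
    """
    emb_a, emb_b = encode(frames_a), encode(frames_b)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(emb_a, emb_b)))


def contrastive_loss(distance, same_phone, margin=1.0):
    """Assumed training objective: pull same-phone pairs together,
    push different-phone pairs at least `margin` apart."""
    if same_phone:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


def is_disordered(distance, threshold):
    """Stand-in for the binary classifier: flag a test phone whose
    distance to the desired phone exceeds a chosen threshold."""
    return distance > threshold


# Usage with made-up 3-dimensional frame features.
encode = make_encoder(input_dim=3, hidden_dim=4)
reference_phone = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]]
test_phone = [[0.1, 0.1, 0.8], [0.0, 0.2, 0.8]]
d = embedding_distance(encode, test_phone, reference_phone)
print("distance:", d, "flagged:", is_disordered(d, threshold=0.5))
```

In this sketch only normal-speech pairs would be needed to compute the loss, which mirrors the abstract's point that training requires no disordered speech. The threshold value is arbitrary and would have to be replaced by a classifier fitted on measured distances.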

Topics

Machine Learning > Core Methods > Metric Learning
Deep Learning > Architectures > Neural Networks
Speech & Audio > Analysis > Clinical Speech Analysis
Healthcare & Medicine > Clinical > Medical AI

Keywords

metric learning, speaker verification, recurrent neural network, siamese network, child speech, speech disorder detection, siamese recurrent network, speech attribute feature, pronunciation quality

Related papers

Using Real-Time Visual Biofeedback for Second Language Instruction 2019

VAE-Based Regularization for Deep Speaker Embedding 2019

End-to-End SpeakerBeam for Single Channel Target Speech Recognition 2019

Attention-Enhanced Connectionist Temporal Classification for Discrete Speech Emotion Recognition 2019

Attentive to Individual: A Multimodal Emotion Recognition Network with Personalized Attention Profile 2019