Deep Autoencoder Based Speech Features for Improved Dysarthric Speech Recognition

Bhavik Vachhani; Chitralekha Bhat; Biswajit Das; Sunil Kumar Kopparapu

2017 INTERSPEECH INTERSPEECH 2017

Deep Autoencoder Based Speech Features for Improved Dysarthric Speech Recognition

Abstract

Dysarthria is a motor speech disorder, resulting in mumbled, slurred or slow speech that is generally difficult to understand by both humans and machines. Traditional Automatic Speech Recognizers (ASR) perform poorly on dysarthric speech recognition tasks. In this paper, we propose the use of deep autoencoders to enhance the Mel Frequency Cepstral Coefficients (MFCC) based features in order to improve dysarthric speech recognition. Speech from healthy control speakers is used to train an autoencoder which is in turn used to obtain improved feature representation for dysarthric speech. Additionally, we analyze the use of severity based tempo adaptation followed by autoencoder based speech feature enhancement. All evaluations were carried out on Universal Access dysarthric speech corpus. An overall absolute improvement of 16% was achieved using tempo adaptation followed by autoencoder based speech front end representation for DNN-HMM based dysarthric speech recognition.

🌉 Interdisciplinary Bridge — Deep Learning and Speech & Audio

🧭 Keyword Pioneer — speech feature enhancement

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Bhavik Vachhani , Chitralekha Bhat , Biswajit Das , Sunil Kumar Kopparapu

Topics

Deep Learning > Architectures > Autoencoders Speech & Audio > Recognition > Automatic Speech Recognition Speech & Audio > Recognition > Speech Recognition

Keywords

feature representation feature enhancement mel frequency cepstral coefficient dysarthric speech recognition speech feature speech feature enhancement

Download PDF

Related papers

Description of the Munich-Passau Snore Sound Corpus (MPSSC) 2017

A Study on Replay Attack and Anti-Spoofing for Automatic Speaker Verification 2017

Binaural Reverberant Speech Separation Based on Deep Neural Networks 2017

Building Audio-Visual Phonetically Annotated Arabic Corpus for Expressive Text to Speech 2017

A Comparison of Danish Listeners’ Processing Cost in Judging the Truth Value of Norwegian, Swedish, and English Sentences 2017