Dysarthric Speech Recognition Using Time-delay Neural Network Based Denoising Autoencoder

Chitralekha Bhat; Biswajit Das; Bhavik Vachhani; Sunil Kumar Kopparapu

INTERSPEECH 2018

Abstract

Dysarthria is a manifestation of a disruption in neuromuscular physiology that results in uneven, slow, slurred, harsh, or quiet speech. Dysarthric speech poses serious challenges to automatic speech recognition, since it is difficult for both humans and machines to decipher. The objective of this work is to enhance dysarthric speech features so that they match those of healthy control speech. We use a Time-Delay Neural Network based Denoising Autoencoder (TDNN-DAE) to enhance the dysarthric speech features. The enhanced dysarthric speech is then recognized using a DNN-HMM based Automatic Speech Recognition (ASR) engine. This methodology was evaluated for speaker-independent (SI) and speaker-adapted (SA) systems. Absolute improvements of 13% and 3% were observed in ASR performance for the SI and SA systems, respectively, compared with recognition of unenhanced dysarthric speech.
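To make the feature-enhancement idea concrete, the following is a minimal sketch of a TDNN-style denoising autoencoder forward pass in plain Python. It is not the authors' implementation: the context offsets, layer sizes, random initialization, edge clamping, and the mean-squared-error objective against time-aligned healthy control features are all assumptions chosen for illustration, and the names `splice`, `build_tdnn_dae`, and `tdnn_dae_forward` are hypothetical. The abstract states only that dysarthric features are enhanced to match healthy control features.

```python
import random

def splice(frames, offsets):
    """Concatenate each frame with its neighbours at the given time offsets.
    Indices past either end are clamped to the first/last frame (an assumption)."""
    T = len(frames)
    spliced = []
    for t in range(T):
        ctx = []
        for o in offsets:
            idx = min(max(t + o, 0), T - 1)
            ctx.extend(frames[idx])
        spliced.append(ctx)
    return spliced

def affine(x, W, b):
    """Dense layer applied to one spliced frame: W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(x):
    return [max(0.0, v) for v in x]

def build_tdnn_dae(feat_dim, hidden, rng):
    """Three layers with illustrative (assumed) temporal contexts.
    The last layer maps back to feat_dim so output matches the input shape."""
    specs = [([-2, -1, 0, 1, 2], hidden), ([-1, 0, 1], hidden), ([0], feat_dim)]
    layers, n_in = [], feat_dim
    for offsets, n_out in specs:
        fan_in = n_in * len(offsets)
        scale = 1.0 / fan_in ** 0.5
        W = [[rng.uniform(-scale, scale) for _ in range(fan_in)] for _ in range(n_out)]
        layers.append((offsets, W, [0.0] * n_out))
        n_in = n_out
    return layers

def tdnn_dae_forward(frames, layers):
    """Map dysarthric feature frames to enhanced frames of the same shape."""
    h = frames
    for i, (offsets, W, b) in enumerate(layers):
        h = [affine(x, W, b) for x in splice(h, offsets)]
        if i < len(layers) - 1:  # linear output layer for feature regression
            h = [relu(x) for x in h]
    return h

def mse(pred, target):
    """Assumed training objective: mean squared error against
    time-aligned healthy control features."""
    n = sum(len(f) for f in pred)
    return sum((p - t) ** 2 for fp, ft in zip(pred, target) for p, t in zip(fp, ft)) / n

if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in data: 10 frames of 4-dimensional features (not real speech).
    dysarthric = [[rng.random() for _ in range(4)] for _ in range(10)]
    healthy = [[rng.random() for _ in range(4)] for _ in range(10)]
    model = build_tdnn_dae(feat_dim=4, hidden=8, rng=rng)
    enhanced = tdnn_dae_forward(dysarthric, model)
    print(len(enhanced), len(enhanced[0]), mse(enhanced, healthy))
```

In a trained system the weights would be learned by minimizing the loss over paired utterances, and the enhanced frames would replace the raw dysarthric features as input to the DNN-HMM recognizer. Stacking layers with different offsets is what gives a time-delay network its widening temporal context.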

Topics

Speech & Audio > Recognition > Speech Recognition
Speech & Audio > Analysis > Clinical Speech Analysis

Keywords

automatic speech recognition, denoising autoencoder, feature enhancement, dysarthric speech recognition, time-delay neural network
