INTERSPEECH 2020

Raw Speech Waveform Based Classification of Patients with ALS, Parkinson’s Disease and Healthy Controls Using CNN-BLSTM

Abstract

Automated analysis of speech waveforms in patients with Amyotrophic Lateral Sclerosis (ALS) and Parkinson's disease (PD) can be used for early diagnosis and for monitoring disease progression. Many previous works have used different acoustic features to classify patients with ALS and PD against healthy controls (HC). In this work, we propose a data-driven approach that learns representations from the raw speech waveform. Our model comprises a 1-D CNN layer that extracts representations from raw speech, followed by BLSTM layers for the classification tasks. We consider three classification tasks: ALS vs. HC, PD vs. HC, and ALS vs. PD. We perform each classification task using four different speech stimuli in two scenarios: i) trained and tested in a stimulus-specific manner, and ii) trained on data pooled from all stimuli and tested on each stimulus separately. Experiments with 60 ALS, 60 PD, and 60 HC subjects show that the frequency responses of the learned 1-D CNN filters are low-pass in nature, with center frequencies below 1 kHz. The representations learned from raw speech perform better than MFCCs, which serve as the baseline. Pooled models yield better results than the task-specific models.
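
The filter analysis mentioned above can be illustrated with a minimal sketch, under stated assumptions. It computes the magnitude response of a 1-D filter kernel and reports the frequency at which that response peaks. The 16 kHz sampling rate, the 64-tap kernel length, the Hann-shaped stand-in kernel, and the "peak of the magnitude response" definition of center frequency are all assumptions made for illustration; the abstract does not specify them, and the function names are hypothetical.

```python
import cmath
import math


def magnitude_response(kernel, n_bins=512):
    """Magnitude of the kernel's frequency response at n_bins points in [0, pi)."""
    response = []
    for k in range(n_bins):
        omega = math.pi * k / n_bins
        value = sum(w * cmath.exp(-1j * omega * n) for n, w in enumerate(kernel))
        response.append(abs(value))
    return response


def center_frequency_hz(kernel, sample_rate_hz, n_bins=512):
    """Frequency in Hz where the magnitude response peaks (assumed definition)."""
    mags = magnitude_response(kernel, n_bins)
    peak_bin = max(range(n_bins), key=lambda k: mags[k])
    return peak_bin * (sample_rate_hz / 2) / n_bins


SAMPLE_RATE_HZ = 16000  # assumption: not stated in the abstract
KERNEL_LEN = 64         # assumption: hypothetical filter length

# Stand-in for a learned filter: a Hann-shaped smoothing kernel (low-pass).
hann = [0.5 - 0.5 * math.cos(2 * math.pi * (n + 0.5) / KERNEL_LEN)
        for n in range(KERNEL_LEN)]

# Contrast case: the same window modulated to 2 kHz (band-pass).
band_pass = [h * math.cos(2 * math.pi * 2000 / SAMPLE_RATE_HZ * n)
             for n, h in enumerate(hann)]

low_pass_fc = center_frequency_hz(hann, SAMPLE_RATE_HZ)
band_pass_fc = center_frequency_hz(band_pass, SAMPLE_RATE_HZ)

print(f"stand-in low-pass kernel: center frequency {low_pass_fc:.1f} Hz")
print(f"contrast band-pass kernel: center frequency {band_pass_fc:.1f} Hz")
```

Applied to the weights of a trained first-layer filter in place of the stand-in kernels, the same check would indicate whether a filter's response peaks below 1 kHz.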
