INTERSPEECH 2017

Phoneme-Discriminative Features for Dysarthric Speech Conversion

Abstract

This paper presents a Voice Conversion (VC) method for a person with dysarthria resulting from athetoid cerebral palsy. VC is widely researched in the field of speech processing, driven by growing interest in applications such as personalized Text-To-Speech systems. Gaussian Mixture Model (GMM)-based VC has been widely studied, and Partial Least Squares (PLS)-based VC has been proposed to prevent the over-fitting problems associated with the GMM-based method. In this paper, we present phoneme-discriminative features for use with PLS-based VC. Conventional VC methods do not consider the phonetic structure of spectral features, although phonetic structure is important for speech analysis. The phonetic structure of dysarthric speech in particular is difficult to discriminate, so discriminative learning is expected to improve conversion accuracy. We employ discriminative manifold learning: spectral features are projected into a subspace in which neighboring points with the same phoneme label are drawn close together and neighboring points with different phoneme labels are pushed apart. The proposed method was evaluated on a dysarthric speaker conversion task, which converts a dysarthric voice into non-dysarthric speech.
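The projection idea described in the abstract can be illustrated with a toy sketch. The abstract does not give the exact objective, so the code below assumes a simple one: among k-nearest-neighbor pairs, maximize the projected spread of pairs with different phoneme labels minus the projected spread of pairs with the same label. The function names, the 2-D toy "spectral" points, the labels, and the brute-force angle search are all hypothetical and chosen for readability; this is not the authors' implementation and it omits the PLS conversion stage entirely.

```python
import math

def sq_dist(p, q):
    # Squared Euclidean distance between two feature vectors.
    return sum((a - b) ** 2 for a, b in zip(p, q))

def neighbor_pairs(points, labels, k):
    # Split k-nearest-neighbor pairs into same-label and different-label sets.
    within, between = set(), set()
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: sq_dist(p, points[j]))
        for j in others[:k]:
            pair = (min(i, j), max(i, j))
            if labels[i] == labels[j]:
                within.add(pair)
            else:
                between.add(pair)
    return within, between

def objective(direction, points, within, between):
    # Assumed objective: spread of different-label neighbors
    # minus spread of same-label neighbors after 1-D projection.
    proj = [sum(d * x for d, x in zip(direction, p)) for p in points]

    def spread(pairs):
        return sum((proj[i] - proj[j]) ** 2 for i, j in pairs)

    return spread(between) - spread(within)

def best_direction(points, labels, k=3, steps=180):
    # Brute-force search over unit directions in 2-D (toy setting only).
    within, between = neighbor_pairs(points, labels, k)
    candidates = [
        (math.cos(math.pi * s / steps), math.sin(math.pi * s / steps))
        for s in range(steps)
    ]
    return max(candidates, key=lambda d: objective(d, points, within, between))

# Hypothetical toy data: two "phonemes" that differ only in the first feature.
points = [(0.0, float(y)) for y in range(4)] + [(1.0, float(y)) for y in range(4)]
labels = ["a"] * 4 + ["i"] * 4
direction = best_direction(points, labels)
print(direction)  # the first-feature axis, (1.0, 0.0), for this toy data
```

In this toy case the selected direction keeps the feature that separates the two labels and discards the one that only spreads points within a label. A practical system would likely solve for a multi-dimensional projection in closed form (for example via an eigenvalue problem) rather than scanning angles.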
