Single-Ended Prediction of Listening Effort Based on Automatic Speech Recognition

Rainer Huber; Constantin Spille; Bernd T. Meyer

INTERSPEECH 2017

Abstract

A new single-ended (i.e., reference-free) measure for predicting the perceived listening effort of noisy speech is presented. It is based on phoneme posterior probabilities (posteriorgrams) obtained from the deep neural network of an automatic speech recognition system. Additive noise and other distortions of speech tend to smear the posteriorgrams. The smearing is quantified by a performance measure, which serves as a predictor of the perceived listening effort required to understand the noisy speech. The proposed measure was evaluated on a database obtained from the subjective evaluation of noise reduction algorithms in commercial hearing aids. Listening effort ratings of processed noisy speech samples were gathered from 20 hearing-impaired subjects. Averaged subjective ratings were compared with the corresponding predictions computed by the proposed method, the ITU-T standard P.563 for single-ended speech quality assessment, the American National Standard ANIQUE+ for single-ended speech quality assessment, and a single-ended SNR estimator. The proposed method achieved a good correlation with the mean subjective ratings and clearly outperformed the standard speech quality measures and the SNR estimator.
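The abstract does not name the performance measure used to quantify the smearing. As a rough illustration only, the sketch below assumes one plausible choice: the mean symmetric Kullback-Leibler divergence between posterior vectors a few frames apart. Sharp posteriorgrams change strongly from one phoneme to the next and score high, while smeared ones score low. The function names, the `max_lag` parameter, the `EPS` floor, and the toy posteriorgrams are all hypothetical and are not taken from the paper.

```python
import math

EPS = 1e-10  # assumed floor to avoid log(0); not a value from the paper


def symmetric_kl(p, q):
    """Symmetric Kullback-Leibler divergence between two probability vectors."""
    total = 0.0
    for pi, qi in zip(p, q):
        pi = max(pi, EPS)
        qi = max(qi, EPS)
        # KL(p||q) + KL(q||p) simplifies to sum((p - q) * log(p / q))
        total += (pi - qi) * math.log(pi / qi)
    return total


def mean_temporal_distance(posteriorgram, max_lag):
    """Average divergence between all frame pairs separated by 1..max_lag frames."""
    n_frames = len(posteriorgram)
    distances = []
    for lag in range(1, max_lag + 1):
        for t in range(lag, n_frames):
            distances.append(symmetric_kl(posteriorgram[t - lag], posteriorgram[t]))
    return sum(distances) / len(distances) if distances else 0.0


# Toy posteriorgrams: 4 frames x 3 phoneme classes (made-up numbers).
clean = [[0.98, 0.01, 0.01], [0.98, 0.01, 0.01],
         [0.01, 0.98, 0.01], [0.01, 0.01, 0.98]]
smeared = [[0.4, 0.3, 0.3], [0.4, 0.3, 0.3],
           [0.3, 0.4, 0.3], [0.3, 0.3, 0.4]]

clean_score = mean_temporal_distance(clean, max_lag=2)
smeared_score = mean_temporal_distance(smeared, max_lag=2)
print(clean_score > smeared_score)
```

In this sketch a lower score indicates more smearing, so under the stated assumption it would be expected to go along with higher perceived listening effort. In practice the posteriorgram would come from the ASR system's neural network rather than from hand-written lists.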

Topics

Speech & Audio > Recognition > Automatic Speech Recognition

Speech & Audio > Analysis > Clinical Speech Analysis

Speech & Audio > Analysis > Speech Analysis

Keywords

automatic speech recognition; posterior probability; speech quality assessment; hearing impairment; listening effort prediction; phoneme posteriorgram; single-ended prediction; listening effort

Related papers

Description of the Munich-Passau Snore Sound Corpus (MPSSC) 2017

A Study on Replay Attack and Anti-Spoofing for Automatic Speaker Verification 2017

Binaural Reverberant Speech Separation Based on Deep Neural Networks 2017

Building Audio-Visual Phonetically Annotated Arabic Corpus for Expressive Text to Speech 2017

A Comparison of Danish Listeners’ Processing Cost in Judging the Truth Value of Norwegian, Swedish, and English Sentences 2017