An Auditory Model of Speaker Size Perception for Voiced Speech Sounds

Toshio Irino; Eri Takimoto; Toshie Matsui; Roy D. Patterson

INTERSPEECH 2017

Abstract

An auditory model was developed to explain the results of behavioral experiments on perception of speaker size with voiced speech sounds. It is based on the dynamic, compressive gammachirp (dcGC) filterbank and a weighting function (SSI weight) derived from a theory of size-shape segregation in the auditory system. Voiced words with and without high-frequency emphasis (+6 dB/octave) were produced using a speech vocoder (STRAIGHT). The SSI weighting function reduces the effect of glottal pulse excitation in voiced speech, which, in turn, makes it possible for the model to explain the individual subject variability in the data.
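
As a rough illustration of what a +6 dB/octave high-frequency emphasis means, the sketch below computes the gain of a constant-slope spectral tilt at a given frequency. It is not the STRAIGHT processing used in the study: the function names and the 1 kHz reference frequency are assumptions chosen for the example, and the abstract does not state where the tilt is anchored.

```python
import math


def emphasis_gain_db(freq_hz, ref_hz=1000.0, slope_db_per_octave=6.0):
    """Gain in dB of a constant-slope spectral tilt at freq_hz.

    ref_hz is the frequency at which the gain is 0 dB (an assumed value,
    not taken from the paper).
    """
    if freq_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    # One octave is a doubling of frequency, so the number of octaves
    # between ref_hz and freq_hz is log2(freq_hz / ref_hz).
    return slope_db_per_octave * math.log2(freq_hz / ref_hz)


def emphasis_gain_linear(freq_hz, ref_hz=1000.0, slope_db_per_octave=6.0):
    """The same tilt expressed as an amplitude ratio."""
    return 10.0 ** (emphasis_gain_db(freq_hz, ref_hz, slope_db_per_octave) / 20.0)


if __name__ == "__main__":
    for f in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
        print(f"{f:7.0f} Hz: {emphasis_gain_db(f):+6.1f} dB")
```

With these assumed defaults, each doubling of frequency adds 6 dB, which is close to a doubling of amplitude, so the emphasis raises the level of the upper formant region relative to the lower one.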

Topics

Interdisciplinary > Linguistics > Phonetics

Keywords

speech vocoder, auditory model, speaker perception, voiced speech, acoustic filter
