INTERSPEECH 2017
An Auditory Model of Speaker Size Perception for Voiced Speech Sounds
Abstract
An auditory model was developed to explain the results of behavioral experiments on perception of speaker size with voiced speech sounds. It is based on the dynamic, compressive gammachirp (dcGC) filterbank and a weighting function (SSI weight) derived from a theory of size-shape segregation in the auditory system. Voiced words with and without high-frequency emphasis (+6 dB/octave) were produced using a speech vocoder (STRAIGHT). The SSI weighting function reduces the effect of glottal pulse excitation in voiced speech, which, in turn, makes it possible for the model to explain the individual subject variability in the data.
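As a rough illustration of what a +6 dB/octave high-frequency emphasis means numerically, the sketch below computes the gain of a constant dB/octave spectral tilt at a few frequencies. The 1 kHz reference frequency and the function names are assumptions made for this example only. The abstract does not state a reference point, and this is not the STRAIGHT processing used in the study.

```python
import math


def emphasis_gain_db(freq_hz, ref_hz=1000.0, slope_db_per_octave=6.0):
    """Gain in dB of a constant dB/octave tilt at freq_hz, relative to ref_hz.

    ref_hz is a hypothetical reference point (0 dB gain); it is an
    assumption of this sketch and is not taken from the paper.
    """
    if freq_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    octaves = math.log2(freq_hz / ref_hz)  # distance from the reference in octaves
    return slope_db_per_octave * octaves


def db_to_amplitude(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


if __name__ == "__main__":
    for f in (250.0, 500.0, 1000.0, 2000.0, 4000.0):
        g = emphasis_gain_db(f)
        print(f"{f:7.0f} Hz: {g:+6.1f} dB, amplitude x{db_to_amplitude(g):.3f}")
```

With this tilt, each doubling of frequency adds 6 dB, which roughly doubles the amplitude of that spectral component relative to the one an octave below.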