Extracting Speaker’s Gender, Accent, Age and Emotional State from Speech

Nagendra Goel; Mousmita Sarma; Tejendra Kushwah; Dharmesh Agarwal; Zikra Iqbal; Surbhi Chauhan

INTERSPEECH 2018

Abstract

We demonstrate a speaker characteristics assessment solution that extracts speaker information such as gender, age, emotion, language, and accent from telephone-quality speech. The solution is built on machine learning algorithms ranging from Gaussian mixture models to deep neural networks, and it uses WebSocket technology as a real-time bidirectional interface that provides live updates in a scalable manner. The service is used on our demonstration web page, where a user can upload or record an audio file and obtain the speaker’s characteristics. Such speaker characteristics can serve as metadata in many real-life applications designed for emotionally sensitive human-to-machine and human-to-human interaction.
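As a rough illustration of the idea in the abstract (generative scoring of speech features plus live updates as audio arrives), the following minimal sketch fits one diagonal-covariance Gaussian per class, which is the single-component special case of a Gaussian mixture model, and emits a JSON message after each incoming chunk. It is not the authors' system: the toy 2-D feature vectors, the labels `class_a` and `class_b`, the function names, and the message fields `frames_seen` and `label` are all hypothetical, and a real service would send each message over a WebSocket connection rather than collect it in a list.

```python
import json
import math


def fit_diag_gaussian(frames):
    """Estimate per-dimension mean and variance from a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    variances = [
        # A small floor keeps the variance strictly positive.
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances


def log_likelihood(frame, model):
    """Log-density of one frame under a diagonal-covariance Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, means, variances)
    )


def stream_updates(chunks, models):
    """Yield one JSON message per chunk with the running best-scoring label."""
    totals = {label: 0.0 for label in models}
    seen = 0
    for chunk in chunks:
        for frame in chunk:
            for label, model in models.items():
                totals[label] += log_likelihood(frame, model)
            seen += 1
        best = max(totals, key=totals.get)
        # In a live service this message would be pushed to the client.
        yield json.dumps({"frames_seen": seen, "label": best})


# Toy training data (invented for illustration, not real speech features).
train = {
    "class_a": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]],
    "class_b": [[4.0, 5.0], [4.2, 4.8], [3.8, 5.2]],
}
models = {label: fit_diag_gaussian(frames) for label, frames in train.items()}

# Two incoming chunks of frames, all lying close to class_a.
chunks = [[[1.1, 2.1]], [[0.9, 1.9], [1.0, 2.0]]]
messages = list(stream_updates(chunks, models))
```

Accumulating log-likelihoods across chunks lets the running decision be revised as more audio arrives, which is the kind of incremental result a bidirectional connection can deliver to a client.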

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Machine Learning > Core Methods > Classification
Speech & Audio > Analysis > Speaker Verification
Machine Learning > Learning Types > Representation Learning
Speech & Audio > Analysis > Speech Analysis

Keywords

emotion recognition, speech analysis, speaker recognition, Gaussian mixture model, deep neural network, accent recognition, speaker characteristics
