INTERSPEECH 2018
Extracting Speaker’s Gender, Accent, Age and Emotional State from Speech
Abstract
We demonstrate a speaker characteristics assessment solution that extracts speaker information such as gender, age, emotion, language, and accent from telephone-quality speech. The solution is built with machine learning algorithms ranging from Gaussian mixture models to deep neural networks, and it uses WebSocket technology for a real-time bidirectional interface that provides live updates in a scalable manner. The service is available on our demonstration web page, where a user can upload or record an audio file and obtain the speaker’s characteristics. Such speaker characteristics can be used as metadata in many real-life applications designed for emotionally sensitive human-to-machine and human-to-human interaction.
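As an illustration of the simplest end of the model range mentioned above, the following is a minimal sketch of classification with one Gaussian mixture model per class: each class is scored by its average per-frame log-likelihood, and the highest-scoring class is chosen. It is not the system described in the abstract. The function names, the two-dimensional features, the labels `class_a` and `class_b`, and all mixture parameters are hypothetical values chosen only for the example.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    gmm is a list of (weight, mean, variance) components.
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(comps)
        # log-sum-exp over components, shifted by the peak for stability
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)

def classify(frames, models):
    """Return the best-scoring label and the per-label scores."""
    scores = {label: gmm_log_likelihood(frames, gmm) for label, gmm in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy models over 2-D features (illustrative values only).
models = {
    "class_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "class_b": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}

label, scores = classify([[0.2, 0.1], [0.9, 1.1]], models)
print(label)
```

In practice the frames would be acoustic feature vectors and the mixture parameters would be estimated from labeled training speech; both are outside the scope of this sketch.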