Directional Audio Rendering Using a Neural Network Based Personalized HRTF

Geon Woo Lee; Jung Hyuk Lee; Seong Ju Kim; Hong Kook Kim

INTERSPEECH 2019

Abstract

Multi-channel speech/audio separation and enhancement methods are widely used in speech and audio applications. However, these methods may discard spatial cues, including the interaural time difference (ITD) and the interaural level difference (ILD), when their output is further processed as a monaural signal. As a result, listeners may have difficulty perceiving the direction of the source signal. We present a directional audio renderer that uses a personalized head-related transfer function (HRTF), estimated by a neural network that combines a deep neural network (DNN) and a convolutional neural network (CNN) and takes the listener's anthropometric parameters and ear images as input. This demonstration of the directional audio renderer concept aims to foster research on audio processing for virtual reality and augmented reality, with the goal of improving the quality of service of such devices.
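The rendering step the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the personalized HRTF is already available as a pair of time-domain head-related impulse responses (HRIRs), one per ear, and it substitutes a hypothetical toy pair (`toy_hrir_pair`, a pure delay and gain) for the neural-network estimate. The function names, sample rate, and ITD/ILD values are all illustrative assumptions.

```python
import numpy as np


def render_binaural(mono, hrir_left, hrir_right):
    """Convolve a monaural signal with a left/right HRIR pair.

    Returns an array of shape (len(mono) + len(hrir) - 1, 2).
    Both HRIRs are assumed to have the same length.
    """
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=1)


def toy_hrir_pair(itd_samples, ild_db, length=64):
    """Hypothetical stand-in for an estimated HRIR pair.

    Models only the two cues named in the abstract: the right ear
    receives the signal itd_samples later (ITD) and ild_db quieter (ILD).
    A real HRIR also encodes spectral filtering by the head and pinna.
    """
    left = np.zeros(length)
    right = np.zeros(length)
    left[0] = 1.0
    right[itd_samples] = 10 ** (-ild_db / 20)
    return left, right


# Assumed test signal: 100 ms of a 440 Hz tone at 16 kHz.
fs = 16000
t = np.arange(fs // 10) / fs
mono = np.sin(2 * np.pi * 440 * t)

# Assumed cue values for a source on the listener's left side.
hrir_l, hrir_r = toy_hrir_pair(itd_samples=8, ild_db=6.0)
stereo = render_binaural(mono, hrir_l, hrir_r)
```

In this sketch the left channel reproduces the input unchanged, while the right channel is the same signal delayed by 8 samples and attenuated by 6 dB. Replacing the toy pair with measured or estimated HRIRs for a given direction leaves `render_binaural` unchanged.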

Topics

Speech & Audio > Synthesis > Speech Enhancement

Keywords

convolutional neural network, deep neural network, head-related transfer function, personalized HRTF, anthropometric parameter
