Face Reconstruction from Voice using Generative Adversarial Networks

Yandong Wen; Bhiksha Raj; Rita Singh

NeurIPS 2019

Abstract

Voice profiling aims to infer personal attributes of a speaker, such as gender and age, from their speech. In this paper, we address a challenging subtask of voice profiling: reconstructing a person's face from their voice. The task is designed to answer the question: given an audio clip spoken by an unseen person, can we picture a face that shares as many identity-related elements, or associations, with the speaker as possible? To address this problem, we propose a simple but effective computational framework based on generative adversarial networks (GANs). The network learns to generate faces from voices by matching the identities of the generated faces to those of the speakers on a training set. We evaluate the network on a closely related task, cross-modal matching. The results show that our model generates faces that match several biometric characteristics of the speaker and yields matching accuracies that are much better than chance. The code is publicly available at https://github.com/cmu-mlsp/reconstructingfacesfrom_voices
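To make the evaluation idea concrete, here is a minimal sketch of a 1:2 cross-modal matching score. It is not the authors' code. It assumes that each generated face and each candidate face has already been mapped to an embedding vector, and that cosine similarity is the comparison rule. The function names (`cosine`, `matching_accuracy`, `make_toy_trials`) and the synthetic data are hypothetical and only illustrate why chance level is 50% when one true face competes with one distractor.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def matching_accuracy(trials):
    """Fraction of trials where the generated face is closer to the true face.

    Each trial is (generated_emb, true_face_emb, distractor_face_emb).
    A random guesser would score about 0.5 on this 1:2 task.
    """
    correct = 0
    for generated, true_face, distractor in trials:
        if cosine(generated, true_face) > cosine(generated, distractor):
            correct += 1
    return correct / len(trials)


def make_toy_trials(n, dim=16, noise=0.5, seed=0):
    """Synthetic stand-in data (assumption): generated = true face + noise."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n):
        true_face = [rng.gauss(0, 1) for _ in range(dim)]
        distractor = [rng.gauss(0, 1) for _ in range(dim)]
        generated = [t + rng.gauss(0, noise) for t in true_face]
        trials.append((generated, true_face, distractor))
    return trials


trials = make_toy_trials(200)
accuracy = matching_accuracy(trials)
print(f"1:2 matching accuracy: {accuracy:.2f} (chance = 0.50)")
```

In a real evaluation the embeddings would come from a face encoder applied to the generated and candidate images. The toy noise model above only stands in for that step.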

Topics

Artificial Intelligence > Core AI > Multimodal Learning; Machine Learning > Learning Types > Adversarial Learning; Deep Learning > Models > Generative Models; Computer Vision > Generation > Image Generation; Deep Learning > Learning Types > Generative Models

Keywords

generative adversarial network; face reconstruction; cross-modal matching; face generation; voice profiling; biometric characteristic
