2007
NIPS
NeurIPS 2007
A probabilistic model for generating realistic lip movements from speech
Abstract
This work models the correspondence between facial motion and speech. The face and the sound are modelled separately, with phonemes serving as the link between the two. We propose a sequential model and evaluate its suitability for generating facial animation from a sequence of phonemes, which we obtain from speech. We evaluate the results both by computing the error between generated sequences and real video and with a rigorous double-blind test with human subjects. Experiments show that our model compares favourably to other existing methods and that the generated sequences are comparable to real video sequences.
Topics
Artificial Intelligence > Core AI > Multimodal Learning
Computer Vision > Generation > Image Generation
Computer Vision > Generation > Video Generation
Speech & Audio > Synthesis > Speech Enhancement
Interdisciplinary > Social > Affective Computing
Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling
Machine Learning > Learning Types > Multi-Modal Learning
Speech & Audio > Synthesis > Speech Synthesis