2020 WACV WACV 2020

Bridged Variational Autoencoders for Joint Modeling of Images and Attributes

Abstract

Generative models have recently shown the ability to realistically generate data and model the distribution accurately. However, joint modeling of an image with the attribute that it is labeled with requires learning a cross modal correspondence between images and the attribute data. Though the information present in the images and attributes possess completely different statistical properties altogether, there exists an inherent correspondence that is challenging to capture. Various models have aimed at capturing this correspondence either through joint modeling of a variational autoencoder or through separate encoder networks that are then concatenated. We present an alternative by proposing a bridged variational autoencoder that allows for learning cross-modal correspondence by incorporating cross-modal hallucination losses in the latent space. In comparison to the existing methods, we have found that by incorporating this information into the network we not only obtain better generation results, but also obtain very distinctive latent embeddings thereby increasing the accuracy of cross-modal generated results. We validate the proposed method through comparison with state of the art methods and benchmarking on standard datasets.

🚀 Conference Pioneer — WACV 2020
🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio