2019 ICCV ICCV 2019

Hiding Video in Audio via Reversible Generative Models

Abstract

We present a method for hiding video content inside audio files while preserving the perceptual fidelity of the cover audio. This is a form of cross-modal steganography and is particularly challenging due to the high bitrate of video. Our scheme uses recent advances in flow-based generative models, which enable mapping audio to latent codes such that nearby codes correspond to perceptually similar signals. We show that compressed video data can be concealed in the latent codes of audio sequences while preserving the fidelity of both the hidden video and the cover audio. We can embed 128x128 video inside same-duration audio, or higher-resolution video inside longer audio sequences. Quantitative experiments show that our approach outperforms relevant baselines in steganographic capacity and fidelity.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning
🧭 Keyword Pioneer — audio watermark
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Deep Learning, Healthcare & Medicine, Machine Learning, Natural Language Processing, Speech & Audio