AAAI 2019

Realtime Generation of Audible Textures Inspired by a Video Stream

Abstract

We showcase a model that generates a soundscape from a camera stream in real time. The approach relies on a training video with a meaningful associated audio track. A granular synthesizer generates a novel sound by randomly sampling and mixing audio segments from that video, favoring timestamps whose frames are similar to the current camera frame; a pretrained neural network computes the semantic similarity between frames. The demo is interactive: a user points a mobile phone at different objects and hears how the generated sound changes.
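The similarity-weighted sampling described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the similarity measure, the weighting scheme, or the grain envelope, so cosine similarity, a softmax with a `temperature` parameter, and a Hann window are all assumptions made here. Frame embeddings are assumed to have been computed beforehand by some pretrained network, and every function and parameter name is hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def sampling_weights(query, frame_embeddings, temperature=0.1):
    """Softmax over similarities: similar training frames get more weight."""
    sims = [cosine_similarity(query, e) for e in frame_embeddings]
    peak = max(sims)  # subtract the maximum for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def hann(n):
    """Hann window used as the grain envelope (assumed envelope)."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def synthesize(query, frame_embeddings, audio, samples_per_frame,
               grain_len=64, n_grains=32, out_len=512,
               temperature=0.1, rng=None):
    """Mix randomly chosen grains of training audio into one output buffer.

    Assumes grain_len <= samples_per_frame, grain_len <= out_len, and that
    audio holds samples_per_frame samples for every training frame.
    """
    rng = rng or random.Random()
    weights = sampling_weights(query, frame_embeddings, temperature)
    window = hann(grain_len)
    out = [0.0] * out_len
    frames = range(len(frame_embeddings))
    for _ in range(n_grains):
        # Pick a training timestamp, favoring frames similar to the query.
        frame = rng.choices(frames, weights=weights)[0]
        # Pick a grain that lies fully inside that frame's audio segment.
        start = frame * samples_per_frame + rng.randrange(
            samples_per_frame - grain_len + 1)
        # Place the windowed grain at a random position in the output.
        offset = rng.randrange(out_len - grain_len + 1)
        for i in range(grain_len):
            out[offset + i] += window[i] * audio[start + i]
    return out
```

In a live setting, `query` would be the embedding of the current camera frame and `synthesize` would be called once per output buffer. A lower `temperature` concentrates sampling on the most similar training frames, while a higher one blends audio from more of the video.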
