2017 INTERSPEECH INTERSPEECH 2017

Indoor/Outdoor Audio Classification Using Foreground Speech Segmentation

Abstract

The task of indoor/ outdoor audio classification using foreground speech segmentation is attempted in this work. Foreground speech segmentation is the use of features to segment between foreground speech and background interfering sources like noise. Initially, the foreground and background segments are obtained from foreground speech segmentation by using the normalized autocorrelation peak strength (NAPS) of the zero frequency filtered signal (ZFFS) as a feature. The background segments are then considered for determining whether a particular segment is an indoor or outdoor audio sample. The mel frequency cepstral coefficients are obtained from the background segments of both the indoor and outdoor audio samples and are used to train the Support Vector Machine (SVM) classifier. The use of foreground speech segmentation gives a promising performance for the indoor/ outdoor audio classification task.

🧭 Keyword Pioneer — foreground speech segmentation
🐣 Hot Topic Early Bird — audio classification
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio