An Evaluation of Data Augmentation Methods for Sound Scene Geotagging

Helen L. Bear; Veronica Morfi; Emmanouil Benetos

INTERSPEECH 2021

Abstract

Sound scene geotagging is a new research topic that has evolved from acoustic scene classification and is motivated by audio surveillance. Beyond describing the scene in a recording, a machine that can identify where the recording was captured would be of use to many. In this paper, we explore a series of common audio data augmentation methods to evaluate which of them best improves the accuracy of audio geotagging classifiers. Our work improves on the state-of-the-art city geotagging method by 23% in terms of classification accuracy.

Authors

Helen L. Bear, Veronica Morfi, Emmanouil Benetos

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Application Areas > Data Augmentation
Speech & Audio > Analysis > Speech Analysis

Keywords

data augmentation, audio classification, acoustic scene classification, acoustic scene, sound scene geotagging, audio surveillance, audio geotagging
