INTERSPEECH 2023

Adapter Incremental Continual Learning of Efficient Audio Spectrogram Transformers

Abstract

Efficient tuning of neural networks for continual learning with minimal computational resources remains a challenge. In this paper, we propose continual learning of audio classifiers with parameter- and compute-efficient Audio Spectrogram Transformers (AST). To reduce the number of trainable parameters without performance degradation, we propose AST with a Convolutional Adapter, which has fewer than 5% of the trainable parameters of full fine-tuning. To reduce the computational complexity of self-attention, we introduce a novel Frequency-Time factorized Attention (FTA) method that achieves competitive performance with only a fraction of the computations. Finally, we formulate our method, called Adapter Incremental Continual Learning (AI-CL), as a combination of the parameter-efficient Convolutional Adapter and the compute-efficient FTA. Experiments on the ESC-50, SpeechCommandsV2, and Audio-Visual Event benchmarks show that our proposed method efficiently learns new tasks and prevents catastrophic forgetting.
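
To give a rough sense of why factorizing attention along the frequency and time axes can save computation, the sketch below counts query-key score computations for a grid of spectrogram patches. It is an illustration under stated assumptions, not the method as specified in the paper: the abstract does not give the exact form of FTA, so the code assumes each token attends once over the tokens sharing its time index and once over the tokens sharing its frequency index. The function names and grid sizes are hypothetical.

```python
def full_attention_scores(num_freq: int, num_time: int) -> int:
    """Query-key scores when every patch token attends to every other token."""
    num_tokens = num_freq * num_time
    return num_tokens * num_tokens


def factorized_attention_scores(num_freq: int, num_time: int) -> int:
    """Query-key scores under an ASSUMED axis-wise factorization.

    Assumption (not taken from the abstract): each token runs one attention
    pass over the num_freq tokens that share its time index and one pass over
    the num_time tokens that share its frequency index.
    """
    num_tokens = num_freq * num_time
    return num_tokens * (num_freq + num_time)


def cost_ratio(num_freq: int, num_time: int) -> float:
    """Factorized cost divided by full cost; equals (F + T) / (F * T)."""
    return factorized_attention_scores(num_freq, num_time) / full_attention_scores(
        num_freq, num_time
    )


if __name__ == "__main__":
    # Hypothetical grid of 4 frequency patches by 8 time patches.
    print(full_attention_scores(4, 8))
    print(factorized_attention_scores(4, 8))
    print(cost_ratio(4, 8))
```

Under this assumption the ratio of factorized to full cost is (F + T) / (F · T), so the saving grows with the size of the patch grid. For the hypothetical 4 × 8 grid, the counts are 1024 scores for full attention and 384 for the factorized variant.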
