INTERSPEECH 2024

Adapter Learning from Pre-trained Model for Robust Spoof Speech Detection

Abstract

Speech anti-spoofing models can be improved by using a large pre-trained model, e.g., Wav2vec2 or WavLM, as the front-end. However, apart from its heavy computational overhead, fine-tuning a pre-trained model is prone to over-fitting and catastrophic forgetting because training data are limited. In this paper, we propose a novel adapter learning framework based on a pre-trained model for robust spoof speech detection. We consider two adapter cases, i.e., intra-block adapters and cross-block adapters, which are inserted into or appended to the Wav2vec2 backbone. During training, the backbone is frozen and only the adapter parameters are updated. The proposed adapter learning captures local-global task-dependent information for spoof speech detection with only a marginal increase in parameters. Results on three benchmark datasets validate the superiority of the proposed framework over the baseline and existing state-of-the-art (SOTA) systems.
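To make the "marginal increase of parameters" claim concrete, the following is a minimal sketch of a generic residual bottleneck adapter together with a parameter count. It is not the architecture proposed in the paper: the abstract does not specify the adapter design, so the class `BottleneckAdapter`, the helper functions, the hidden size, the bottleneck width, the number of blocks, and the backbone size are all assumptions chosen for illustration only.

```python
# Illustrative sketch only: a generic residual bottleneck adapter and a
# parameter-count comparison. All names and sizes are assumptions, not
# values or designs taken from the paper.
import random


def linear_params(d_in, d_out):
    """Number of weights plus biases in one dense layer."""
    return d_in * d_out + d_out


def adapter_params(hidden, bottleneck):
    """Parameters of a down-projection followed by an up-projection."""
    return linear_params(hidden, bottleneck) + linear_params(bottleneck, hidden)


class BottleneckAdapter:
    """Computes x + up(relu(down(x))) for a single feature vector."""

    def __init__(self, hidden, bottleneck, seed=0):
        rng = random.Random(seed)
        self.down_w = [[rng.gauss(0.0, 0.02) for _ in range(hidden)]
                       for _ in range(bottleneck)]
        self.down_b = [0.0] * bottleneck
        # Zero-initialised up-projection: the adapter starts as an identity
        # map, so the frozen backbone's features are unchanged before training.
        self.up_w = [[0.0] * bottleneck for _ in range(hidden)]
        self.up_b = [0.0] * hidden

    def __call__(self, x):
        h = [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.down_w, self.down_b)]
        delta = [sum(w * v for w, v in zip(row, h)) + b
                 for row, b in zip(self.up_w, self.up_b)]
        return [xi + di for xi, di in zip(x, delta)]


HIDDEN, BOTTLENECK, NUM_BLOCKS = 1024, 64, 24   # assumed sizes
BACKBONE_PARAMS = 300_000_000                   # assumed order of magnitude

# With the backbone frozen, only the adapters (one per block here) are trained.
trainable = NUM_BLOCKS * adapter_params(HIDDEN, BOTTLENECK)
ratio = trainable / BACKBONE_PARAMS
print(f"trainable adapter parameters: {trainable:,} ({ratio:.2%} of backbone)")
```

Under these assumed sizes, the adapters add roughly 1% to the backbone's parameter count, which is the kind of overhead the abstract describes as marginal. The actual figures for the proposed intra-block and cross-block adapters would depend on the design reported in the paper.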
