INTERSPEECH 2022

The CLIPS System for 2022 Spoofing-Aware Speaker Verification Challenge

Abstract

This paper presents a spoofing-aware speaker verification (SASV) system that integrates an automatic speaker verification (ASV) system with a countermeasure (CM) system. First, a modified re-parameterized VGG (ARepVGG) module extracts high-level representations from multi-scale features learned from the raw waveform through sinc filters, and a spectro-temporal graph attention network then produces the final decision on whether the audio is spoofed. Second, a new network inspired by Max-Feature-Map (MFM) layers is constructed to fine-tune the CM system while keeping the ASV system fixed. The proposed SASV system reduces the SASV equal error rate (SASV-EER) from 6.73% to 1.36% on the evaluation set and from 4.85% to 0.98% on the development set of the 2022 Spoofing-Aware Speaker Verification Challenge (2022 SASV).
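
For readers unfamiliar with the MFM operation mentioned above, the following is a minimal sketch of a Max-Feature-Map activation as it is commonly defined: the channels are split into two halves and the element-wise maximum of each pair is kept, which halves the channel count. It is not the fine-tuning network proposed here, whose structure the abstract does not specify; the function name `max_feature_map` and the list-of-lists input layout are assumptions made only for illustration.

```python
from typing import List


def max_feature_map(channels: List[List[float]]) -> List[List[float]]:
    """Illustrative MFM activation (hypothetical helper, not the paper's code).

    `channels` holds an even number of equal-length feature channels.
    Channel i is paired with channel i + C/2, and the element-wise
    maximum of each pair is returned, giving C/2 output channels.
    """
    if len(channels) % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    half = len(channels) // 2
    return [
        [max(a, b) for a, b in zip(channels[i], channels[i + half])]
        for i in range(half)
    ]


# Four input channels of two values each are reduced to two output channels.
features = [[1.0, 5.0], [2.0, 0.0], [3.0, 1.0], [0.0, 4.0]]
print(max_feature_map(features))
```

Because the maximum is taken between competing channels rather than against a fixed threshold, MFM acts as a learned feature selector, which is one reason it is popular in spoofing countermeasure models.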
