Int*-Match: Balancing Intra-Class Compactness and Inter-Class Discrepancy for Semi-Supervised Speaker Recognition

Xingmei Wang; Jinghan Liu; Jiaxiang Meng; Boquan Li; Zijian Liu

AAAI 2025


Abstract

Open-set speaker recognition aims to determine whether voice samples come from the same speaker. One challenge in speaker recognition is collecting large amounts of high-quality data. Given its promising results in image classification, one intuitively feasible solution is semi-supervised learning (SSL), which uses confidence thresholds to assign pseudo labels to unlabeled data. However, we empirically demonstrate that applying SSL methods to speaker recognition is non-trivial. These methods rely solely on inter-class discrepancy as the threshold for selecting pseudo labels and overlook intra-class compactness, which is particularly important for open-set speaker recognition. Motivated by this, we propose Int*-Match, a semi-supervised speaker recognition method that selects reliable pseudo labels using both intra-class compactness and inter-class discrepancy. In particular, we use the inter-class discrepancy of the labeled data as the threshold for pseudo-label selection and adjust this threshold dynamically and adaptively based on the intra-class compactness of the pseudo labels. Our systematic experiments demonstrate the superiority of Int*-Match, which achieves an Equal Error Rate (EER) of 1.00% on the VoxCeleb1 original test set, within 0.06% of the performance achieved by fully supervised learning.
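The abstract gives no formulas, so the following is only a hypothetical sketch of the general idea, not the authors' implementation. It assumes cosine similarity to per-speaker centroids, takes the top-1 minus top-2 similarity margin as a stand-in for inter-class discrepancy, and takes the mean similarity to the assigned centroid as a stand-in for intra-class compactness. The function names, the `step` parameter, and the toy 2-D embeddings are all illustrative assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def mean(xs):
    return sum(xs) / len(xs)

def centroids(labeled):
    """labeled: dict speaker -> list of embeddings. Returns the mean embedding per speaker."""
    return {spk: [sum(col) / len(embs) for col in zip(*embs)]
            for spk, embs in labeled.items()}

def margin_and_label(x, cents):
    """Return (top-1 minus top-2 similarity, best-matching speaker) for embedding x."""
    ranked = sorted(((cosine(x, c), spk) for spk, c in cents.items()), reverse=True)
    return ranked[0][0] - ranked[1][0], ranked[0][1]

def select_pseudo_labels(labeled, unlabeled, step=0.05):
    """One selection round (assumed scheme, not taken from the paper).

    1. Threshold = mean inter-class margin measured on the labeled data.
    2. Keep unlabeled samples whose margin reaches the threshold.
    3. Compare the compactness of the kept samples with that of the labeled
       data; tighten the threshold if they are looser, relax it otherwise.
    """
    cents = centroids(labeled)
    threshold = mean([margin_and_label(x, cents)[0]
                      for embs in labeled.values() for x in embs])
    ref_compact = mean([cosine(x, cents[spk])
                        for spk, embs in labeled.items() for x in embs])

    selected = []
    for x in unlabeled:
        margin, spk = margin_and_label(x, cents)
        if margin >= threshold:
            selected.append((x, spk))

    if selected:
        compact = mean([cosine(x, cents[spk]) for x, spk in selected])
        threshold += step if compact < ref_compact else -step
    return selected, threshold

# Toy 2-D embeddings for two speakers (illustrative data only).
labeled = {"A": [[1.0, 0.0], [0.9, 0.1]], "B": [[0.0, 1.0], [0.1, 0.9]]}
unlabeled = [[1.0, 0.02], [0.5, 0.5], [0.02, 1.0]]

selected, new_threshold = select_pseudo_labels(labeled, unlabeled)
print([spk for _, spk in selected], round(new_threshold, 3))
```

In this toy run the ambiguous sample `[0.5, 0.5]` has zero margin and is rejected, while the two confident samples are kept. Because they sit at least as close to their centroids as the labeled data does, the threshold for the next round is relaxed by `step`.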



Topics

Machine Learning > Core Methods > Metric Learning
Machine Learning > Learning Types > Semi-Supervised Learning
Computer Vision > Analysis > Biometrics
Speech & Audio > Analysis > Speaker Verification
Machine Learning > Learning Types > Metric Learning
Machine Learning > Learning Paradigms > Semi-Supervised Learning

Keywords

metric learning, semi-supervised learning, pseudo labeling, speaker recognition, pseudo label, equal error rate, intra-class compactness, inter-class discrepancy
