Speech Analysis

998 directly classified papers

Papers

BRSpeech-DF: A Deep Fake Synthetic Speech Dataset for Portuguese Zero-Shot TTS EMNLP 2025

SPACER: A Parallel Dataset of Speech Production And Comprehension of Error Repairs NAACL 2025

StandUp4AI: A New Multilingual Dataset for Humor Detection in Stand-up Comedy Videos EMNLP 2025

Supervising Sound Localization by In-the-wild Egomotion CVPR 2025

EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights CVPR 2025

Visual Cues Enhance Predictive Turn-Taking for Two-Party Human Interaction ACL 2025

Phonotomizer: A Compact, Unsupervised, Online Training Approach to Real-Time, Multilingual Phonetic Segmentation ACL 2025

English-based acoustic models perform well in the forced alignment of two English-based Pacific Creoles ACL 2025

VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models ACL 2025

Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution ACL 2025

Eta-WavLM: Efficient Speaker Identity Removal in Self-Supervised Speech Representations Using a Simple Linear Equation ACL 2025

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation ACL 2025

Towards a Real-time Swedish Speech Analyzer for Language Learning Games: A Hybrid AI Approach to Language Assessment ACL 2025

MAD Speech: Measures of Acoustic Diversity of Speech NAACL 2025

Understanding the Modality Gap: An Empirical Study on the Speech-Text Alignment Mechanism of Large Speech Language Models EMNLP 2025

Regional Distribution of the /el/-/æl/ Merger in Australian English COLING 2025

EmoTa: A Tamil Emotional Speech Dataset COLING 2025

Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection NAACL 2025

SSNTrio @ DravidianLangTech 2025: Hybrid Approach for Hate Speech Detection in Dravidian Languages with Text and Audio Modalities NAACL 2025

Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge NAACL 2025

2M-BELEBELE: Highly Multilingual Speech and American Sign Language Comprehension Dataset ACL 2025

Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in SpeechLLMs EMNLP 2025

Generative Annotation for ASR Named Entity Correction EMNLP 2025

Summarizing Speech: A Comprehensive Survey EMNLP 2025

MockConf: A Student Interpretation Dataset: Analysis, Word- and Span-level Alignment and Baselines ACL 2025