Automatic Speech Recognition

1789 directly classified papers

Papers

Evaluating Automatic Speech Recognition Systems for Korean Meteorological Experts EMNLP 2025

Out of the Box, into the Clinic? Evaluating State-of-the-Art ASR for Clinical Applications for Older Adults EMNLP 2025

InTriage: Intelligent Telephone Triage in Pre-Hospital Emergency Care EMNLP 2025

BeaverTalk: Oregon State University’s IWSLT 2025 Simultaneous Speech Translation System ACL 2025

Enhancing Audiovisual Speech Recognition Through Bifocal Preference Optimization AAAI 2025

MMS-LLaMA: Efficient LLM-based Audio-Visual Speech Recognition with Minimal Multimodal Speech Tokens ACL 2025

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement ACL 2025

NAIST Offline Speech Translation System for IWSLT 2025 ACL 2025

MERaLiON-AudioLLM: Advancing Speech and Language Understanding for Singapore ACL 2025

Let’s Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Robust and Instruction-Aware ASR and OCR ACL 2025

Data Quality Issues in Multilingual Speech Datasets: The Need for Sociolinguistic Awareness and Proactive Language Planning ACL 2025

Simultaneous Translation with Offline Speech and LLM Models in CUNI Submission to IWSLT 2025 ACL 2025

Soundwave: Less is More for Speech-Text Alignment in LLMs ACL 2025

Fine-tuning Whisper Tiny for Swahili ASR: Challenges and Recommendations for Low-Resource Speech Recognition ACL 2025

Breaking the Transcription Bottleneck: Fine-tuning ASR Models for Extremely Low-Resource Fieldwork Languages ACL 2025

Lost in Transcription, Found in Distribution Shift: Demystifying Hallucination in Speech Foundation Models ACL 2025

IWSLT 2025 Indic Track System Description Paper: Speech-to-Text Translation from Low-Resource Indian Languages (Bengali and Tamil) to English ACL 2025

QUESPA Submission for the IWSLT 2025 Dialectal and Low-resource Speech Translation Task ACL 2025

From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition EMNLP 2025

On the Tolerance of Repetition Before Performance Degradation in Kiswahili Automatic Speech Recognition ACL 2025

Beyond WER: Probing Whisper’s Sub‐token Decoder Across Diverse Language Resource Levels EMNLP 2025

Spoken Conversational Agents with Large Language Models EMNLP 2025

GenPTQ: Green Post-Training Quantization for Large-Scale ASR Models with Mixed-Precision Bit Allocation EMNLP 2025

Efficient ASR for Low-Resource Languages: Leveraging Cross-Lingual Unlabeled Data IJCNLP 2025

Beyond Monolingual Limits: Fine-Tuning Monolingual ASR for Yoruba-English Code-Switching NAACL 2025