Research Explorer
Speech Recognition
1480 directly classified papers
Papers per year
2002: 1
2006: 3
2007: 1
2008: 2
2009: 3
2010: 3
2012: 2
2013: 1
2015: 1
2016: 125
2017: 99
2018: 100
2019: 161
2020: 163
2021: 182
2022: 174
2023: 213
2024: 152
2025: 88
2026: 6
Papers
Fotheidil: an Automatic Transcription System for the Irish Language
COLING 2025
Phonotomizer: A Compact, Unsupervised, Online Training Approach to Real-Time, Multilingual Phonetic Segmentation
ACL 2025
Curved Worlds, Clear Boundaries: Generalizing Speech Deepfake Detection using Hyperbolic and Spherical Geometry Spaces
IJCNLP 2025
Contextual ASR Error Handling with LLMs Augmentation for Goal-Oriented Conversational AI
COLING 2025
kNN For Whisper And Its Effect On Bias And Speaker Adaptation
NAACL 2025
Wenzhou Dialect Speech to Mandarin Text Conversion
NAACL 2025
The Role of Prosody in Spoken Question Answering
NAACL 2025
Distinct social-linguistic processing between humans and large audio-language models: Evidence from model-brain alignment
NAACL 2025
Not Only Vision: Evolve Visual Speech Recognition via Peripheral Information
ICCV 2025
Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations
ICCV 2025
FFSTC 2: Extending the Fongbe to French Speech Translation Corpus
ACL 2025
JU-CSE-NLP’s Cascaded Speech to Text Translation Systems for IWSLT 2025 in Indic Track
ACL 2025
GenPTQ: Green Post-Training Quantization for Large-Scale ASR Models with Mixed-Precision Bit Allocation
EMNLP 2025
MetaMixSpeech: Meta Task Augmentation for Low-Resource Speech Recognition
EMNLP 2025
ASR Under Noise: Exploring Robustness for Sundanese and Javanese
EMNLP 2025
Generative Annotation for ASR Named Entity Correction
EMNLP 2025
WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models
ACL 2025
Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks
ACL 2025
DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation
ACL 2025
LLaMA-Omni 2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis
ACL 2025
InfiniSST: Simultaneous Translation of Unbounded Speech with Large Language Model
ACL 2025
Slamming: Training a Speech Language Model on One GPU in a Day
ACL 2025
Automatic Speech Recognition for African Low-Resource Languages: Challenges and Future Directions
ACL 2025
YodiV3: NLP for Togolese Languages with Eyaa-Tom Dataset and the Lom Metric
ACL 2025
Visual-Aware Speech Recognition for Noisy Scenarios
EMNLP 2025