KEBAP: Korean Error Explainable Benchmark Dataset for ASR and Post-processing

Seonmin Koo; Chanjun Park; Jinsung Kim; Jaehyung Seo; Sugyeong Eo; Hyeonseok Moon; Heuiseok Lim

EMNLP 2023

Abstract

Automatic Speech Recognition (ASR) systems are instrumental across various applications, and their performance is critically tied to user satisfaction. Conventional evaluation metrics for ASR systems produce a single aggregate score, which is insufficient for understanding specific system vulnerabilities. To address this limitation of previous ASR evaluation methods, we introduce the Korean Error Explainable Benchmark Dataset for ASR and Post-processing (KEBAP). KEBAP enables comprehensive analysis of ASR systems at both the speech and text levels, thereby facilitating a more balanced assessment that covers both speech recognition accuracy and user readability. KEBAP provides 37 newly defined speech-level resources covering diverse noise environments and speaker characteristic categories, and it defines 13 distinct text-level error types. This paper presents detailed statistical analyses of the colloquial noise categories and textual error types. Furthermore, we conduct extensive validation and analysis on commercially deployed ASR systems, providing valuable insights into their performance. As a more fine-grained and real-world-centric evaluation method, KEBAP contributes to identifying and mitigating potential weaknesses in ASR systems.
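The contrast between a single aggregate score and a per-category breakdown can be illustrated with a minimal sketch. It assumes character error rate (CER) as the aggregate metric, which is one conventional choice and is not named in the abstract. The category labels, the sample strings, and the helper names `edit_distance`, `cer`, and `cer_by_category` are hypothetical and are not taken from KEBAP.

```python
from collections import defaultdict


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(pairs):
    """Character error rate over (reference, hypothesis) pairs."""
    errors = sum(edit_distance(r, h) for r, h in pairs)
    chars = sum(len(r) for r, _ in pairs)
    return errors / chars if chars else 0.0


def cer_by_category(samples):
    """Group (category, reference, hypothesis) samples and score each group."""
    groups = defaultdict(list)
    for category, ref, hyp in samples:
        groups[category].append((ref, hyp))
    return {category: cer(pairs) for category, pairs in groups.items()}


# Hypothetical samples; the category labels are illustrative only.
samples = [
    ("clean", "abcd", "abcd"),
    ("noisy", "abcd", "abxd"),
    ("noisy", "abcd", "ab"),
]
overall = cer([(r, h) for _, r, h in samples])
per_category = cer_by_category(samples)
```

On these toy samples the overall CER is 0.25, while the breakdown is 0.0 for `clean` and 0.375 for `noisy`. The aggregate figure alone would not show that every error comes from one condition, which is the kind of gap a category-labeled benchmark is meant to expose.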

Topics

Artificial Intelligence > Core AI > Interpretability
Speech & Audio > Recognition > Automatic Speech Recognition

Keywords

automatic speech recognition; benchmark dataset; error analysis; user readability
