Task Specific Sentence Embeddings for ASR Error Detection

Sahar Ghannay; Yannick Estève; Nathalie Camelin

INTERSPEECH 2018

Abstract

This paper presents a study on modeling automatic speech recognition (ASR) errors at the sentence level. The aim is to compensate for certain phenomena revealed by analyzing the outputs of the ASR error detection system we previously proposed. We investigate three approaches, based respectively on sentence embeddings dedicated to the ASR error detection task, a probabilistic contextual model, and a bidirectional long short-term memory (BLSTM) architecture. We propose an approach to building task-specific sentence embeddings and compare it to the Doc2vec approach. Experiments are performed on transcriptions generated by the LIUM ASR system applied to the ETAPE corpus. They show that the proposed sentence embeddings dedicated to ASR error detection achieve better results than generic sentence embeddings, and that integrating task-specific sentence embeddings into our system achieves better results than the probabilistic contextual model and the BLSTM models.

Topics

Machine Learning > Core Methods > Representation Learning
Speech & Audio > Recognition > Automatic Speech Recognition

Keywords

automatic speech recognition; error detection; sentence embedding; bidirectional long short term memory
