A Discriminative Entity-Aware Language Model for Virtual Assistants

Mandana Saebi; Ernest Pusateri; Aaksha Meghawat; Christophe Van Gysel

INTERSPEECH 2021

Abstract

High-quality automatic speech recognition (ASR) is essential for virtual assistants (VAs) to work well. However, ASR often performs poorly on VA requests containing named entities. In this work, we start from the observation that many ASR errors on named entities are inconsistent with real-world knowledge. We extend previous discriminative n-gram language modeling approaches to incorporate real-world knowledge from a Knowledge Graph (KG), using features that capture entity type-entity and entity-entity relationships. We apply our model through an efficient lattice rescoring process, achieving relative sentence error rate reductions of more than 25% on some synthesized test sets covering less popular entities, with minimal degradation on a uniformly sampled VA test set.
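The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it rescores a short n-best list rather than a lattice, and the toy knowledge graph `KG_EDGES`, the feature names, the weights in `WEIGHTS`, and the hypothesis scores are all hypothetical values chosen for illustration. It shows only how a linear model could combine a first-pass ASR score with a feature that fires when two entities in a hypothesis are related in a KG.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy knowledge graph: unordered pairs of related entities.
KG_EDGES: Set[Tuple[str, str]] = {
    ("thriller", "michael jackson"),
}

# Hypothetical feature weights. In a discriminative language model these
# would be learned from data; here they are fixed by hand.
WEIGHTS: Dict[str, float] = {
    "base_score": 1.0,       # first-pass ASR score (log domain)
    "kg_related_pair": 2.0,  # bonus per entity pair connected in the KG
}

# A hypothesis is (text, first-pass score, entity mentions found in the text).
Hypothesis = Tuple[str, float, List[str]]


def extract_features(base_score: float, entities: List[str]) -> Dict[str, float]:
    """Count entity pairs in the hypothesis that are linked in the toy KG."""
    related = 0
    for i, first in enumerate(entities):
        for second in entities[i + 1:]:
            if (first, second) in KG_EDGES or (second, first) in KG_EDGES:
                related += 1
    return {"base_score": base_score, "kg_related_pair": float(related)}


def score(features: Dict[str, float]) -> float:
    """Linear model: weighted sum of feature values."""
    return sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())


def rescore(hypotheses: List[Hypothesis]) -> str:
    """Return the text of the hypothesis with the highest rescored value."""
    best = max(hypotheses, key=lambda h: score(extract_features(h[1], h[2])))
    return best[0]


# Made-up n-best list: the misrecognized name has the better first-pass score.
hypotheses: List[Hypothesis] = [
    ("play thriller by michael jackson", -5.0, ["thriller", "michael jackson"]),
    ("play thriller by michael jacksen", -4.5, ["thriller", "michael jacksen"]),
]
best = rescore(hypotheses)
```

In this toy setup the hypothesis whose entities are consistent with the KG overtakes the one the first pass preferred. A lattice-based version would apply such feature scores to paths in the lattice instead of enumerating complete hypotheses.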

Topics

Natural Language Processing > Generation > Language Modeling
Knowledge & Reasoning > Representation > Knowledge Graphs
Speech & Audio > Recognition > Automatic Speech Recognition
Artificial Intelligence > Core AI > Knowledge Graph

Keywords

speech recognition, knowledge graph, named entity, lattice rescoring, virtual assistant, discriminative language model, entity-aware model
