2017 INTERSPEECH INTERSPEECH 2017

Using Knowledge Graph and Search Query Click Logs in Statistical Language Model for Speech Recognition

Abstract

This paper demonstrates how Knowledge Graph (KG) and Search Query Click Logs (SQCL) can be leveraged in statistical language models to improve named entity recognition for online speech recognition systems. Due to the missing in the training data, some named entities may be recognized as other common words that have the similar pronunciation. KG and SQCL cover comprehensive and fresh named entities and queries that can be used to mitigate the wrong recognition. First, all the entities located in the same area in KG are clustered together, and the queries that contain the entity names are selected from SQCL as the training data of a geographical statistical language model for each entity cluster. These geographical language models make the unseen named entities less likely to occur during the model training, and can be dynamically switched according to the user location in the recognition phase. Second, if any named entities are identified in the previous utterances within a conversational dialog, the probability of the n-best word sequence paths that contain their related entities will be increased for the current utterance by utilizing the entity relationships from KG and SQCL. This way can leverage the long-term contexts within the dialog. Experiments for the proposed approach on voice queries from a spoken dialog system yielded a 12.5% relative perplexity reduction in the language model measurement, and a 1.1% absolute word error rate reduction in the speech recognition measurement.

🌉 Interdisciplinary Bridge — Natural Language Processing and Speech & Audio
🧭 Keyword Pioneer — search query click log
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors