Voice Query Auto Completion

Raphael Tang; Karun Kumar; Kendra Chalkley; Ji Xin; Liming Zhang; Wenyan Li; Gefei Yang; Yajie Mao; Junho Shin; Geoffrey Craig Murray; Jimmy Lin

EMNLP 2021

Abstract

Query auto completion (QAC) is the task of predicting a search engine user’s final query from their intermediate, incomplete query. In this paper, we extend QAC to the streaming voice search setting, where automatic speech recognition systems produce intermediate transcriptions as users speak. Naively applying existing methods fails because the intermediate transcriptions often don’t form prefixes or even substrings of the final transcription. To address this issue, we propose to condition QAC approaches on intermediate transcriptions to complete voice queries. We evaluate our models on a speech-enabled smart television with real-life voice search traffic, finding that this ASR-aware conditioning improves the completion quality. Our best method obtains an 18% relative improvement in mean reciprocal rank over previous methods.
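The two technical ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the example transcriptions, and the MRR values are hypothetical and chosen only for illustration. The first helper checks whether an intermediate transcription is a prefix or substring of the final one, which is the assumption that text-based QAC relies on. The second computes mean reciprocal rank (MRR) using its standard definition, and the third computes a relative improvement between two scores.

```python
from typing import Sequence, Tuple


def is_prefix_or_substring(intermediate: str, final: str) -> Tuple[bool, bool]:
    """Return (is_prefix, is_substring) for an intermediate transcription."""
    return final.startswith(intermediate), intermediate in final


def mean_reciprocal_rank(
    ranked_lists: Sequence[Sequence[str]], targets: Sequence[str]
) -> float:
    """Average of 1/rank of the target in each ranked list (0 if absent)."""
    if len(ranked_lists) != len(targets):
        raise ValueError("ranked_lists and targets must have the same length")
    if not targets:
        return 0.0
    total = 0.0
    for candidates, target in zip(ranked_lists, targets):
        for rank, candidate in enumerate(candidates, start=1):
            if candidate == target:
                total += 1.0 / rank
                break  # only the first match counts
    return total / len(targets)


def relative_improvement(new: float, baseline: float) -> float:
    """Relative change of `new` over `baseline`, e.g. 0.18 means 18%."""
    return (new - baseline) / baseline


# Hypothetical streaming ASR output: the early hypothesis "wash" is later
# revised, so it is neither a prefix nor a substring of the final query.
print(is_prefix_or_substring("wash", "watch the office"))
print(is_prefix_or_substring("watch the", "watch the office"))

# Hypothetical completions for three queries; the third target is never ranked.
rankings = [["a", "b"], ["x", "y", "z"], ["q"]]
targets = ["b", "x", "missing"]
print(mean_reciprocal_rank(rankings, targets))

# Made-up MRR scores, used only to show what an 18% relative gain means.
print(relative_improvement(0.59, 0.50))
```

Under these made-up numbers, a baseline MRR of 0.50 rising to 0.59 corresponds to an 18% relative improvement; the paper's actual scores are not given in the abstract.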

Authors

Raphael Tang, Karun Kumar, Kendra Chalkley, Ji Xin, Liming Zhang, Wenyan Li, Gefei Yang, Yajie Mao, Junho Shin, Geoffrey Craig Murray, Jimmy Lin

Topics

Machine Learning > Application Areas > Domain Adaptation
Speech & Audio > Recognition > Speech Recognition

Keywords

automatic speech recognition; streaming speech; voice query auto completion; query completion
