Word-Level Speech Recognition With a Letter to Word Encoder

Ronan Collobert; Awni Hannun; Gabriel Synnaeve

2020 ICML ICML 2020

Word-Level Speech Recognition With a Letter to Word Encoder

Abstract

We propose a direct-to-word sequence model which uses a word network to learn word embeddings from letters. The word network can be integrated seamlessly with arbitrary sequence models including Connectionist Temporal Classification and encoder-decoder models with attention. We show our direct-to-word model can achieve word error rate gains over sub-word level models for speech recognition. We also show that our direct-to-word approach retains the ability to predict words not seen at training time without any retraining. Finally, we demonstrate that a word-level model can use a larger stride than a sub-word level model while maintaining accuracy. This makes the model more efficient both for training and inference.

🌉 Interdisciplinary Bridge — Deep Learning and Speech & Audio

🧭 Keyword Pioneer — word-level recognition

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ronan Collobert , Awni Hannun , Gabriel Synnaeve

Topics

Deep Learning > Architectures > Neural Networks Speech & Audio > Recognition > Speech Recognition

Keywords

speech recognition connectionist temporal classification encoder-decoder model word-level recognition letter to word encoder out-of-vocabulary prediction

Download PDF

Related papers

Correlation Clustering with Asymmetric Classification Errors 2020

Learning Portable Representations for High-Level Planning 2020

Proving the Lottery Ticket Hypothesis: Pruning is All You Need 2020

Minimax Pareto Fairness: A Multi Objective Perspective 2020

DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training 2020