End-to-End Spoken Language Understanding: Bootstrapping in Low Resource Scenarios

Swapnil Bhosale; Imran Sheikh; Sri Harsha Dumpala; Sunil Kumar Kopparapu

INTERSPEECH 2019

Abstract

End-to-end Spoken Language Understanding (SLU) systems, which operate without speech-to-text conversion, are more promising in low-resource scenarios. They can be more effective when there is not enough labeled data to train reliable speech recognition and language understanding systems, or where running SLU on the edge is preferred over cloud-based services. In this paper, we present an approach for bootstrapping end-to-end SLU in low-resource scenarios. We show that incorporating layers extracted from pre-trained acoustic models, instead of using the typical Mel filter bank features, leads to better-performing SLU models. Moreover, the layers extracted from a model pre-trained on one language perform well even for (a) SLU tasks in a different language and (b) utterances from speakers with speech disorders.
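The central idea, reusing layers from a pre-trained acoustic model as a fixed feature extractor in place of raw Mel filter bank features, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' architecture: the frozen layers are random stand-ins for pre-trained weights, the layer sizes are arbitrary, and the mean pooling and linear intent classifier are simple placeholders for whatever SLU head is trained on the small labeled set.

```python
import numpy as np


def make_frozen_layers(rng, in_dim, hidden_dim, n_layers):
    """Hypothetical stand-in for lower layers copied from a pre-trained
    acoustic model. Real weights would be loaded from that model; here
    they are random so that the example is self-contained."""
    dims = [in_dim] + [hidden_dim] * n_layers
    return [
        (rng.standard_normal((dims[i], dims[i + 1])) * 0.1, np.zeros(dims[i + 1]))
        for i in range(n_layers)
    ]


def extract_features(frames, frozen_layers):
    """Pass frame-level inputs through the frozen layers (ReLU activations).
    These weights are never updated while the SLU head is trained."""
    h = frames
    for weights, bias in frozen_layers:
        h = np.maximum(h @ weights + bias, 0.0)
    return h


def utterance_embedding(frames, frozen_layers):
    """Mean-pool frame-level features into one vector per utterance
    (the pooling choice is an assumption)."""
    return extract_features(frames, frozen_layers).mean(axis=0)


def intent_logits(embedding, w_out, b_out):
    """Small trainable head: the only part fitted on the low-resource data."""
    return embedding @ w_out + b_out


rng = np.random.default_rng(0)
frozen = make_frozen_layers(rng, in_dim=40, hidden_dim=64, n_layers=3)

# One utterance: 120 frames of 40-dimensional filter-bank-like input.
frames = rng.standard_normal((120, 40))
emb = utterance_embedding(frames, frozen)

# Untrained head over 5 hypothetical intent classes.
w_out = rng.standard_normal((64, 5)) * 0.1
b_out = np.zeros(5)
logits = intent_logits(emb, w_out, b_out)
predicted_intent = int(np.argmax(logits))
```

In this kind of setup only `w_out` and `b_out` would be trained on the small labeled SLU set, which is what makes the approach attractive when data is scarce.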

Topics

Machine Learning > Learning Types > Semi-Supervised Learning
Machine Learning > Learning Types > Transfer Learning

Keywords

transfer learning, spoken language understanding, acoustic model, low resource
