Feature Representation of Short Utterances Based on Knowledge Distillation for Spoken Language Identification

Peng Shen; Xugang Lu; Sheng Li; Hisashi Kawai

INTERSPEECH 2018

Abstract

The performance of spoken language identification (LID) degrades drastically on short utterances, even when the model is trained entirely on a short-utterance data set. The degradation stems from the large pattern confusion caused by the large variation in the feature representation of short utterances. In this paper, we propose a teacher-student network learning algorithm to explore discriminative features for short utterances. With teacher-student network learning, the feature representations of short utterances (explored by the student network) are normalized toward the representations of the corresponding long utterances (provided by the teacher network). Feature representations learned in this way are expected to reduce pattern confusion on short utterances. Experiments on a 10-language LID task were carried out to test the algorithm. Our results show that the proposed algorithm significantly improves performance.
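The teacher-student idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact objective, so everything here is an assumption made for illustration: the student's loss is taken to be a language-classification cross-entropy plus a mean-squared feature-matching term that pulls the student's short-utterance representation toward the teacher's long-utterance representation. The function names and the `weight` parameter are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Classification loss of the student for the true language label."""
    return -math.log(softmax(logits)[label])

def feature_matching(student_feat, teacher_feat):
    """Mean squared distance between the student's short-utterance feature
    and the teacher's long-utterance feature (assumed form of the
    normalization term; the abstract does not specify it)."""
    diffs = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(diffs) / len(diffs)

def distillation_loss(student_logits, label, student_feat, teacher_feat, weight=1.0):
    """Assumed combined objective: classification loss plus a weighted
    feature-matching penalty. `weight` is a hypothetical trade-off knob."""
    return cross_entropy(student_logits, label) + weight * feature_matching(
        student_feat, teacher_feat
    )

# Usage: a 10-language task with uniform student scores and a student
# feature that differs slightly from the teacher's.
logits = [0.0] * 10
loss = distillation_loss(logits, 3, [0.2, 0.4], [0.0, 0.0], weight=0.5)
```

In a sketch like this, the teacher's features would be computed from the full-length utterance and held fixed, while only the student, which sees the short segment, is updated.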

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Machine Learning > Application Areas > Knowledge Distillation
Deep Learning > Architectures > Neural Networks
Speech & Audio > Recognition > Speech Recognition
Machine Learning > Learning Types > Knowledge Distillation

Keywords

knowledge distillation, speaker recognition, feature representation, teacher-student network, spoken language identification, short utterance
