Unsupervised feature learning for audio classification using convolutional deep belief networks

Honglak Lee; Peter Pham; Yan Largman; Andrew Y. Ng

NIPS (NeurIPS) 2009

Abstract

In recent years, deep learning approaches have gained significant interest as a way of building hierarchical representations from unlabeled data. However, to our knowledge, these deep learning approaches have not been extensively studied for auditory data. In this paper, we apply convolutional deep belief networks to audio data and empirically evaluate them on various audio classification tasks. For the case of speech data, we show that the learned features correspond to phones/phonemes. In addition, our feature representations trained from unlabeled audio data show very good performance for multiple audio classification tasks. We hope that this paper will inspire more research on deep learning approaches applied to a wide range of audio recognition tasks.

Authors

Honglak Lee, Peter Pham, Yan Largman, Andrew Y. Ng

Topics

Machine Learning > Learning Types > Unsupervised Learning

Deep Learning > Architectures > Neural Networks

Speech & Audio > Recognition > Speech Recognition

Speech & Audio > Analysis > Speech Analysis

Keywords

unsupervised learning, feature learning, speech recognition, unsupervised feature learning, feature representation, hierarchical representation, audio classification, speech data, deep belief network, convolutional deep belief network
