Pre-training of Recurrent Neural Networks via Linear Autoencoders

Luca Pasa; Alessandro Sperduti

NIPS 2014 (NeurIPS)

Abstract

We propose a pre-training technique for recurrent neural networks based on linear autoencoder networks for sequences, i.e. linear dynamical systems modelling the target sequences. We start by giving a closed-form solution for the optimal weights of a linear autoencoder given a training set of sequences. This solution, however, is computationally very demanding, so we suggest a procedure to obtain an approximate solution for a given number of hidden units. The weights obtained for the linear autoencoder are then used as initial weights for the input-to-hidden connections of a recurrent neural network, which is then trained on the desired task. Using four well-known datasets of sequences of polyphonic music, we show that the proposed pre-training approach is highly effective: it substantially improves on the state-of-the-art results on all the considered datasets.
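The abstract does not spell out the construction, so the following is only a minimal sketch of one way a linear autoencoder for a single sequence can be obtained in closed form. It assumes an SVD-based factorization of a matrix whose rows are the reversed prefixes of the sequence; the function names `linear_autoencoder_weights` and `encode`, the use of a single sequence, and the dense shift matrix are illustrative assumptions, not the authors' implementation. The linear system is y_t = A x_t + B y_{t-1}, where `A` plays the role of the input-to-hidden weights.

```python
import numpy as np

def linear_autoencoder_weights(seq, p):
    """Sketch: closed-form weights of a linear autoencoder for one sequence.

    seq: array of shape (T, n), one input vector per time step.
    p:   number of hidden units (p <= T).
    Returns A (p, n), B (p, p) and the basis U (T*n, p).
    """
    T, n = seq.shape
    # Row t stores the reversed prefix [x_t, x_{t-1}, ..., x_1, 0, ..., 0].
    Xi = np.zeros((T, T * n))
    for t in range(T):
        for k in range(t + 1):
            Xi[t, k * n:(k + 1) * n] = seq[t - k]
    # The top-p right singular vectors span the space of prefix encodings.
    _, _, Vt = np.linalg.svd(Xi, full_matrices=False)
    U = Vt[:p].T
    # P writes the current input into the first block of the stacked vector;
    # R shifts every stored block down by one position (n rows).
    P = np.zeros((T * n, n))
    P[:n] = np.eye(n)
    R = np.eye(T * n, k=-n)
    A = U.T @ P          # input-to-hidden weights
    B = U.T @ R @ U      # hidden-to-hidden weights of the linear system
    return A, B, U

def encode(seq, A, B):
    """Run the linear dynamical system y_t = A x_t + B y_{t-1} from y_0 = 0."""
    y = np.zeros(A.shape[0])
    states = []
    for x in seq:
        y = A @ x + B @ y
        states.append(y)
    return np.array(states)
```

When `p` equals the rank of the prefix matrix, each state encodes its prefix exactly, so the most recent input can be read back as the first `n` entries of `U @ y_t`. Choosing a smaller `p` gives a lossy encoding. The dense SVD here is only practical for short sequences, which is consistent with the abstract's remark that the exact solution is computationally demanding.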

Authors

Luca Pasa, Alessandro Sperduti

Topics

Machine Learning > Learning Types > Self-Supervised Learning

Deep Learning > Architectures > Neural Networks

Deep Learning > Techniques > Pretraining

Machine Learning > Learning Types > Representation Learning

Keywords

representation learning, sequence modeling, recurrent neural network, linear autoencoder, pre-training technique, weight initialization
