LSTMs Exploit Linguistic Attributes of Data

Nelson F. Liu; Omer Levy; Roy Schwartz; Chenhao Tan; Noah A. Smith

ACL 2018

Abstract

While recurrent neural networks have found success in a variety of natural language processing applications, they are general models of sequential data. We investigate how the properties of natural language data affect an LSTM’s ability to learn a nonlinguistic task: recalling elements from its input. We find that models trained on natural language data are able to recall tokens from much longer sequences than models trained on non-language sequential data. Furthermore, we show that the LSTM learns to solve the memorization task by explicitly using a subset of its neurons to count timesteps in the input. We hypothesize that the patterns and structure in natural language data enable LSTMs to learn by providing approximate ways of reducing loss, but understanding the effect of different training data on the learnability of LSTMs remains an open question.
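
The recall task described in the abstract can be illustrated with a small data-generation sketch. This is not the authors' code: the abstract only says the model recalls elements from its input, so the choice of the middle token as the target, the toy corpus, and the helper names `make_language_example` and `make_uniform_example` are all assumptions made for illustration. The sketch contrasts a window drawn from running text with a window of tokens sampled independently from the vocabulary.

```python
import random


def middle_index(length):
    """Index of the token to recall (assumed here to be the middle one)."""
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd number")
    return length // 2


def make_language_example(tokens, length, rng):
    """Take a contiguous window from running text and pair it with its target."""
    if length > len(tokens):
        raise ValueError("window is longer than the corpus")
    start = rng.randrange(len(tokens) - length + 1)
    window = tokens[start:start + length]
    return window, window[middle_index(length)]


def make_uniform_example(vocab, length, rng):
    """Sample each position independently, removing any sequential structure."""
    window = [rng.choice(vocab) for _ in range(length)]
    return window, window[middle_index(length)]


rng = random.Random(0)
# Toy corpus (hypothetical); a real experiment would use a large text corpus.
corpus = "the cat sat on the mat while the dog slept by the door".split()
vocab = sorted(set(corpus))

lang_window, lang_target = make_language_example(corpus, 5, rng)
unif_window, unif_target = make_uniform_example(vocab, 5, rng)
print(lang_window, "->", lang_target)
print(unif_window, "->", unif_target)
```

Both generators produce the same kind of (input sequence, target token) pair, so any difference in how well a model learns the task would come from the structure of the input data rather than from the task definition.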

Topics

Machine Learning > Optimization & Theory > Learning Theory
Machine Learning > Optimization & Theory > Neural Network Optimization
Deep Learning > Architectures > Neural Networks
Interdisciplinary > Linguistics > Computational Linguistics
Deep Learning > Learning Types > Representation Learning
Deep Learning > Architectures > Recurrent Neural Networks
Natural Language Processing > Understanding > Lexical Semantics

Keywords

sequence modeling, natural language processing, long short-term memory, recurrent neural network, language processing, sequence memorization, timestep counting
