ACL 2018

From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction

Abstract

In this work, we study the credit assignment problem in reward augmented maximum likelihood (RAML) learning and establish a theoretical equivalence between the token-level counterpart of RAML and entropy-regularized reinforcement learning. Inspired by this connection, we propose two sequence prediction algorithms: one extends RAML with fine-grained credit assignment, and the other improves Actor-Critic with systematic entropy regularization. On two benchmark datasets, we show that the proposed algorithms outperform RAML and Actor-Critic, respectively, providing new alternatives for sequence prediction.
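As a rough illustration of the connection the abstract refers to, the sketch below uses a toy, sequence-level setting rather than the token-level formulation studied in the paper, and it is not the authors' algorithm. It relies on a standard fact: over a finite candidate set, the distribution proportional to exp(R / tau) (the exponentiated payoff distribution used by RAML) maximizes expected reward plus tau times entropy, which is the entropy-regularized objective. The reward values, the temperature `tau`, and all function names are assumptions made up for this example.

```python
import math

def exponentiated_payoff(rewards, tau):
    """Return q with q(y) proportional to exp(R(y) / tau) over a finite candidate set."""
    m = max(rewards)  # subtract the max for numerical stability
    weights = [math.exp((r - m) / tau) for r in rewards]
    z = sum(weights)
    return [w / z for w in weights]

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def regularized_objective(p, rewards, tau):
    """Expected reward under p plus tau times the entropy of p."""
    expected_reward = sum(pi * r for pi, r in zip(p, rewards))
    return expected_reward + tau * entropy(p)

# Hypothetical rewards for three candidate outputs (illustrative values only).
rewards = [1.0, 0.6, 0.1]
tau = 0.5

q = exponentiated_payoff(rewards, tau)
best = regularized_objective(q, rewards, tau)

# Two other distributions for comparison: uniform and fully greedy.
uniform = [1.0 / 3.0] * 3
greedy = [1.0, 0.0, 0.0]

print("q =", [round(x, 4) for x in q])
print("objective at q      :", best)
print("objective at uniform:", regularized_objective(uniform, rewards, tau))
print("objective at greedy :", regularized_objective(greedy, rewards, tau))
```

In this toy setting the objective at `q` is at least as large as at the uniform or greedy distribution, and it equals tau times the log of the sum of exp(R / tau) over the candidates.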
