Learning Beam Search Policies via Imitation Learning

Renato Negrinho; Matthew Gormley; Geoffrey J. Gordon

NeurIPS 2018

Abstract

Beam search is widely used for approximate decoding in structured prediction problems. Models often use a beam at test time but ignore its existence at train time, and therefore do not explicitly learn how to use the beam. We develop a unifying meta-algorithm for learning beam search policies using imitation learning. In our setting, the beam is part of the model and not just an artifact of approximate decoding. Our meta-algorithm captures existing learning algorithms and suggests new ones. It also lets us show novel no-regret guarantees for learning beam search policies.
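For readers unfamiliar with the decoding procedure the abstract refers to, the following is a minimal sketch of plain test-time beam search. It is not the paper's meta-algorithm and does not involve any learning. The names `beam_search`, `score_fn`, and `toy_score`, the fixed output length, and the additive scoring are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Prefix = Tuple[str, ...]


def beam_search(
    score_fn: Callable[[Prefix, str], float],
    vocab: Sequence[str],
    length: int,
    beam_size: int,
) -> List[Tuple[float, Prefix]]:
    """Keep the `beam_size` highest-scoring prefixes at every step."""
    beam: List[Tuple[float, Prefix]] = [(0.0, ())]  # start from the empty prefix
    for _ in range(length):
        # Expand every prefix on the beam with every token in the vocabulary.
        candidates = [
            (total + score_fn(prefix, tok), prefix + (tok,))
            for total, prefix in beam
            for tok in vocab
        ]
        # Prune: sort by cumulative score and keep only the top `beam_size`.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    return beam


def toy_score(prefix: Prefix, tok: str) -> float:
    """Hypothetical scoring function that rewards alternating tokens."""
    if not prefix:
        return 1.0 if tok == "a" else 0.5
    return 1.0 if tok != prefix[-1] else 0.0


best = beam_search(toy_score, ["a", "b"], length=3, beam_size=2)
print(best)
```

In this sketch the scoring function is fixed and the beam only affects which prefixes survive pruning. The setting described in the abstract differs in that the model is trained with the beam in place, so that the learned scores account for the pruning step.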

Authors

Renato Negrinho, Matthew Gormley, Geoffrey J. Gordon

Topics

Machine Learning > Optimization & Theory > Optimization

Reinforcement Learning > Methods > Policy Learning

Machine Learning > Learning Types > Imitation Learning

Deep Learning > Learning Types > Imitation Learning

Machine Learning > Learning Types > Structured Prediction

Keywords

imitation learning, structured prediction, policy learning, beam search, no-regret guarantee
