Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods

Alessandro Lazaric; Marcello Restelli; Andrea Bonarini

2007 NIPS NeurIPS 2007

Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods

Abstract

Learning in real-world domains often requires to deal with continuous state and action spaces. Although many solutions have been proposed to apply Reinforce- ment Learning algorithms to continuous state problems, the same techniques can be hardly extended to continuous action spaces, where, besides the computation of a good approximation of the value function, a fast method for the identiﬁcation of the highest-valued action is needed. In this paper, we propose a novel actor-critic approach in which the policy of the actor is estimated through sequential Monte Carlo methods. The importance sampling step is performed on the basis of the values learned by the critic, while the resampling step modiﬁes the actor’s policy. The proposed approach has been empirically compared to other learning algo- rithms into several domains; in this paper, we report results obtained in a control problem consisting of steering a boat across a river.

🌉 Interdisciplinary Bridge — Machine Learning and Mathematics & Optimization and Reinforcement Learning

📈 Trend Setter — Stochastic Methods

🧭 Keyword Pioneer — policy learning

🐣 Hot Topic Early Bird — reinforcement learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics

Authors

Alessandro Lazaric , Marcello Restelli , Andrea Bonarini

Topics

Reinforcement Learning > Methods > Deep RL Reinforcement Learning > Methods > Policy Learning Mathematics & Optimization > Optimization > Stochastic Methods Machine Learning > Learning Types > Reinforcement Learning Machine Learning > Optimization & Theory > Stochastic Methods Mathematics & Optimization > Optimization > Optimal Control

Keywords

reinforcement learning policy learning sequential monte carlo continuous action space actor-critic method continuous action

Download PDF

Related papers

Exponential Family Predictive Representations of State 2007

Privacy-Preserving Belief Propagation and Sampling 2007

Efficient Principled Learning of Thin Junction Trees 2007

How SVMs can estimate quantiles and the median 2007

Rapid Inference on a Novel AND/OR graph for Object Detection, Segmentation and Parsing 2007