2008
NIPS
NeurIPS 2008
Particle Filter-based Policy Gradient in POMDPs
Abstract
Our setting is a Partially Observable Markov Decision Process (POMDP) with continuous state, observation, and action spaces. Decisions are based on a Particle Filter that estimates the belief state given past observations. We consider a policy gradient approach for parameterized policy optimization. For that purpose, we investigate sensitivity analysis of the performance measure with respect to the parameters of the policy, focusing on Finite Difference (FD) techniques. We show that the naive FD estimator is subject to variance explosion because of the non-smoothness of the resampling procedure. We propose a more sophisticated FD method that overcomes this problem, and we establish its consistency.
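To make the naive FD setting concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm or its improved estimator: the 1-D linear-Gaussian model, the linear policy `a = -theta * belief_mean`, the quadratic reward, and every name and constant (`rollout_return`, `naive_fd_gradient`, the noise scales, the step `h`) are assumptions chosen for illustration only. It runs a bootstrap particle filter with multinomial resampling and forms a central finite difference of the return, reusing the same seed for both perturbed runs.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def rollout_return(theta, horizon=20, n_particles=200, seed=0):
    """Run one episode of a toy 1-D POMDP and return the summed reward.

    Hypothetical model (not from the paper): x' = x + a + noise,
    y = x' + noise, reward = -x'^2, action a = -theta * belief mean.
    """
    rng = random.Random(seed)  # fixed seed: same random numbers for every theta
    x = rng.gauss(0.0, 1.0)    # hidden state, never shown to the policy
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    total = 0.0
    for _ in range(horizon):
        # The policy acts on the particle estimate of the belief mean.
        belief_mean = sum(particles) / n_particles
        a = -theta * belief_mean
        # True dynamics and noisy observation.
        x = x + a + rng.gauss(0.0, 0.1)
        y = x + rng.gauss(0.0, 0.5)
        total -= x * x
        # Prediction step: push every particle through the dynamics.
        particles = [p + a + rng.gauss(0.0, 0.1) for p in particles]
        # Weighting step: observation likelihood of each particle.
        weights = [gaussian_pdf(y, p, 0.5) for p in particles]
        if sum(weights) <= 0.0:
            # Numerical guard: fall back to uniform weights on underflow.
            weights = [1.0] * n_particles
        # Multinomial resampling: a discrete selection of particle indices.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return total


def naive_fd_gradient(theta, h=0.05, seed=0):
    """Central finite difference of the return, same seed on both sides."""
    up = rollout_return(theta + h, seed=seed)
    down = rollout_return(theta - h, seed=seed)
    return (up - down) / (2.0 * h)


if __name__ == "__main__":
    # Print the naive estimate for a few seeds to see how much it varies.
    for seed in range(5):
        print(seed, naive_fd_gradient(0.5, seed=seed))
```

Even with a shared seed, the resampling step selects discrete particle indices, so a small change in `theta` can change which particles survive. In this toy model that makes the return a discontinuous function of `theta` for fixed random numbers, which is one way to see why dividing by a small `h` can inflate the variance of the estimate.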
Authors
Topics
Artificial Intelligence > Core AI > Agent Systems
Artificial Intelligence > Core AI > Planning
Machine Learning > Optimization & Theory > Stochastic Processes
Reinforcement Learning > Methods > Deep RL
Reinforcement Learning > Methods > Policy Learning
Mathematics & Optimization > Optimization > Stochastic Methods
Machine Learning > Learning Types > Reinforcement Learning
Machine Learning > Optimization & Theory > Stochastic Methods
Machine Learning > Learning Types > Exploration