2010
NIPS
NeurIPS 2010
Policy gradients in linearly-solvable MDPs
Abstract
We present policy gradient results within the framework of linearly-solvable MDPs. For the first time, compatible function approximators and natural policy gradients are obtained by estimating the cost-to-go function, rather than the (much larger) state-action advantage function as is necessary in traditional MDPs. We also develop the first compatible function approximators and natural policy gradients for continuous-time stochastic systems.
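As background for what "linearly-solvable" and "cost-to-go function" mean here, the following is a minimal sketch of the general first-exit LMDP formulation. It is not the policy gradient method of this paper. It assumes the standard definitions from the LMDP literature: passive dynamics `P`, state cost `q`, a control cost given by the KL divergence from the passive dynamics, and desirability `z = exp(-v)`. The chain example, the function names, and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def solve_first_exit_lmdp(P, q, terminal, iters=10_000, tol=1e-12):
    """Iterate the linear Bellman equation z = exp(-q) * (P @ z).

    P        : (n, n) passive transition matrix, rows sum to 1 (assumed)
    q        : (n,) state costs (assumed non-negative)
    terminal : (n,) boolean mask; at terminal states z = exp(-q)
    Returns the desirability z, from which the cost-to-go is v = -log z.
    """
    G = np.exp(-q)
    z = np.where(terminal, G, 1.0)
    for _ in range(iters):
        z_new = np.where(terminal, G, G * (P @ z))
        converged = np.max(np.abs(z_new - z)) < tol
        z = z_new
        if converged:
            break
    return z

def optimal_policy(P, z):
    """Optimal controlled transitions: u*(x'|x) proportional to p(x'|x) z(x')."""
    U = P * z[None, :]
    return U / U.sum(axis=1, keepdims=True)

# Hypothetical 5-state chain: unbiased random walk, rightmost state is the goal.
n = 5
P = np.zeros((n, n))
for i in range(n):
    P[i, max(i - 1, 0)] += 0.5
    P[i, min(i + 1, n - 1)] += 0.5

q = np.full(n, 0.1)   # small running cost at every non-goal state
q[-1] = 0.0           # no cost at the goal
terminal = np.zeros(n, dtype=bool)
terminal[-1] = True

z = solve_first_exit_lmdp(P, q, terminal)
v = -np.log(z)                # cost-to-go: one number per state
U = optimal_policy(P, z)      # policy recovered from v alone
```

In this sketch the policy is recovered from a vector with one entry per state. That is the sense in which the cost-to-go function is smaller than a state-action advantage function, which needs one entry per state-action pair.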