Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Generalization and Exploration via Randomized Value Functions
ICML 2016
Model-Free Imitation Learning with Policy Optimization
ICML 2016
Near Optimal Behavior via Approximate State Abstraction
ICML 2016
Safe Policy Improvement by Minimizing Robust Baseline Regret
NIPS 2016
Approximate Newton Methods for Policy Search in Markov Decision Processes
JMLR 2016
Regularized Policy Iteration with Nonparametric Function Spaces
JMLR 2016
Modelling Policies in MDPs in Reproducing Kernel Hilbert Space
AISTATS 2015
Grounding English Commands to Reward Functions
RSS 2015
Predictive Inverse Optimal Control for Linear-Quadratic-Gaussian Systems
AISTATS 2015
Learning Continuous Control Policies by Stochastic Value Gradients
NIPS 2015
Sample Complexity Bounds for Iterative Stochastic Policy Optimization
NIPS 2015
Direct Loss Minimization Inverse Optimal Control
RSS 2015
Sample Efficient Path Integral Control under Uncertainty
NIPS 2015
Policy Search for Multi-Robot Coordination under Uncertainty
RSS 2015
Inverse Reinforcement Learning with Locally Consistent Reward Functions
NIPS 2015
Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning
NIPS 2015
Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret
ICML 2015
High Confidence Policy Improvement
ICML 2015
A Comprehensive Survey on Safe Reinforcement Learning
JMLR 2015
Counterfactual Risk Minimization: Learning from Logged Bandit Feedback
ICML 2015
Non-Stationary Approximate Modified Policy Iteration
ICML 2015
Policy Gradient for Coherent Risk Measures
NIPS 2015
Thompson Sampling for Learning Parameterized Markov Decision Processes
COLT 2015
Learning to Track: Online Multi-Object Tracking by Decision Making
ICCV 2015
Trust Region Policy Optimization
ICML 2015