Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
Integrated Perception and Planning in the Continuous Space: A POMDP Approach
RSS 2013
Counterfactual Reasoning and Learning Systems: The Example of Computational Advertising
JMLR 2013
Optimistic policy iteration and natural actor-critic: A unifying view and a non-optimality result
NIPS 2013
Online learning in episodic Markovian decision processes by relative entropy policy search
NIPS 2013
Approximate Dynamic Programming Finally Performs Well in the Game of Tetris
NIPS 2013
Bellman Error Based Feature Generation using Random Projections on Sparse Spaces
NIPS 2013
Efficient Exploration and Value Function Generalization in Deterministic Systems
NIPS 2013
Policy Shaping: Integrating Human Feedback with Reinforcement Learning
NIPS 2013
Learning Adaptive Value of Information for Structured Prediction
NIPS 2013
Aggregating Optimistic Planning Trees for Solving Markov Decision Processes
NIPS 2013
Weighted Likelihood Policy Search with Model Selection
NIPS 2012
Hierarchical Optimistic Region Selection driven by Curiosity
NIPS 2012
Cost-Sensitive Exploration in Bayesian Reinforcement Learning
NIPS 2012
Analysis of Thompson Sampling for the Multi-armed Bandit Problem
COLT 2012
A Stochastic Bandit Algorithm for Scratch Games
ACML 2012
Tendon-Driven Variable Impedance Control Using Reinforcement Learning
RSS 2012
Real-Time Inverse Dynamics Learning for Musculoskeletal Robots based on Echo State Gaussian Process Regression
RSS 2012
On Stochastic Optimal Control and Reinforcement Learning by Approximate Inference
RSS 2012
Variational Bayesian Optimization for Runtime Risk-Sensitive Control
RSS 2012
The Best of Both Worlds: Stochastic and Adversarial Bandits
COLT 2012
Autonomous Exploration For Navigating In MDPs
COLT 2012
Risk Aversion in Markov Decision Processes via Near Optimal Chernoff Bounds
NIPS 2012
Robustness and risk-sensitivity in Markov decision processes
NIPS 2012
Tractable Objectives for Robust Policy Optimization
NIPS 2012
Sketch-Based Linear Value Function Approximation
NIPS 2012