Research Explorer
Reinforcement Learning
2932 directly classified papers
Papers per year
2003: 1
2006: 11
2007: 18
2008: 23
2009: 14
2010: 22
2011: 24
2012: 34
2013: 26
2014: 24
2015: 14
2016: 23
2017: 79
2018: 182
2019: 255
2020: 284
2021: 333
2022: 319
2023: 315
2024: 457
2025: 419
2026: 55
Papers
Multi-Step Dyna Planning for Policy Evaluation and Control
NIPS 2009
Robust Value Function Approximation Using Bilinear Programming
NIPS 2009
Monte Carlo Sampling for Regret Minimization in Extensive Games
NIPS 2009
Learning to Explore and Exploit in POMDPs
NIPS 2009
Fitted Q-iteration by Advantage Weighted Regression
NIPS 2008
Bounding Performance Loss in Approximate MDP Homomorphisms
NIPS 2008
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
NIPS 2008
Learning to Manipulate Articulated Objects in Unstructured Environments Using a Grounded Relational Representation
RSS 2008
A computational model of hippocampal function in trace conditioning
NIPS 2008
SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces
RSS 2008
Optimization on a Budget: A Reinforcement Learning Approach
NIPS 2008
Regularized Policy Iteration
NIPS 2008
Structure Learning in Human Sequential Decision-Making
NIPS 2008
Psychiatry: Insights into depression through normative decision-making models
NIPS 2008
Multi-resolution Exploration in Continuous Spaces
NIPS 2008
Hebbian Learning of Bayes Optimal Decisions
NIPS 2008
Biasing Approximate Dynamic Programming with a Lower Discount Factor
NIPS 2008
MDPs with Non-Deterministic Policies
NIPS 2008
Particle Filter-based Policy Gradient in POMDPs
NIPS 2008
Skill Characterization Based on Betweenness
NIPS 2008
Value Function Approximation using Multiple Aggregation for Multiattribute Resource Management
JMLR 2008
On the asymptotic equivalence between differential Hebbian and temporal difference learning using a local third factor
NIPS 2008
Bridging the Gap of Abstraction for Probabilistic Decision Making on a Multi-Modal Service Robot
RSS 2008
Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning
NIPS 2008
A Convergent $O(n)$ Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation
NIPS 2008