Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
NIPS 2008
SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces
RSS 2008
Learning to Use Working Memory in Partially Observable Environments through Dopaminergic Reinforcement
NIPS 2008
Bounding Performance Loss in Approximate MDP Homomorphisms
NIPS 2008
Near-optimal Regret Bounds for Reinforcement Learning
NIPS 2008
Optimization on a Budget: A Reinforcement Learning Approach
NIPS 2008
Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning
NIPS 2008
Particle Filter-based Policy Gradient in POMDPs
NIPS 2008
Biasing Approximate Dynamic Programming with a Lower Discount Factor
NIPS 2008
Structure Learning in Human Sequential Decision-Making
NIPS 2008
Multi-resolution Exploration in Continuous Spaces
NIPS 2008
A Convergent $O(n)$ Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation
NIPS 2008
Temporal Difference Updating without a Learning Rate
NIPS 2007
Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods
NIPS 2007
Scan Strategies for Meteorological Radars
NIPS 2007
Random Sampling of States in Dynamic Programming
NIPS 2007
Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning
NIPS 2007
Stable Dual Dynamic Programming
NIPS 2007
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
NIPS 2007
Exponential Family Predictive Representations of State
NIPS 2007
Online Linear Regression and Its Application to Model-Based Reinforcement Learning
NIPS 2007
Receding Horizon Differential Dynamic Programming
NIPS 2007
Fitted Q-iteration in continuous action-space MDPs
NIPS 2007
Incremental Natural Actor-Critic Algorithms
NIPS 2007
What makes some POMDP problems easy to approximate?
NIPS 2007