Research Explorer
Offline RL
725 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 1
2012: 2
2014: 3
2015: 2
2016: 6
2017: 4
2018: 8
2019: 29
2020: 60
2021: 105
2022: 129
2023: 187
2024: 126
2025: 37
2026: 22
Papers
Confident Least Square Value Iteration with Local Access to a Simulator
AISTATS 2022
Marginalized Operators for Off-policy Reinforcement Learning
AISTATS 2022
Enhanced Meta Reinforcement Learning via Demonstrations in Sparse Reward Environments
NIPS 2022
LobsDICE: Offline Learning from Observation via Stationary Distribution Correction Estimation
NIPS 2022
Supervised Off-Policy Ranking
ICML 2022
On Well-posedness and Minimax Optimal Rates of Nonparametric Q-function Estimation in Off-policy Evaluation
ICML 2022
Exploit Reward Shifting in Value-Based Deep-RL: Optimistic Curiosity-Based Exploration and Conservative Exploitation via Linear Reward Shaping
NIPS 2022
Off-Policy Risk Assessment for Markov Decision Processes
AISTATS 2022
Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets
ICML 2022
Skills Regularized Task Decomposition for Multi-task Offline Reinforcement Learning
NIPS 2022
On the role of overparameterization in off-policy Temporal Difference learning with linear function approximation
NIPS 2022
A Policy-Guided Imitation Approach for Offline Reinforcement Learning
NIPS 2022
Stochastic Zeroth-Order Optimization under Nonstationarity and Nonconvexity
JMLR 2022
A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning
JMLR 2022
Truncated Emphatic Temporal Difference Methods for Prediction and Control
JMLR 2022
Online Decision Transformer
ICML 2022
Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory
ICML 2022
Robust Task Representations for Offline Meta-Reinforcement Learning via Contrastive Learning
ICML 2022
Scalable and Robust Self-Learning for Skill Routing in Large-Scale Conversational AI Systems
NAACL 2022
Off-Policy Evaluation for Episodic Partially Observable Markov Decision Processes under Non-Parametric Models
NIPS 2022
How to Leverage Unlabeled Data in Offline Reinforcement Learning
ICML 2022
Regularizing a Model-based Policy Stationary Distribution to Stabilize Offline Reinforcement Learning
ICML 2022
DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning
NIPS 2022
Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination
NIPS 2022
Discriminator-Weighted Offline Imitation Learning from Suboptimal Demonstrations
ICML 2022