Offline RL
725 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 1
2012: 2
2014: 3
2015: 2
2016: 6
2017: 4
2018: 8
2019: 29
2020: 60
2021: 105
2022: 129
2023: 187
2024: 126
2025: 37
2026: 22
Papers
Distributionally Robust Model-Based Offline Reinforcement Learning with Near-Optimal Sample Complexity
JMLR 2024
WPO: Enhancing RLHF with Weighted Preference Optimization
EMNLP 2024
Occupancy-based Policy Gradient: Estimation, Convergence, and Optimality
NIPS 2024
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
NIPS 2024
A Tractable Inference Perspective of Offline RL
NIPS 2024
Iteratively Refined Behavior Regularization for Offline Reinforcement Learning
NIPS 2024
A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning
AISTATS 2024
Don’t Forget Your Reward Values: Language Model Alignment via Value-based Calibration
EMNLP 2024
Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning
NIPS 2024
Multi-Agent Domain Calibration with a Handful of Offline Data
NIPS 2024
Q-Distribution guided Q-learning for offline reinforcement learning: Uncertainty penalized Q-value via consistency model
NIPS 2024
Offline Reinforcement Learning with OOD State Correction and OOD Action Suppression
NIPS 2024
Adaptive $Q$-Aid for Conditional Supervised Learning in Offline Reinforcement Learning
NIPS 2024
Is Mamba Compatible with Trajectory Optimization in Offline Reinforcement Learning?
NIPS 2024
Minimax Optimal and Computationally Efficient Algorithms for Distributionally Robust Offline Reinforcement Learning
NIPS 2024
Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning (Abstract Reprint)
AAAI 2024
The Value of Reward Lookahead in Reinforcement Learning
NIPS 2024
Doubly Mild Generalization for Offline Reinforcement Learning
NIPS 2024
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear $q^\pi$-Realizability and Concentrability
NIPS 2024
Scaling Offline Evaluation of Reinforcement Learning Agents through Abstraction
AAAI 2024
Simplifying Complex Observation Models in Continuous POMDP Planning with Probabilistic Guarantees and Practice
AAAI 2024
RL in Latent MDPs is Tractable: Online Guarantees via Off-Policy Evaluation
NIPS 2024
Probabilistic Offline Policy Ranking with Approximate Bayesian Computation
AAAI 2024
Diffusion Policies Creating a Trust Region for Offline Reinforcement Learning
NIPS 2024
Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning
NIPS 2024