Research Explorer
Offline RL
725 directly classified papers
Papers per year
2007: 2
2008: 1
2009: 1
2011: 1
2012: 2
2014: 3
2015: 2
2016: 6
2017: 4
2018: 8
2019: 29
2020: 60
2021: 105
2022: 129
2023: 187
2024: 126
2025: 37
2026: 22
Papers
MetaTrader: Learning to Generalize RL Trading Policies Beyond Offline Data
AAAI 2026
Offline Multi-Objective Bandits: From Logged Data to Pareto-Optimal Policies
AAAI 2026
Reliability-Guaranteed and Reward-Seeking Sequence Modeling for Model-Based Offline Reinforcement Learning
AAAI 2026
One-Step Generative Policies with Q-Learning: A Reformulation of MeanFlow
AAAI 2026
Meta-Normalizing Flow for Data-Limited Offline Meta-Reinforcement Learning (Student Abstract)
AAAI 2026
On the Exponential Convergence for Offline RLHF with Pairwise Comparisons
AAAI 2026
Balancing Signal and Variance: Adaptive Offline RL Post-Training for VLA Flow Models
AAAI 2026
Behaviour Policy Optimization: Provably Lower Variance Return Estimates for Off-Policy Reinforcement Learning
AAAI 2026
Partial Action Replacement: Tackling Distribution Shift in Offline MARL
AAAI 2026
Soft Conflict-Resolution Decision Transformer for Offline Multi-Task Reinforcement Learning
AAAI 2026
Advancing Safe Mechanical Ventilation Using Offline RL with Hybrid Actions and Clinically Aligned Rewards
AAAI 2026
Human-in-the-Loop Bandwidth Estimation for Quality of Experience Optimization in Real-Time Video Communication
AAAI 2026
Trajectory Tactics: When Transformers Learn Exploration to Generate Online Signature
WACV 2026
Enhancing Robustness of Offline Reinforcement Learning Under Data Corruption via Sharpness-Aware Minimization (Student Abstract)
AAAI 2026
UNO! UNified Offline Training Paradigm for Learning Path Recommendation
AAAI 2026
Treatment Stitching with Schrödinger Bridge for Enhancing Offline Reinforcement Learning in Adaptive Treatment Strategies
AAAI 2026
SafeMIL: Learning Offline Safe Imitation Policy from Non-Preferred Trajectories
AAAI 2026
Benchmarking Reinforcement Learning Algorithms for ICU Ventilator Settings: An Interpretable and Probabilistic Patient Environment for Doctor Agents
AAAI 2026
Enhancing Diffusion Policies with Distribution-Matching Generator in Offline Reinforcement Learning
AAAI 2026
Variational OOD State Correction for Offline Reinforcement Learning
AAAI 2026
State Proficiency-Based Adaptive Fine-Tuning for Offline-to-Online Reinforcement Learning
AAAI 2026
Offline Meta-Reinforcement Learning with Flow-Based Task Inference and Adaptive Correction of Feature Overgeneralization
AAAI 2026
Evaluation of Active Feature Acquisition Methods for Time-varying Feature Settings
JMLR 2025
Direct Value Optimization: Improving Chain-of-Thought Reasoning in LLMs with Refined Values
EMNLP 2025
Cooperative Policy Agreement: Learning Diverse Policy for Offline MARL
AAAI 2025