Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
V-Pruner: A Fast and Globally-informed Token Pruning Framework for Vision Transformer
AAAI 2026
PQDA: Policy-Aligned Q-Consistency Meets Decoupled Augmentation for Generalizable Visual RL
AAAI 2026
Achieving Equilibrium Under Utility Heterogeneity: An Agent-Attention Framework for Multi-Agent Multi-Objective Reinforcement Learning
AAAI 2026
HCPO: Hierarchical Conductor-Based Policy Optimization in Multi-Agent Reinforcement Learning
AAAI 2026
AcoustoReinforce: Multi-Particle Acoustophoretic Path Planning with Deep Reinforcement Learning
AAAI 2026
Test-Time Reinforcement Learning for GUI Grounding via Region Consistency
AAAI 2026
Reward Redistribution via Gaussian Process Likelihood Estimation
AAAI 2026
A Unified Self-Regulating Training Framework for Federated Deep Reinforcement Learning
AAAI 2026
Deep Reinforcement Learning for Scalable Offline Three-Dimensional Packing
AAAI 2026
MetaTrader: Learning to Generalize RL Trading Policies Beyond Offline Data
AAAI 2026
FDC-Ground: Improving GRPO for GUI Grounding via Exponential Rewards and Fact-Aligned Pruning
AAAI 2026
Planning in Branch-and-Bound: Model-Based Reinforcement Learning for Exact Combinatorial Optimization
AAAI 2026
Beyond Training-time Poisoning: Component-level and Post-training Backdoors in Deep Reinforcement Learning
AAAI 2026
Beyond Single-Speed Reasoning: Coordinating Fast and Slow Dynamics for Efficient World Modeling
AAAI 2026
Latent State-Predictive Exploration for Deep Reinforcement Learning
AAAI 2026
Explore to Learn: Latent Exploration Through Disentangled Synergy Patterns for Reinforcement Learning in Overactuated Control
AAAI 2026
DSAP: Enhancing Generalization in Goal-Conditioned Reinforcement Learning
AAAI 2026
One-Step Generative Policies with Q-Learning: A Reformulation of MeanFlow
AAAI 2026
CHDP: Cooperative Hybrid Diffusion Policies for Reinforcement Learning in Parameterized Action Space
AAAI 2026
Perceiving the Knowledge Boundary: Uncertainty-Guided Exploration and Imagination for World Models
AAAI 2026
Context-Sensitive Abstractions for Reinforcement Learning with Parameterized Actions
AAAI 2026
Do It for HER: First-Order Temporal Logic Reward Specification in Reinforcement Learning
AAAI 2026
Reliability-Guaranteed and Reward-Seeking Sequence Modeling for Model-Based Offline Reinforcement Learning
AAAI 2026
Enhancing Diffusion Policies with Distribution-Matching Generator in Offline Reinforcement Learning
AAAI 2026
Policy Zooming: Adaptive Discretization-based Infinite-Horizon Average-Reward Reinforcement Learning
AAAI 2026