Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
Threshold UCT: Cost-Constrained Monte Carlo Tree Search with Pareto Curves
AAAI 2025
Bootstrapped Reward Shaping
AAAI 2025
The Distributional Reward Critic Framework for Reinforcement Learning Under Perturbed Rewards
AAAI 2025
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors
AAAI 2025
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
ACL 2025
Deep Implicit Imitation Reinforcement Learning in Heterogeneous Action Settings
AAAI 2025
FedAA: A Reinforcement Learning Perspective on Adaptive Aggregation for Fair and Robust Federated Learning
AAAI 2025
Probabilistic Shielding for Safe Reinforcement Learning
AAAI 2025
DUO: Diverse, Uncertain, On-Policy Query Generation and Selection for Reinforcement Learning from Human Feedback
AAAI 2025
GLAM: Global-Local Variation Awareness in Mamba-based World Model
AAAI 2025
Highly Parallelized Reinforcement Learning Training with Relaxed Assignment Dependencies
AAAI 2025
Enhancing AMR Parsing with Group Relative Policy Optimization
ACL 2025
Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL
ACL 2025
Efficient Reinforcement Learning in Probabilistic Reward Machines
AAAI 2025
Query-efficient Attack for Black-box Image Inpainting Forensics via Reinforcement Learning
AAAI 2025
Reducing AUV Energy Consumption Through Dynamic Sensor Directions Switching via Deep Reinforcement Learning
AAAI 2025
Removing Prompt-template Bias in Reinforcement Learning from Human Feedback
ACL 2025
Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning
AAAI 2025
Noise-Resilient Symbolic Regression with Dynamic Gating Reinforcement Learning
AAAI 2025
Epistemic Bellman Operators
AAAI 2025
SMoSE: Sparse Mixture of Shallow Experts for Interpretable Reinforcement Learning in Continuous Control Tasks
AAAI 2025
Adversarial Preference Learning for Robust LLM Alignment
ACL 2025
Intelligent OPC Engineer Assistant for Semiconductor Manufacturing
AAAI 2025
Learning Joint Behaviors with Large Variations
AAAI 2025
Partially Observable Reference Policy Programming
IJCAI 2025