Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Mutant: Learning Congestion Control from Existing Protocols via Online Reinforcement Learning
NSDI 2025
GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks
CVPR 2025
Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging
CVPR 2025
ViUniT: Visual Unit Tests for More Robust Visual Programming
CVPR 2025
Automated Proof of Polynomial Inequalities via Reinforcement Learning
CVPR 2025
Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation
CVPR 2025
KERLQA: Knowledge-Enhanced Reinforcement Learning for Question Answering in Low-resource Languages
IJCNLP 2025
Minority-Aware Satisfaction Estimation in Dialogue Systems via Preference-Adaptive Reinforcement Learning
IJCNLP 2025
Risk-averse Total-reward MDPs with ERM and EVaR
AAAI 2025
Defending Against Sophisticated Poisoning Attacks with RL-based Aggregation in Federated Learning
AAAI 2025
Maximum Entropy Softmax Policy Gradient via Entropy Advantage Estimation
IJCAI 2025
SPoRt - Safe Policy Ratio: Certified Training and Deployment of Task Policies in Model-Free RL
IJCAI 2025
S-EPOA: Overcoming the Indistinguishability of Segments with Skill-Driven Preference-Based Reinforcement Learning
IJCAI 2025
Rule-Guided Reinforcement Learning Policy Evaluation and Improvement
IJCAI 2025
Preference-based Deep Reinforcement Learning for Historical Route Estimation
IJCAI 2025
Imitation Learning via Focused Satisficing
IJCAI 2025
ASP-Driven Emergency Planning for Norm Violations in Reinforcement Learning
AAAI 2025
Bootstrapped Reward Shaping
AAAI 2025
Deep Implicit Imitation Reinforcement Learning in Heterogeneous Action Settings
AAAI 2025
Pareto Set Learning for Multi-Objective Reinforcement Learning
AAAI 2025
Efficient Multi-Policy Evaluation for Reinforcement Learning
AAAI 2025
On Corruption-Robustness in Performative Reinforcement Learning
AAAI 2025
Decoupled Policy Actor-Critic: Bridging Pessimism and Risk Awareness in Reinforcement Learning
AAAI 2025
Discovering Options That Minimize Average Planning Time
AAAI 2025
Indirect Online Preference Optimization via Reinforcement Learning
IJCAI 2025