Research Explorer
Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
HCPO: Hierarchical Conductor-Based Policy Optimization in Multi-Agent Reinforcement Learning
AAAI 2026
Policy Newton Methods for Distortion Riskmetrics
AAAI 2026
T4NMTD: Transition-Centric Reinforcement Learning for Non-Markovian Task Decomposition
AAAI 2026
Stabilizing Policy Gradient Methods via Reward Profiling
AAAI 2026
Client Selection for Federated Policy Optimization with Environment Heterogeneity
JMLR 2025
Convergence and Sample Complexity of Natural Policy Gradient Primal-Dual Methods for Constrained MDPs
JMLR 2025
Multi-Teacher Knowledge Distillation with Reinforcement Learning for Visual Recognition
AAAI 2025
Score-Aware Policy-Gradient and Performance Guarantees using Local Lyapunov Stability
JMLR 2025
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
JMLR 2025
On the Convergence of Projected Policy Gradient for Any Constant Step Sizes
JMLR 2025
A Deployed Online Reinforcement Learning Algorithm in an Oral Health Clinical Trial
AAAI 2025
The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise
JMLR 2025
Simple Policy Optimization
ICML 2025
Continuously evolving rewards in an open-ended environment
JMLR 2025
Logarithmic Regret for Linear Markov Decision Processes with Adversarial Corruptions
AAAI 2025
Semi-Markovian Planning to Coordinate Aerial and Maritime Medical Evacuation Platforms
AAAI 2025
Online MDP with Prototypes Information: A Robust Adaptive Approach
AAAI 2025
Statistical field theory for Markov decision processes under uncertainty
JMLR 2025
On-Policy Algorithms for Continual Reinforcement Learning (Student Abstract)
AAAI 2025
ModelDiff: Symbolic Dynamic Programming for Model-Aware Policy Transfer in Deep Q-Learning
AAAI 2025
Defending Against Sophisticated Poisoning Attacks with RL-based Aggregation in Federated Learning
AAAI 2025
Leveraging Human Input to Enable Robust, Interactive, and Aligned AI Systems
AAAI 2025
Representation-driven Option Discovery in Reinforcement Learning
AAAI 2025
The POWER of Ikigai: Optimizing Life Fulfillment with an Integrated User Simulator and Adaptive Hobby Recommender
AAAI 2025
Formally Verified Approximate Policy Iteration
AAAI 2025