Policy Learning
2068 directly classified papers
Papers per year
2002: 6
2003: 1
2004: 1
2006: 11
2007: 10
2008: 14
2009: 9
2010: 23
2011: 15
2012: 25
2013: 25
2014: 24
2015: 23
2016: 27
2017: 61
2018: 107
2019: 187
2020: 216
2021: 274
2022: 259
2023: 321
2024: 247
2025: 153
2026: 29
Papers
Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment
NIPS 2024
Aligner²: Enhancing Joint Multiple Intent Detection and Slot Filling via Adjustive and Forced Cross-Task Alignment
AAAI 2024
Meta-Inverse Reinforcement Learning for Mean Field Games via Probabilistic Context Variables
AAAI 2024
A theoretical case-study of Scalable Oversight in Hierarchical Reinforcement Learning
NIPS 2024
TRIP NEGOTIATOR: A Travel Persona-aware Reinforced Dialogue Generation Model for Personalized Integrative Negotiation in Tourism
EMNLP 2024
Automated Multi-level Preference for MLLMs
NIPS 2024
Peer Learning: Learning Complex Policies in Groups from Scratch via Action Recommendations
AAAI 2024
Beyond Expected Return: Accounting for Policy Reproducibility When Evaluating Reinforcement Learning Algorithms
AAAI 2024
Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning
NIPS 2024
DGPO: Discovering Multiple Strategies with Diversity-Guided Policy Optimization
AAAI 2024
Towards the Transferability of Rewards Recovered via Regularized Inverse Reinforcement Learning
NIPS 2024
Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees (Abstract Reprint)
AAAI 2024
Implicit Curriculum in Procgen Made Explicit
NIPS 2024
P2BPO: Permeable Penalty Barrier-Based Policy Optimization for Safe RL
AAAI 2024
Policy Optimization for Robust Average Reward MDPs
NIPS 2024
Regret Analysis of Policy Gradient Algorithm for Infinite Horizon Average Reward Markov Decision Processes
AAAI 2024
Enhancing Efficiency of Safe Reinforcement Learning via Sample Manipulation
NIPS 2024
Recurrent Reinforcement Learning with Memoroids
NIPS 2024
Imitate the Good and Avoid the Bad: An Incremental Approach to Safe Reinforcement Learning
AAAI 2024
A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning
AISTATS 2024
E2CL: Exploration-based Error Correction Learning for Embodied Agents
EMNLP 2024
Policy-shaped prediction: avoiding distractions in model-based reinforcement learning
NIPS 2024
Rewarding What Matters: Step-by-Step Reinforcement Learning for Task-Oriented Dialogue
EMNLP 2024
Robust Reinforcement Learning with General Utility
NIPS 2024
Near-Optimal Policy Optimization for Correlated Equilibrium in General-Sum Markov Games
AISTATS 2024