Research Explorer
Reinforcement Learning
1263 directly classified papers
Papers per year
2006: 1
2007: 2
2008: 3
2009: 2
2010: 1
2011: 2
2012: 3
2013: 2
2014: 3
2015: 2
2016: 8
2017: 44
2018: 95
2019: 134
2020: 123
2021: 131
2022: 143
2023: 127
2024: 194
2025: 240
2026: 3
Papers
Multi-Edge Reinforced Collaborative Data Acquisition for Continuous Video Analytics by Prioritizing Quality over Quantity
AAAI 2025
GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching
AAAI 2025
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond
ACL 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
ICCV 2025
Dense Policy: Bidirectional Autoregressive Learning of Actions
ICCV 2025
World Models with Hints of Large Language Models for Goal Achieving
NAACL 2025
When2Call: When (not) to Call Tools
NAACL 2025
MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time
NAACL 2025
A Practical Analysis of Human Alignment with *PO
NAACL 2025
Understanding Reference Policies in Direct Preference Optimization
NAACL 2025
Do LLMs Need Inherent Reasoning Before Reinforcement Learning? A Study in Korean Self-Correction
AACL 2025
RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors
AAAI 2025
Teaching Models to Improve on Tape
AAAI 2025
Neural Combinatorial Clustered Bandits for Recommendation Systems
AAAI 2025
Sparse Rewards Can Self-Train Dialogue Agents
ACL 2025
Diffusion Guided Adaptive Augmentation for Generalization in Visual Reinforcement Learning
ICCV 2025
VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers
ICCV 2025
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
ICCV 2025
RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation
ICCV 2025
The Distributional Reward Critic Framework for Reinforcement Learning Under Perturbed Rewards
AAAI 2025
Dialogue Systems for Emotional Support via Value Reinforcement
ACL 2025
Towards Efficient Collaboration via Graph Modeling in Reinforcement Learning
AAAI 2025
RLHF Algorithms Ranked: An Extensive Evaluation Across Diverse Tasks, Rewards, and Hyperparameters
EMNLP 2025
Learned Perceptive Forward Dynamics Model for Safe and Platform-aware Robotic Navigation
RSS 2025
Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments
CVPR 2025