Research Explorer
Reinforcement Learning
1263 directly classified papers
Papers per year
2006: 1
2007: 2
2008: 3
2009: 2
2010: 1
2011: 2
2012: 3
2013: 2
2014: 3
2015: 2
2016: 8
2017: 44
2018: 95
2019: 134
2020: 123
2021: 131
2022: 143
2023: 127
2024: 194
2025: 240
2026: 3
Papers
VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers
ICCV 2025
A Survey of Post-Training Scaling in Large Language Models
ACL 2025
CARMO: Dynamic Criteria Generation for Context Aware Reward Modelling
ACL 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
ICCV 2025
Dense Policy: Bidirectional Autoregressive Learning of Actions
ICCV 2025
Visual-RFT: Visual Reinforcement Fine-Tuning
ICCV 2025
MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization
ICCV 2025
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
ICCV 2025
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
ACL 2025
RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation
ICCV 2025
Diffusion Guided Adaptive Augmentation for Generalization in Visual Reinforcement Learning
ICCV 2025
Dialogue Systems for Emotional Support via Value Reinforcement
ACL 2025
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond
ACL 2025
VCA: Video Curious Agent for Long Video Understanding
ICCV 2025
R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory
ACL 2025
Overview of the BioLaySumm 2025 Shared Task on Lay Summarization of Biomedical Research Articles and Radiology Reports
ACL 2025
Dynamic Collaboration of Multi-Language Models based on Minimal Complete Semantic Units
EMNLP 2025
Multi-Edge Reinforced Collaborative Data Acquisition for Continuous Video Analytics by Prioritizing Quality over Quantity
AAAI 2025
Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch
ACL 2025
Sparse Rewards Can Self-Train Dialogue Agents
ACL 2025
Removing Prompt-template Bias in Reinforcement Learning from Human Feedback
ACL 2025
DiaLLMs: EHR-Enhanced Clinical Conversational System for Clinical Test Recommendation and Diagnosis Prediction
ACL 2025
LookAlike: Consistent Distractor Generation in Math MCQs
ACL 2025
Learned Perceptive Forward Dynamics Model for Safe and Platform-aware Robotic Navigation
RSS 2025
YNU-HPCC at SemEval-2025 Task 2: Local Cache and Online Retrieval-Based method for Entity-Aware Machine Translation
ACL 2025