Research Explorer
Deep RL
3861 directly classified papers
Papers per year
2005: 1
2006: 9
2007: 14
2008: 15
2009: 9
2010: 21
2011: 27
2012: 32
2013: 21
2014: 17
2015: 10
2016: 33
2017: 102
2018: 222
2019: 399
2020: 450
2021: 533
2022: 478
2023: 532
2024: 513
2025: 326
2026: 97
Papers
Efficient Integration of External Knowledge to LLM-based World Models via Retrieval-Augmented Generation and Reinforcement Learning
EMNLP 2025
AgentThink: A Unified Framework for Tool-Augmented Chain-of-Thought Reasoning in Vision-Language Models for Autonomous Driving
EMNLP 2025
CoTD-PO: Chain-of-Thought Distillation with Preference Optimization
EMNLP 2025
Vid2Sim: Realistic and Interactive Simulation from Video for Urban Navigation
CVPR 2025
RAVEN++: Pinpointing Fine-Grained Violations in Advertisement Videos with Active Reinforcement Reasoning
EMNLP 2025
Reasoning in Visual Navigation of End-to-end Trained Agents: A Dynamical Systems Approach
CVPR 2025
AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward
CVPR 2025
Plug-and-Play PPO: An Adaptive Point Prompt Optimizer Making SAM Greater
CVPR 2025
Stabilizing and Accelerating Autofocus with Expert Trajectory Regularized Deep Reinforcement Learning
CVPR 2025
AdaptiveAE: An Adaptive Exposure Strategy for HDR Capturing in Dynamic Scenes
ICCV 2025
ERFSL: An Efficient Reward Function Searcher via Large Language Models for Custom-Environment Multi-Objective Reinforcement Learning (Student Abstract)
AAAI 2025
RL-RC-DoT: A Block-level RL agent for Task-Aware Video Compression
CVPR 2025
RaSS: Improving Denoising Diffusion Samplers with Reinforced Active Sampling Scheduler
CVPR 2025
Neural Motion Simulator Pushing the Limit of World Models in Reinforcement Learning
CVPR 2025
PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement
ICCV 2025
IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation
ICCV 2025
VideoMiner: Iteratively Grounding Key Frames of Hour-Long Videos via Tree-based Group Relative Policy Optimization
ICCV 2025
DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data
EMNLP 2025
Q-PRM: Adaptive Query Rewriting for Retrieval-Augmented Generation via Step-level Process Supervision
EMNLP 2025
Legal Mathematical Reasoning with LLMs: Procedural Alignment through Two-Stage Reinforcement Learning
EMNLP 2025
Speaking at the Right Level: Literacy-Controlled Counterspeech Generation with RAG-RL
EMNLP 2025
Reward Mixology: Crafting Hybrid Signals for Reinforcement Learning Driven In-Context Learning
EMNLP 2025
REAR: Reinforced Reasoning Optimization for Event Argument Extraction with Relation-Aware Support
EMNLP 2025
Token-level Proximal Policy Optimization for Query Generation
EMNLP 2025
Datasets and Recipes for Video Temporal Grounding via Reinforcement Learning
EMNLP 2025