Reinforcement Learning
1263 directly classified papers
Papers per year
2006: 1
2007: 2
2008: 3
2009: 2
2010: 1
2011: 2
2012: 3
2013: 2
2014: 3
2015: 2
2016: 8
2017: 44
2018: 95
2019: 134
2020: 123
2021: 131
2022: 143
2023: 127
2024: 194
2025: 240
2026: 3
Papers
Trajectory Tactics: When Transformers Learn Exploration to Generate Online Signature
WACV 2026
Hestia: Voxel-Face-Aware Hierarchical Next-Best-View Acquisition for Efficient 3D Reconstruction
WACV 2026
Reinforcement Learning-based Adaptive Control of Classifier-Free Guidance and Timestep Embeddings in Diffusion Models
WACV 2026
VCA: Video Curious Agent for Long Video Understanding
ICCV 2025
RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation
ICCV 2025
A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
ICCV 2025
Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences
ICCV 2025
World Models with Hints of Large Language Models for Goal Achieving
NAACL 2025
Visual-RFT: Visual Reinforcement Fine-Tuning
ICCV 2025
Do LLMs Need Inherent Reasoning Before Reinforcement Learning? A Study in Korean Self-Correction
AACL 2025
Diffusion Guided Adaptive Augmentation for Generalization in Visual Reinforcement Learning
ICCV 2025
When2Call: When (not) to Call Tools
NAACL 2025
A Practical Analysis of Human Alignment with *PO
NAACL 2025
Finite Expression Method for Solving High-Dimensional Partial Differential Equations
JMLR 2025
Learned Perceptive Forward Dynamics Model for Safe and Platform-aware Robotic Navigation
RSS 2025
Token-Level Accept or Reject: A Micro Alignment Approach for Large Language Models
IJCAI 2025
MOOSS: Mask-Enhanced Temporal Contrastive Learning for Smooth State Evolution in Visual Reinforcement Learning
WACV 2025
ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model
WACV 2025
Dense Policy: Bidirectional Autoregressive Learning of Actions
ICCV 2025
MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization
ICCV 2025
RMultiplex200K: Toward Reliable Multimodal Process Supervision for Visual Language Models on Telecommunications
ICCV 2025
VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers
ICCV 2025
MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time
NAACL 2025
Understanding Reference Policies in Direct Preference Optimization
NAACL 2025
GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching
AAAI 2025