Reinforcement Learning

2932 directly classified papers

Papers

MARS: Multimodal Adaptive Reasoning Model for Avoiding Overthinking AAAI 2026

LLM-CAS: Dynamic Neuron Perturbation for Real-Time Hallucination Correction AAAI 2026

AI-Salesman: Towards Reliable Large Language Model Driven Telemarketing AAAI 2026

Reasoning via Implicit Self-supervised Emergence for Instruction Segmentation AAAI 2026

QueryGym: Step-by-Step Interaction with Relational Databases AAAI 2026

FAST-EQA: Efficient Embodied Question Answering with Global and Local Region Relevancy WACV 2026

ST-Think: How Multimodal Large Language Models Reason About 4D Worlds from Ego-Centric Videos WACV 2026

Reflect, Rewrite, Repeat: How Simple Arithmetic Enables Advanced Reasoning in Small Language Models EACL 2026

Think Just Enough: Leveraging Self-Assessed Confidence for Adaptive Reasoning in Language Models EACL 2026

ReaSon: Reinforced Causal Search with Information Bottleneck for Video Understanding AAAI 2026

Vision-G1: Towards General Reasoning Vision-Language Models via Reinforcement Learning AAAI 2026

USPR: Learning a Unified Solver for Profiled Routing AAAI 2026

KOALA: Knowledge of Optimization and Learning Algorithms for Healthcare AAAI 2026

When Eyes and Ears Disagree: Can MLLMs Discern Audio-Visual Confusion? AAAI 2026

LENS: Learning to Segment Anything with Unified Reinforced Reasoning AAAI 2026

Knowledge-Enhanced Image Captioning with Adaptive Graph-based Multimodal Alignment and LLM AAAI 2026

Think Wise, Collaborate Effectively: A Rationale-Aware LLM-Based Recommender with Reinforcement Learning from Collaborative Signals AAAI 2026

SHADOW: Dynamic-Aware Credit Assignment Against Long-Horizon Tasks AAAI 2026

Prototype Entropy Alignment: Reinforcing Structured Uncertainty in LLM Reasoning AAAI 2026

Toward Better EHR Reasoning in LLMs: Reinforcement Learning with Expert Attention Guidance AAAI 2026

Towards Better Correctness and Efficiency in Code Generation AAAI 2026

SPARE: Single-Pass Annotation with Reference-Guided Evaluation for Automatic Process Supervision and Reward Modelling AAAI 2026

Rectify Evaluation Preference: Improving LLMs’ Critique on Math Reasoning via Perplexity-aware Reinforcement Learning AAAI 2026

Decoupling Understanding from Reasoning via Problem Space Mapping for Small-Scale Model Reasoning AAAI 2026

ERank: Fusing Supervised Fine-Tuning and Reinforcement Learning for Effective and Efficient Text Reranking AAAI 2026