Reinforcement Learning

767 directly classified papers

Papers

Reward Modeling Requires Automatic Adjustment Based on Data Quality EMNLP 2024

Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning EMNLP 2024

Focus-Then-Decide: Segmentation-Assisted Reinforcement Learning AAAI 2024

ToolPlanner: A Tool Augmented LLM for Multi Granularity Instructions with Path Planning and Feedback EMNLP 2024

Human-Guided Moral Decision Making in Text-Based Games AAAI 2024

Safeguarded Progress in Reinforcement Learning: Safe Bayesian Exploration for Control Policy Synthesis AAAI 2024

Towards Pareto-Efficient RLHF: Paying Attention to a Few High-Reward Samples with Reward Dropout EMNLP 2024

Long-Term Safe Reinforcement Learning with Binary Feedback AAAI 2024

Enhancing Off-Policy Constrained Reinforcement Learning through Adaptive Ensemble C Estimation AAAI 2024

Active Reinforcement Learning for Robust Building Control AAAI 2024

Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models EMNLP 2024

Transformers Learn Transition Dynamics when Trained to Predict Markov Decision Processes EMNLP 2024

What Effects the Generalization in Visual Reinforcement Learning: Policy Consistency with Truncated Return Prediction AAAI 2024

Reward-Respecting Subtasks for Model-Based Reinforcement Learning (Abstract Reprint) AAAI 2024

Direct Multi-Turn Preference Optimization for Language Agents EMNLP 2024

P2BPO: Permeable Penalty Barrier-Based Policy Optimization for Safe RL AAAI 2024

The Accuracy Paradox in RLHF: When Better Reward Models Don’t Yield Better Language Models EMNLP 2024

Learning Generalizable and Composable Abstractions for Transfer in Reinforcement Learning AAAI 2024

Learning to Control Camera Exposure via Reinforcement Learning CVPR 2024

Learning to Select Views for Efficient Multi-View Understanding CVPR 2024

AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning CVPR 2024

Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement Learning AAAI 2024

Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation AAAI 2024

Learn How to See: Collaborative Embodied Learning for Object Detection and Camera Adjusting AAAI 2024

Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement CVPR 2024