Research Explorer
Reinforcement Learning
1263 directly classified papers
Papers per year
2006: 1
2007: 2
2008: 3
2009: 2
2010: 1
2011: 2
2012: 3
2013: 2
2014: 3
2015: 2
2016: 8
2017: 44
2018: 95
2019: 134
2020: 123
2021: 131
2022: 143
2023: 127
2024: 194
2025: 240
2026: 3
Papers
Long-Term Safe Reinforcement Learning with Binary Feedback
AAAI 2024
Deep Reinforcement Learning for Communication Networks
AAAI 2024
ESRL: Efficient Sampling-Based Reinforcement Learning for Sequence Generation
AAAI 2024
Robustness Verification of Deep Reinforcement Learning Based Control Systems Using Reward Martingales
AAAI 2024
Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning
AAAI 2024
Improving Discriminative Capability of Reward Models in RLHF Using Contrastive Learning
EMNLP 2024
Dialogue for Prompting: A Policy-Gradient-Based Discrete Prompt Generation for Few-Shot Learning
AAAI 2024
Learning Encodings for Constructive Neural Combinatorial Optimization Needs to Regret
AAAI 2024
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
NIPS 2024
Imitating Language via Scalable Inverse Reinforcement Learning
NIPS 2024
EgoGen: An Egocentric Synthetic Data Generator
CVPR 2024
ReCoRe: Regularized Contrastive Representation Learning of World Model
CVPR 2024
Learn How to See: Collaborative Embodied Learning for Object Detection and Camera Adjusting
AAAI 2024
RL-SeqISP: Reinforcement Learning-Based Sequential Optimization for Image Signal Processing
AAAI 2024
What Effects the Generalization in Visual Reinforcement Learning: Policy Consistency with Truncated Return Prediction
AAAI 2024
Learning to Control Camera Exposure via Reinforcement Learning
CVPR 2024
World Models for General Surgical Grasping
RSS 2024
Adversarial Attacks on Federated-Learned Adaptive Bitrate Algorithms
AAAI 2024
BPO: Staying Close to the Behavior LLM Creates Better Online LLM Alignment
EMNLP 2024
Mitigating Open-Vocabulary Caption Hallucinations
EMNLP 2024
Rich Human Feedback for Text-to-Image Generation
CVPR 2024
Context-Aware Iteration Policy Network for Efficient Optical Flow Estimation
AAAI 2024
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search
NIPS 2024
Dynamic Multi-Reward Weighting for Multi-Style Controllable Generation
EMNLP 2024
Stress-Testing Capability Elicitation With Password-Locked Models
NIPS 2024