Research Explorer
Reinforcement Learning
1263 directly classified papers
Papers per year
2006: 1
2007: 2
2008: 3
2009: 2
2010: 1
2011: 2
2012: 3
2013: 2
2014: 3
2015: 2
2016: 8
2017: 44
2018: 95
2019: 134
2020: 123
2021: 131
2022: 143
2023: 127
2024: 194
2025: 240
2026: 3
Papers
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences
ACL 2024
Training Language Models to Generate Text with Citations via Fine-grained Rewards
ACL 2024
M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions
ACL 2024
Inverse-Q*: Token Level Reinforcement Learning for Aligning Large Language Models Without Preference Data
EMNLP 2024
Applying RLAIF for Code Generation with API-usage in Lightweight LLMs
ACL 2024
Fast and Knowledge-Free Deep Learning for General Game Playing (Student Abstract)
AAAI 2024
Deep Reinforcement Learning for Communication Networks
AAAI 2024
Optimizing Language Models with Fair and Stable Reward Composition in Reinforcement Learning
EMNLP 2024
Harnessing Network Effect for Fake News Mitigation: Selecting Debunkers via Self-Imitation Learning
AAAI 2024
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
ACL 2024
Pure-Past Action Masking
AAAI 2024
Long-Term Safe Reinforcement Learning with Binary Feedback
AAAI 2024
MACAROON: Training Vision-Language Models To Be Your Engaged Partners
EMNLP 2024
Enhancing Alignment using Curriculum Learning & Ranked Preferences
EMNLP 2024
I Open at the Close: A Deep Reinforcement Learning Evaluation of Open Streets Initiatives
AAAI 2024
Reinforcement Learning-Driven LLM Agent for Automated Attacks on LLMs
ACL 2024
Frugal LMs Trained to Invoke Symbolic Solvers Achieve Parameter-Efficient Arithmetic Reasoning
AAAI 2024
Carbon Footprint Reduction for Sustainable Data Centers in Real-Time
AAAI 2024
Dialogue for Prompting: A Policy-Gradient-Based Discrete Prompt Generation for Few-Shot Learning
AAAI 2024
Reward Certification for Policy Smoothed Reinforcement Learning
AAAI 2024
Two-Stage Evolutionary Reinforcement Learning for Enhancing Exploration and Exploitation
AAAI 2024
Rich Human Feedback for Text-to-Image Generation
CVPR 2024
Robust Communicative Multi-Agent Reinforcement Learning with Active Defense
AAAI 2024
Learn to Follow: Decentralized Lifelong Multi-Agent Pathfinding via Planning and Learning
AAAI 2024
What Effects the Generalization in Visual Reinforcement Learning: Policy Consistency with Truncated Return Prediction
AAAI 2024