DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments

Yuxiang Zheng; Dayuan Fu; Xiangkun Hu; Xiaojie Cai; Lyumanshan Ye; Pengrui Lu; Pengfei Liu

2025 EMNLP EMNLP 2025

DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments

Abstract

AbstractLarge Language Models (LLMs) with web search capabilities show significant potential for deep research, yet current methods—brittle prompt engineering or RAG-based reinforcement learning in controlled environments—fail to capture real-world complexities. In this paper, we introduce DeepResearcher, the first comprehensive framework for end-to-end training of LLM-based deep research agents through scaling reinforcement learning (RL) in real-world environments with authentic web search interactions. Unlike RAG approaches reliant on fixed corpora, DeepResearcher trains agents to navigate the noisy, dynamic open web. We implement a specialized multi-agent architecture where browsing agents extract relevant information from various webpage structures and overcoming significant technical challenges. Extensive experiments on open-domain research tasks demonstrate that DeepResearcher achieves substantial improvements of up to 28.9 points over prompt engineering-based baselines and up to 7.2 points over RAG-based RL agents. Our qualitative analysis reveals emergent cognitive behaviors from end-to-end RL training, such as planning, cross-validation, self-reflection for research redirection, and maintain honesty when unable to find definitive answers. Our results highlight that end-to-end training in real-world web environments is fundamental for developing robust research capabilities aligned with real-world applications. The source codefor DeepResearcher is released at: https://github.com/GAIR-NLP/DeepResearcher.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning

🧭 Keyword Pioneer — research agent

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Yuxiang Zheng , Dayuan Fu , Xiangkun Hu , Xiaojie Cai , Lyumanshan Ye , Pengrui Lu , Pengfei Liu

Topics

Artificial Intelligence > Core AI > Agent Systems Artificial Intelligence > Core AI > Planning Machine Learning > Learning Types > Reinforcement Learning Artificial Intelligence > Core AI > Information Retrieval

Keywords

reinforcement learning web search language model agent system large language model research automation research agent autonomous research

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025