Dyve: Thinking Fast and Slow for Dynamic Process Verification

Jianyuan Zhong; Zeju Li; Zhijian Xu; Xiangyu Wen; Qiang Xu

2025 EMNLP EMNLP 2025

Dyve: Thinking Fast and Slow for Dynamic Process Verification

Abstract

AbstractLarge Language Models have advanced significantly in complex reasoning, often leveraging external reward model to improve the reliability of their multi-step processes. However, existing process verification methods struggle with reliably assessing incomplete reasoning traces and are limited by the cost of high-quality human annotations or the inherent noise in automatically generated labels. Therefore, we present Dyve, a dynamic process verifier that enhances reasoning error detection in large language models by integrating fast and slow thinking, inspired by Kahneman’s Systems Theory. Dyve adaptively applies immediate token-level confirmation (System 1) for straightforward steps and comprehensive analysis (System 2) for complex ones. Unlike traditional verifiers that only evaluate final outputs, Dyve employs a step-wise consensus-filtered supervision strategy, leveraging Monte Carlo estimation, LLM-as-a-Judge, and specialized reasoning models to extract high-quality training signals from noisy rollouts. Experimental results on ProcessBench and the MATH dataset confirm that Dyve significantly outperforms existing process-based verifiers and boosts performance in Best-of-N settings while maintaining computational efficiency by strategically allocating verification resources.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

🧭 Keyword Pioneer — step-wise consensus

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Jianyuan Zhong , Zeju Li , Zhijian Xu , Xiangyu Wen , Qiang Xu

Topics

Artificial Intelligence > Core AI > Planning Machine Learning > Optimization & Theory > Stochastic Processes Machine Learning > Learning Types > Reinforcement Learning Artificial Intelligence > Core AI > Large Language Models Artificial Intelligence > Core AI > Reasoning Deep Learning > Learning Types > Reinforcement Learning

Keywords

monte carlo estimation reasoning verification reasoning error large language model process verification step-wise verification reasoning error detection step-wise consensus step-wise supervision

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025