LimRank: Less is More for Reasoning-Intensive Information Reranking

Tingyu Song; Yilun Zhao; Siyue Zhang; Chen Zhao; Arman Cohan

EMNLP 2025

Abstract

Existing approaches typically rely on large-scale fine-tuning to adapt LLMs for information reranking tasks, which is computationally expensive. In this work, we demonstrate that modern LLMs can be effectively adapted using only minimal, high-quality supervision. To enable this, we design LIMRANK-SYNTHESIZER, a reusable and open-source pipeline for generating diverse, challenging, and realistic reranking examples. Using this synthetic data, we fine-tune our reranker model, LIMRANK. We evaluate LIMRANK on two challenging benchmarks: BRIGHT for reasoning-intensive retrieval and FollowIR for instruction-following retrieval. Our experiments demonstrate that LIMRANK achieves competitive performance while being trained on less than 5% of the data typically used in prior work. Further ablation studies demonstrate the effectiveness of LIMRANK-SYNTHESIZER and the strong generalization capabilities of LIMRANK across downstream tasks, including scientific literature search and retrieval-augmented generation for knowledge-intensive problem solving.

Topics

Artificial Intelligence > Core AI > Foundation Models
Artificial Intelligence > Core AI > Large Language Models
Machine Learning > Learning Types > Weakly Supervised Learning
Machine Learning > Learning Types > Retrieval-Augmented Generation
Machine Learning > Application Areas > Transfer Learning
Natural Language Processing > Applications > Information Retrieval
Computer Science > Applications > Information Retrieval
Deep Learning > Learning Types > Fine-Tuning

Keywords

instruction following, synthetic data generation, synthetic data, retrieval-augmented generation, retrieval augmentation, large language model, information reranking
