Beyond Sampling: Self-Sorting for Long-Context Ranking

Juseon Do; Sungwoo Han; Jingun Kwon; Hidetaka Kamigaito; Katsuhiko Hayashi; Taro Watanabe

2026 EACL EACL 2026

Beyond Sampling: Self-Sorting for Long-Context Ranking

Abstract

AbstractRanking is a fundamental component in a wide range of AI applications. However, large language models (LLMs) remain unstable on long-context ranking. Sliding-window processing is costly and listwise prompting over full candidates still yields inconsistent orders. We show that sampling alone, even with selection-based methods, cannot stabilize ranking because LLM consistency decomposes into within-list order and cross-list preference, in which a single stochastic process cannot align. To address this, we introduce Self-Sorting (SS), which generates m candidate lists and performs n selection-time re-rankings over those lists. SS fuses explicit within-list positions with implicit cross-list preferences to score entities and return a top-k set. Experimental results on five widely used ranking benchmarks show significant improvements in nDCG@1,5,10, highlighting the critical role of implicit consistency.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — long context ranking

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Juseon Do , Sungwoo Han , Jingun Kwon , Hidetaka Kamigaito , Katsuhiko Hayashi , Taro Watanabe

Topics

Machine Learning > Core Methods > Classification Natural Language Processing > Applications > Information Retrieval

Keywords

information retrieval large language model long context ranking

Download PDF

Related papers

Investigating Gender Stereotypes in Large Language Models via Social Determinants of Health 2026

A Benchmark for Audio Reasoning Capabilities of Multimodal Large Language Models 2026

InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection 2026

Generative Personality Simulation via Theory-Informed Structured Interview 2026

Word Surprisal Correlates with Sentential Contradiction in LLMs 2026