2025 NAACL NAACL 2025

Query-focused Referentiability Learning for Zero-shot Retrieval

Abstract

AbstractDense passage retrieval enhances Information Retrieval (IR) by encoding queries and passages into representation space. However, passage representations often fail to be referenced by their gold queries under domain shifts, revealing a weakness in representation space. One desirable concept for representations is ”argmaxable”. Being argmaxable ensures that no representations are theoretically excluded from selection due to geometric constraints. To be argmaxable, a notable approach is to increase isotropy, where representations are evenly spread out in all directions. These findings, while desirable also for IR, focus on passage representation and not on query, making it challenging to directly apply their findings to IR. In contrast, we introduce a novel query-focused concept of ”referentiable” tailored for IR tasks, which ensures that passage representations are referenced by their gold queries. Building on this, we propose Learning Referentiable Representation (LRR), and two strategic metrics, Self-P and Self-Q, quantifying how the representations are referentiable. Our experiments compare three dense model versions: Naive, Isotropic, and Referentiable, demonstrating that LRR leads to enhanced zero-shot performance, surpassing existing naive and isotropic versions.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio