A Comprehensive Taxonomy of Negation for NLP and Neural Retrievers

Roxana Petcu; Samarth Bhargav; Maarten de Rijke; Evangelos Kanoulas

2025 EMNLP EMNLP 2025

A Comprehensive Taxonomy of Negation for NLP and Neural Retrievers

Abstract

AbstractUnderstanding and solving complex reasoning tasks is vital for addressing the information needs of a user. Although dense neural models learn contextualised embeddings, they underperform on queries containing negation. To understand this phenomenon, we study negation in traditional neural information retrieval and LLM-based models. We (1) introduce a taxonomy of negation that derives from philosophical, linguistic, and logical definitions; (2) generate two benchmark datasets that can be used to evaluate the performance of neural information retrieval models and to fine-tune models for a more robust performance on negation; and (3) propose a logic-based classification mechanism that can be used to analyze the performance of retrieval models on existing datasets. Our taxonomy produces a balanced data distribution over negation types, providing a better training setup that leads to faster convergence on the NevIR dataset. Moreover, we propose a classification schema that reveals the coverage of negation types in existing datasets, offering insights into the factors that might affect the generalization of fine-tuned models on negation. Our code is publicly available on GitHub, and the datasets are available on HuggingFace.

🌉 Interdisciplinary Bridge — Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — logical classification

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Roxana Petcu , Samarth Bhargav , Maarten de Rijke , Evangelos Kanoulas

Topics

Natural Language Processing > Understanding > Semantic Analysis Natural Language Processing > Applications > Information Retrieval Machine Learning > Learning Types > Classification Natural Language Processing > Understanding > Natural Language Inference

Keywords

information retrieval semantic analysis benchmark dataset neural information retrieval negation handling neural retriever logical classification

Download PDF

Related papers

Bit-Flip Error Resilience in LLMs: A Comprehensive Analysis and Defense Framework 2025

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing 2025

Model-based Large Language Model Customization as Service 2025

ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration 2025

SlideCoder: Layout-aware RAG-enhanced Hierarchical Slide Generation from Design 2025