How Does BERT Rerank Passages? An Attribution Analysis with Information Bottlenecks

Zhiying Jiang; Raphael Tang; Ji Xin; Jimmy Lin

EMNLP 2021

Abstract

Fine-tuned pre-trained transformers achieve the state of the art in passage reranking. Unfortunately, how they make their predictions remains largely unexplained, especially at the end-to-end, input-to-output level. Little is known about how tokens, layers, and passages precisely contribute to the final prediction. In this paper, we address this gap by leveraging the recently developed information bottlenecks for attribution (IBA) framework. On BERT-based models for passage reranking, we quantitatively demonstrate the framework’s veracity in extracting attribution maps, from which we perform a detailed, token-wise analysis of how predictions are made. Overall, we find that BERT still cares about exact token matching for reranking; the [CLS] token mainly gathers information for predictions at the last layer; top-ranked passages are robust to token removal; and BERT fine-tuned on MSMARCO has a positional bias towards the start of the passage.

Topics

Artificial Intelligence > Core AI > Interpretability
Deep Learning > Architectures > Transformers
Natural Language Processing > Applications > Information Retrieval
Artificial Intelligence > Core AI > Large Language Models
Natural Language Processing > Resources & Methods > Language Modeling
Deep Learning > Optimization & Theory > Evaluation

Keywords

information bottleneck, BERT model, token analysis, positional bias, passage reranking, fine-tuned transformer, token matching, attribution analysis, token-wise analysis
