Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval

Zeynep Akkalyoncu Yilmaz; Wei Yang; Haotian Zhang; Jimmy Lin

EMNLP 2019

Abstract

This paper applies BERT to ad hoc document retrieval on news articles, which requires addressing two challenges: relevance judgments in existing test collections are typically provided only at the document level, and documents often exceed the length that BERT was designed to handle. Our solution is to aggregate sentence-level evidence to rank documents. Furthermore, we are able to leverage passage-level relevance judgments fortuitously available in other domains to fine-tune BERT models that are able to capture cross-domain notions of relevance, and can be directly used for ranking news articles. Our simple neural ranking models achieve state-of-the-art effectiveness on three standard test collections.
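The abstract states that sentence-level evidence is aggregated to rank documents but does not give the aggregation rule. The following is a minimal sketch of one plausible scheme, assuming (not confirmed by the text above) that a first-stage document score is linearly interpolated with a weighted sum of the highest-scoring sentences. The names `aggregate_document_score` and `rank_documents`, the interpolation parameter `alpha`, and the `weights` tuple are all hypothetical and chosen for illustration; in practice the sentence scores would come from a fine-tuned BERT relevance classifier.

```python
from typing import Dict, List, Sequence, Tuple


def aggregate_document_score(
    doc_score: float,
    sentence_scores: Sequence[float],
    alpha: float = 0.5,
    weights: Sequence[float] = (1.0, 0.5, 0.25),
) -> float:
    """Combine a document-level score with the top-scoring sentences.

    Assumption: score = alpha * doc_score
                        + (1 - alpha) * sum_i weights[i] * (i-th best sentence score).
    """
    # Keep only as many top sentences as there are weights.
    top = sorted(sentence_scores, reverse=True)[: len(weights)]
    # zip stops at the shorter sequence, so short documents are handled.
    sentence_evidence = sum(w * s for w, s in zip(weights, top))
    return alpha * doc_score + (1.0 - alpha) * sentence_evidence


def rank_documents(
    candidates: Dict[str, Tuple[float, Sequence[float]]],
    alpha: float = 0.5,
    weights: Sequence[float] = (1.0, 0.5, 0.25),
) -> List[str]:
    """Return document ids sorted by aggregated score, best first."""
    scored = {
        doc_id: aggregate_document_score(doc_score, sent_scores, alpha, weights)
        for doc_id, (doc_score, sent_scores) in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical scores: (document score, per-sentence relevance scores).
    candidates = {
        "doc_a": (1.0, [0.1]),
        "doc_b": (0.5, [0.9, 0.8]),
    }
    print(rank_documents(candidates, alpha=0.5, weights=(1.0, 0.5)))
```

Scoring sentences independently sidesteps both challenges named in the abstract: each input fits within the model's length limit, and a model fine-tuned on passage-level judgments from another domain can be applied without document-level training labels.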

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Core Methods > Embedding Learning
Machine Learning > Application Areas > Domain Adaptation
Natural Language Processing > Applications > Information Retrieval
Deep Learning > Models > Transformers
Deep Learning > Learning Types > Transfer Learning

Keywords

document retrieval; neural ranking; neural ranking model; sentence-level evidence; passage-level relevance
