2019
EMNLP
Cross-Domain Modeling of Sentence-Level Evidence for Document Retrieval
Abstract
This paper applies BERT to ad hoc document retrieval on news articles, which requires addressing two challenges: relevance judgments in existing test collections are typically provided only at the document level, and documents often exceed the length that BERT was designed to handle. Our solution is to aggregate sentence-level evidence to rank documents. Furthermore, we are able to leverage passage-level relevance judgments fortuitously available in other domains to fine-tune BERT models that are able to capture cross-domain notions of relevance, and can be directly used for ranking news articles. Our simple neural ranking models achieve state-of-the-art effectiveness on three standard test collections.
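As a rough illustration of what "aggregating sentence-level evidence" could look like, the sketch below combines a document-level retrieval score with the highest-scoring sentences of each document. The abstract does not give the exact formula, so the interpolation weight `alpha`, the per-rank `weights`, and the function names `aggregate_score` and `rank_documents` are all assumptions made for this example. The sentence scores would come from a fine-tuned BERT relevance model in practice; here they are plain floats.

```python
from typing import Dict, List, Sequence, Tuple


def aggregate_score(
    doc_score: float,
    sentence_scores: Sequence[float],
    alpha: float = 0.5,                           # hypothetical interpolation weight
    weights: Sequence[float] = (1.0, 0.5, 0.25),  # hypothetical weights for the top-k sentences
) -> float:
    """Interpolate a document-level score with weighted top-k sentence scores."""
    # Keep only as many top sentences as there are weights.
    top = sorted(sentence_scores, reverse=True)[: len(weights)]
    evidence = sum(w * s for w, s in zip(weights, top))
    return alpha * doc_score + (1.0 - alpha) * evidence


def rank_documents(
    candidates: Dict[str, Tuple[float, Sequence[float]]],
    alpha: float = 0.5,
    weights: Sequence[float] = (1.0, 0.5, 0.25),
) -> List[str]:
    """Return document ids sorted by aggregated score, best first."""
    scored = {
        doc_id: aggregate_score(doc_score, sent_scores, alpha, weights)
        for doc_id, (doc_score, sent_scores) in candidates.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


if __name__ == "__main__":
    # Toy candidates: (document-level score, per-sentence relevance scores).
    candidates = {
        "doc_a": (0.3, [0.1]),
        "doc_b": (0.2, [0.9]),
    }
    print(rank_documents(candidates))
```

In this toy input, `doc_b` has the lower document-level score but one strongly relevant sentence, so it ranks first once sentence evidence is interpolated in. Scoring sentences individually also sidesteps the input-length limit mentioned in the abstract, since each sentence is short enough to be scored on its own.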
Topics
Machine Learning > Core Methods > Representation Learning
Machine Learning > Core Methods > Embedding Learning
Machine Learning > Application Areas > Domain Adaptation
Natural Language Processing > Applications > Information Retrieval
Deep Learning > Models > Transformers
Deep Learning > Learning Types > Transfer Learning