Condenser: a Pre-training Architecture for Dense Retrieval

Luyu Gao; Jamie Callan

2021 EMNLP EMNLP 2021

Condenser: a Pre-training Architecture for Dense Retrieval

Abstract

AbstractPre-trained Transformer language models (LM) have become go-to text representation encoders. Prior research fine-tunes deep LMs to encode text sequences such as sentences and passages into single dense vector representations for efficient text comparison and retrieval. However, dense encoders require a lot of data and sophisticated techniques to effectively train and suffer in low data situations. This paper finds a key reason is that standard LMs’ internal attention structure is not ready-to-use for dense encoders, which needs to aggregate text information into the dense representation. We propose to pre-train towards dense encoder with a novel Transformer architecture, Condenser, where LM prediction CONditions on DENSE Representation. Our experiments show Condenser improves over standard LM by large margins on various text retrieval and similarity tasks.

🌉 Interdisciplinary Bridge — Computer Science and Deep Learning and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Luyu Gao , Jamie Callan

Topics

Deep Learning > Architectures > Transformers Deep Learning > Techniques > Pretraining Natural Language Processing > Applications > Information Retrieval Natural Language Processing > Resources & Methods > Large Language Models Computer Science > Applications > Information Retrieval Natural Language Processing > Resources & Methods > Language Modeling Deep Learning > Learning Types > Representation Learning Machine Learning > Application Areas > Information Retrieval

Keywords

transformer architecture transfer learning information retrieval text representation dense retrieval pre-trained language model pre-trained transformer encoder architecture

Download PDF

Related papers

Continual Learning in Multilingual NMT via Language-Specific Embeddings 2021

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents 2021

Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity 2021

Neural Machine Translation with Heterogeneous Topic Knowledge Embeddings 2021

Semantics-Preserved Data Augmentation for Aspect-Based Sentiment Analysis 2021