EMNLP 2021
Cross-Lingual Training of Dense Retrievers for Document Retrieval
Abstract
Dense retrieval has shown great success for passage ranking in English. However, its effectiveness for non-English languages remains unexplored due to limited training resources. In this work, we explore different transfer techniques for document ranking from English annotations to non-English languages. Our experiments reveal that zero-shot model-based transfer using mBERT improves search quality. We find that weakly-supervised target language transfer is competitive with generation-based target language transfer, which requires translation models.
Topics
Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Application Areas > Domain Adaptation
Natural Language Processing > Applications > Information Retrieval
Natural Language Processing > Resources & Methods > Multilingual NLP
Computer Science > Applications > Information Retrieval
Machine Learning > Learning Types > Transfer Learning
Artificial Intelligence > Core AI > Information Retrieval
Deep Learning > Learning Types > Retrieval-Augmented Generation