Cross-Lingual Training of Neural Models for Document Ranking

Peng Shi; He Bai; Jimmy Lin

EMNLP 2020

Abstract

We tackle the challenge of cross-lingual training of neural document ranking models for mono-lingual retrieval, specifically leveraging relevance judgments in English to improve search in non-English languages. Our work successfully applies multi-lingual BERT (mBERT) to document ranking and additionally compares against a number of alternatives: translating the training data, translating documents, multi-stage hybrids, and ensembles. Experiments on test collections in six different languages from diverse language families reveal many interesting findings: model-based relevance transfer using mBERT can significantly improve search quality in (non-English) mono-lingual retrieval, but other “low resource” approaches are competitive as well.

Topics

Machine Learning > Application Areas > Domain Adaptation
Computer Science > Applications > Information Retrieval
Machine Learning > Learning Types > Transfer Learning
Deep Learning > Learning Types > Multi-Lingual Learning

Keywords

cross-lingual; information retrieval; document ranking; neural model; multilingual BERT; relevance judgment; neural ranking model; relevance transfer
