NAACL 2018
Cross-Lingual Learning-to-Rank with Shared Representations
Abstract
Cross-lingual information retrieval (CLIR) is a document retrieval task where the documents are written in a language different from that of the user's query. This is a challenging problem for data-driven approaches due to the general lack of labeled training data. We introduce a large-scale dataset derived from Wikipedia to support CLIR research in 25 languages. Further, we present a simple yet effective neural learning-to-rank model that shares representations across languages and reduces the data requirement. This model can exploit training data in, for example, Japanese-English CLIR to improve the results of Swahili-English CLIR.