ACL 2020
Multilingual Universal Sentence Encoder for Semantic Retrieval
Abstract
We present easy-to-use, retrieval-focused multilingual sentence embedding models, made available on TensorFlow Hub. The models embed text from 16 languages into a shared semantic space using a multi-task trained dual-encoder that learns tied cross-lingual representations via translation bridge tasks (Chidambaram et al., 2018). The models achieve a new state of the art on monolingual and cross-lingual semantic retrieval (SR). Competitive performance is obtained on the related tasks of translation pair bitext retrieval (BR) and retrieval question answering (ReQA). On transfer learning tasks, our multilingual embeddings approach, and in some cases exceed, the performance of English-only sentence embeddings.
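To illustrate what retrieval in a shared semantic space could look like, the following is a minimal sketch, not the released models' API. It assumes sentences have already been embedded into fixed-size vectors and ranks candidates by cosine similarity; the scoring choice, the function names `normalize` and `retrieve`, and the three-dimensional toy vectors are all hypothetical stand-ins rather than outputs of the actual encoder.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def retrieve(query_vec, candidate_vecs, top_k=1):
    """Return the top_k (index, score) pairs, ranked by cosine similarity."""
    q = normalize(query_vec)
    scored = []
    for idx, cand in enumerate(candidate_vecs):
        c = normalize(cand)
        score = sum(a * b for a, b in zip(q, c))
        scored.append((idx, score))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real sentence embeddings would come from the encoder.
candidates = [
    [0.9, 0.1, 0.0],  # stand-in for a sentence in one language
    [0.1, 0.9, 0.1],  # stand-in for an unrelated sentence
    [0.8, 0.2, 0.1],  # stand-in for a translation of the first sentence
]
query = [0.85, 0.15, 0.05]

results = retrieve(query, candidates, top_k=2)
for idx, score in results:
    print(idx, round(score, 3))
```

Because all languages are assumed to share one embedding space, the same ranking routine would serve both monolingual and cross-lingual retrieval; only the source of the vectors changes.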
Topics
Machine Learning > Core Methods > Embedding Learning
Natural Language Processing > Applications > Information Retrieval
Natural Language Processing > Resources & Methods > Multilingual NLP
Artificial Intelligence > Core AI > Information Retrieval
Artificial Intelligence > Core AI > Multi-Modal Learning