DCU ADAPT at WMT24: English to Low-resource Multi-Modal Translation Task

Sami Haq; Rudali Huidrom; Sheila Castilho

2024 EMNLP EMNLP 2024

DCU ADAPT at WMT24: English to Low-resource Multi-Modal Translation Task

Abstract

AbstractThis paper presents the system description of “DCU_NMT’s” submission to the WMT-WAT24 English-to-Low-Resource Multimodal Translation Task. We participated in the English-to-Hindi track, developing both text-only and multimodal neural machine translation (NMT) systems. The text-only systems were trained from scratch on constrained data and augmented with back-translated data. For the multimodal approach, we implemented a context-aware transformer model that integrates visual features as additional contextual information. Specifically, image descriptions generated by an image captioning model were encoded using BERT and concatenated with the textual input.The results indicate that our multimodal system, trained solely on limited data, showed improvements over the text-only baseline in both the challenge and evaluation sets, suggesting the potential benefits of incorporating visual information.

🌉 Interdisciplinary Bridge — Computer Vision and Deep Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Sami Haq , Rudali Huidrom , Sheila Castilho

Topics

Computer Vision > Generation > Image Captioning Natural Language Processing > Applications > Machine Translation Natural Language Processing > Generation > Machine Translation Deep Learning > Models > Transformers Deep Learning > Learning Types > Multi-Modal Learning

Keywords

neural machine translation multimodal learning image captioning low-resource language visual feature multimodal translation context-aware transformer

Download PDF

Related papers

EmbodiedBERT: Cognitively Informed Metaphor Detection Incorporating Sensorimotor Information 2024

Mitigating Matthew Effect: Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning for Conversational Recommendation 2024

Learning to Extract Structured Entities Using Language Models 2024

Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis 2024

CSSL: Contrastive Self-Supervised Learning for Dependency Parsing on Relatively Free Word Ordered and Morphologically Rich Low Resource Languages 2024