ACL 2025
FJWU_Squad at SemEval-2025 Task 1: An Idiom Visual Understanding Dataset for Idiom Learning
Abstract
Idiomatic expressions pose difficulties for Natural Language Processing (NLP) because they are noncompositional: the meaning of the whole phrase cannot be derived from the meanings of its individual words. In this paper, we propose the Idiom Visual Understanding Dataset (IVUD), a multimodal dataset for idiom understanding that combines visual and textual representations. For SemEval-2025 Task 1 (AdMIRe), we focused on dataset augmentation using AI-synthesized images and human-directed prompt engineering. We compared the efficacy of vision-based and text-based models in ranking images aligned with idiomatic phrases. The results show the advantages of multimodal context for idiom understanding: vision-language models perform better than text-only approaches at detecting idiomaticity.
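To make the image-ranking task concrete, the following is a minimal sketch of ranking candidate images by their similarity to a phrase. It is not the authors' system: the embedding vectors are made-up placeholders standing in for the output of some vision-language encoder, and the names `cosine`, `rank_images`, and the file names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_images(phrase_vec, image_vecs):
    """Return image names sorted from most to least similar to the phrase."""
    scores = {name: cosine(phrase_vec, vec) for name, vec in image_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Placeholder embeddings (assumption): in practice these would come from
# a shared text-image embedding space produced by a vision-language model.
phrase_vec = [0.9, 0.1, 0.3]
image_vecs = {
    "figurative.png": [0.8, 0.2, 0.4],
    "literal.png": [0.1, 0.9, 0.2],
    "unrelated.png": [0.0, 0.1, 0.9],
}

ranking = rank_images(phrase_vec, image_vecs)
print(ranking)
```

With a shared embedding space, the image whose vector points in the direction closest to the phrase vector is ranked first; a text-only approach would instead need to compare the phrase against captions or descriptions of each image.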
Topics
Artificial Intelligence > Core AI > Multimodal Learning
Machine Learning > Application Areas > Data Augmentation
Natural Language Processing > Understanding > Semantic Analysis
Natural Language Processing > Resources & Methods > Large Language Models
Natural Language Processing > Resources & Methods > Multimodal NLP
Deep Learning > Learning Types > Multi-Modal Learning
Deep Learning > Models > Vision-Language Models
Natural Language Processing > Applications > Multimodal NLP