2024
SEMEVAL
SemEval 2024
SheffieldVeraAI at SemEval-2024 Task 4: Prompting and fine-tuning a Large Vision-Language Model for Binary Classification of Persuasion Techniques in Memes
Abstract
This paper describes our approach to SemEval-2024 Task 4: Multilingual Detection of Persuasion Techniques in Memes. Specifically, we concentrate on Subtask 2b, a binary classification task that entails categorizing memes as either “propagandistic” or “non-propagandistic”. To address this task, we used the large multimodal pretrained model LLaVa. We explored various prompting strategies and fine-tuning methods, and observed that the model performed best when it was not fine-tuned but was given few-shot examples. Additionally, we enhanced the model’s multilingual capabilities by integrating a machine translation model. Our system secured 2nd place in the Arabic language category.
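To make the few-shot setup concrete, here is a minimal sketch of how a few-shot prompt for the binary labels might be assembled and how a free-form answer might be mapped back to a label. It is an illustration under stated assumptions, not the authors' prompt or pipeline: the names `MemeExample`, `build_few_shot_prompt`, and `parse_label`, the instruction wording, and the text-only format are all hypothetical, and the image input that a vision-language model would also receive is omitted.

```python
from dataclasses import dataclass

# The two labels named in the abstract for Subtask 2b.
LABELS = ("propagandistic", "non-propagandistic")


@dataclass
class MemeExample:
    text: str   # text transcribed from the meme (hypothetical field)
    label: str  # one of LABELS


def build_few_shot_prompt(examples, query_text):
    """Assemble a plain-text few-shot prompt (hypothetical wording)."""
    lines = [
        "Classify each meme as 'propagandistic' or 'non-propagandistic'.",
        "",
    ]
    for ex in examples:
        if ex.label not in LABELS:
            raise ValueError(f"unknown label: {ex.label!r}")
        lines.append(f"Meme text: {ex.text}")
        lines.append(f"Answer: {ex.label}")
        lines.append("")
    # The query comes last, with the answer left open for the model.
    lines.append(f"Meme text: {query_text}")
    lines.append("Answer:")
    return "\n".join(lines)


def parse_label(model_output):
    """Map free-form model output to a label, or None if no label is found."""
    normalized = model_output.strip().lower()
    # Check the longer label first, since "propagandistic" is a
    # substring of "non-propagandistic".
    if "non-propagandistic" in normalized:
        return "non-propagandistic"
    if "propagandistic" in normalized:
        return "propagandistic"
    return None
```

In this sketch, the prompt string would be passed to the model together with the meme image, and `parse_label` would be applied to whatever text the model returns. For non-English memes, the abstract's machine translation step would presumably run on the meme text before prompt construction, though the exact placement is an assumption here.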
Authors
Topics
Artificial Intelligence > Core AI > Multimodal Learning
Computer Vision > Analysis > Object Detection
Natural Language Processing > Applications > Text Classification
Machine Learning > Learning Types > Few-Shot Learning
Artificial Intelligence > Core AI > Large Language Models
Deep Learning > Learning Types > Multi-Modal Learning