TILFA: A Unified Framework for Text, Image, and Layout Fusion in Argument Mining

Qing Zong; Zhaowei Wang; Baixuan Xu; Tianshi Zheng; Haochen Shi; Weiqi Wang; Yangqiu Song; Ginny Wong; Simon See

EMNLP 2023

Abstract

A main goal of Argument Mining (AM) is to analyze an author’s stance. Unlike previous AM datasets, which focus only on text, the shared task at the 10th Workshop on Argument Mining introduces a dataset that includes both texts and images. Importantly, these images contain both visual elements and optical characters. Our new framework, TILFA (A Unified Framework for Text, Image, and Layout Fusion in Argument Mining), is designed to handle this mixed data. Beyond understanding text, it detects optical characters and recognizes layout details in images. Our model significantly outperforms existing baselines, earning our team, KnowComp, first place on the leaderboard of the Argumentative Stance Classification subtask of this shared task.

Topics

Artificial Intelligence > Core AI > Multimodal Learning
Artificial Intelligence > Learning Paradigms > Transfer Learning
Deep Learning > Techniques > Attention
Natural Language Processing > Applications > Argument Mining

Keywords

argument mining, multimodal learning, optical character recognition, layout understanding, layout analysis, image-text fusion, text image fusion

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023