A Robust Dual-debiasing VQA Model based on Counterfactual Causal Effect

Lingyun Song; Chengkun Yang; Xuanyu Li; Xuequn Shang

2024 EMNLP EMNLP 2024

A Robust Dual-debiasing VQA Model based on Counterfactual Causal Effect

Abstract

AbstractTraditional VQA models are inherently vulnerable to language bias, resulting in a significant performance drop when encountering out-of-distribution datasets. The conventional VQA models suffer from language bias that indicates a spurious correlation between textual questions and answers. Given the outstanding effectiveness of counterfactual causal inference in eliminating bias, we propose a model agnostic dual-debiasing framework based on Counterfactual Causal Effect (DCCE), which explicitly models two types of language bias(i.e., shortcut and distribution bias) by separate branches under the counterfactual inference framework. The effects of both types ofbias on answer prediction can be effectively mitigated by subtracting direct effect of textual questions on answers from total effect ofvisual questions on answers. Experimental results demonstrate that our proposed DCCE framework significantly reduces language biasand achieves state-of-the-art performance on the benchmark datasets without requiring additional augmented data. Our code is available inhttps://github.com/sxycyck/dcce.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Computer Vision and Deep Learning and Machine Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Lingyun Song , Chengkun Yang , Xuanyu Li , Xuequn Shang

Topics

Artificial Intelligence > Core AI > Causal Inference Machine Learning > Application Areas > Domain Generalization Deep Learning > Learning Types > Adversarial Learning Computer Vision > Applications > Visual Question Answering

Keywords

causal inference visual question answering out-of-distribution generalization out-of-distribution detection counterfactual reasoning language bia counterfactual causal effect

Download PDF

Related papers

EmbodiedBERT: Cognitively Informed Metaphor Detection Incorporating Sensorimotor Information 2024

Mitigating Matthew Effect: Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning for Conversational Recommendation 2024

Learning to Extract Structured Entities Using Language Models 2024

Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis 2024

CSSL: Contrastive Self-Supervised Learning for Dependency Parsing on Relatively Free Word Ordered and Morphologically Rich Low Resource Languages 2024