ACL 2023
Medical Visual Textual Entailment for Numerical Understanding of Vision-and-Language Models
Abstract
Assessing how well vision-and-language models understand numbers in images and text is crucial for real-world vision-and-language applications, such as systems for automated medical image analysis. We provide a visual reasoning dataset focusing on numerical understanding in the medical domain. Experiments using our dataset show that current vision-and-language models fail to perform numerical inference in the medical domain. However, data augmentation with only a small amount of our dataset improves model performance while maintaining performance in the general domain.
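The augmentation idea in the abstract, mixing a small number of medical-domain examples into a general-domain training set, might be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the `EntailmentExample` fields, the label strings, and the `augment_with_medical` helper are hypothetical names introduced here, and the abstract does not state how many examples were used or how they were sampled.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class EntailmentExample:
    # Hypothetical record layout; the abstract does not specify a schema.
    image_id: str
    hypothesis: str
    label: str   # assumed label strings, e.g. "entailment" or "contradiction"
    domain: str  # "general" or "medical"


def augment_with_medical(general, medical, n_medical, seed=0):
    """Return a shuffled training list: all general examples plus
    n_medical examples sampled without replacement from the medical set."""
    if not 0 <= n_medical <= len(medical):
        raise ValueError("n_medical must be between 0 and len(medical)")
    rng = random.Random(seed)  # fixed seed so the mix is reproducible
    sampled = rng.sample(medical, n_medical)
    mixed = list(general) + sampled
    rng.shuffle(mixed)
    return mixed
```

Keeping `n_medical` small relative to the general-domain set mirrors the abstract's claim that only a small amount of medical data is needed; the right amount would have to be determined empirically.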
Topics
Artificial Intelligence > Core AI > Multimodal Learning
Artificial Intelligence > Learning Paradigms > Transfer Learning
Computer Vision > Domain-Specific > Medical Imaging
Healthcare & Medicine > Clinical > Medical Imaging
Deep Learning > Models > Large Language Models
Computer Vision > Core AI > Multimodal Learning
Deep Learning > Models > Vision-Language Models