UMIC: An Unreferenced Metric for Image Captioning via Contrastive Learning

Hwanhee Lee; Seunghyun Yoon; Franck Dernoncourt; Trung Bui; Kyomin Jung

ACL 2021

Abstract

Despite the success of text generation metrics such as BERTScore, image captions remain difficult to evaluate without enough reference captions, because valid descriptions of the same image are diverse. In this paper, we introduce UMIC, an Unreferenced Metric for Image Captioning, which does not require reference captions to evaluate image captions. Building on Vision-and-Language BERT, we train UMIC to discriminate negative captions via contrastive learning. We also observe critical problems in the previous benchmark dataset (i.e., human annotations) for image captioning metrics, and introduce a new collection of human annotations on generated captions. We validate UMIC on four datasets, including our new dataset, and show that UMIC has a higher correlation with human judgments than all previous metrics that require multiple references.
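The training idea in the abstract, scoring a matching caption above a deliberately corrupted (negative) caption for the same image, can be illustrated with a pairwise ranking loss. The following is a minimal sketch and not the authors' implementation: the hinge form of the loss, the margin value, and the name `pairwise_ranking_loss` are assumptions made for illustration, and the scores are plain floats standing in for the outputs of a vision-and-language model.

```python
from typing import Sequence


def pairwise_ranking_loss(
    pos_scores: Sequence[float],
    neg_scores: Sequence[float],
    margin: float = 0.2,  # hypothetical margin, not taken from the paper
) -> float:
    """Mean hinge loss over (positive, negative) caption score pairs.

    pos_scores[i] is the score of a caption that matches image i;
    neg_scores[i] is the score of a negative caption for the same image.
    A pair contributes zero loss once the positive caption outscores the
    negative one by at least `margin`.
    """
    if len(pos_scores) != len(neg_scores):
        raise ValueError("pos_scores and neg_scores must have equal length")
    if not pos_scores:
        raise ValueError("at least one score pair is required")
    losses = [max(0.0, margin - (p - n)) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy scores: the first pair is well separated, the second is not.
    print(pairwise_ranking_loss([0.9, 0.5], [0.1, 0.5]))
```

Because the loss depends only on the image and the candidate captions, a scorer trained this way needs no reference captions at evaluation time, which is the property the abstract emphasizes.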

Topics

Machine Learning > Learning Types > Contrastive Learning
Computer Vision > Generation > Image Captioning
Deep Learning > Learning Types > Contrastive Learning

Keywords

contrastive learning, image captioning, evaluation metric, unreferenced metric, vision-and-language BERT
