Embracing Ambiguity: Shifting the Training Target of NLI Models

Johannes Mario Meissner; Napat Thumwanit; Saku Sugawara; Akiko Aizawa

ACL 2021

Abstract

Natural Language Inference (NLI) datasets contain examples with highly ambiguous labels. While many research works do not pay much attention to this fact, several recent efforts have been made to acknowledge and embrace the existence of ambiguity, such as UNLI and ChaosNLI. In this paper, we explore the option of training directly on the estimated label distribution of the annotators in the NLI task, using a learning loss based on this ambiguity distribution instead of the gold-labels. We prepare AmbiNLI, a trial dataset obtained from readily available sources, and show it is possible to reduce ChaosNLI divergence scores when finetuning on this data, a promising first step towards learning how to capture linguistic ambiguity. Additionally, we show that training on the same amount of data but targeting the ambiguity distribution instead of gold-labels can result in models that achieve higher performance and learn better representations for downstream tasks.
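As a rough illustration of the idea in the abstract, the sketch below contrasts a loss computed against an annotator label distribution with the same loss computed against a one-hot gold label, and adds a Jensen-Shannon divergence of the kind commonly used to compare predicted and human label distributions. The abstract does not specify the exact loss or metric, so the soft-label cross-entropy, the label order, the annotator votes, and the logits here are all assumptions made for illustration only.

```python
import math

# Assumed label order; the abstract does not fix one.
LABELS = ("entailment", "neutral", "contradiction")


def label_distribution(votes):
    """Turn raw annotator votes into a probability distribution over LABELS."""
    total = len(votes)
    return [votes.count(label) / total for label in LABELS]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def soft_cross_entropy(logits, target):
    """Cross-entropy between a target distribution and softmax(logits).

    With a one-hot target this reduces to the usual gold-label loss.
    """
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def kl_divergence(p, q):
    """KL(p || q); terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log), symmetric in p and q."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical example: ten annotators disagree on one premise-hypothesis pair.
votes = ["entailment"] * 6 + ["neutral"] * 3 + ["contradiction"]
soft_target = label_distribution(votes)   # ambiguity distribution
gold_target = [1.0, 0.0, 0.0]             # majority (gold) label as one-hot
logits = [2.0, 1.0, 0.0]                  # made-up model output

loss_soft = soft_cross_entropy(logits, soft_target)
loss_gold = soft_cross_entropy(logits, gold_target)
divergence = js_divergence(soft_target, softmax(logits))
```

Under these assumptions the only change from standard gold-label training is the target vector passed to the loss; the model and the rest of the training loop would stay the same.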

Topics

Machine Learning > Learning Types > Weakly Supervised Learning
Natural Language Processing > Applications > Natural Language Inference
Machine Learning > Learning Types > Classification
Natural Language Processing > Understanding > Natural Language Inference
Artificial Intelligence > Core AI > Natural Language Inference

Keywords

representation learning; natural language inference; label distribution; label ambiguity; annotator disagreement; ambiguous annotation; learning loss; gold-label classification
