Testing the Generalization Power of Neural Network Models across NLI Benchmarks

Aarne Talman; Stergios Chatzikyriakidis

ACL 2019

Abstract

Neural network models have been very successful in natural language inference (NLI), with the best models reaching 90% accuracy on some benchmarks. However, the success of these models turns out to be largely benchmark-specific. We show that models trained on a natural language inference dataset drawn from one benchmark fail to perform well on others, even if the notion of inference assumed in these benchmarks is the same or similar. We train six high-performing neural network models on different datasets and show that each of them generalizes poorly when we replace the original test set with a test set taken from another corpus designed for the same task. In light of these results, we argue that most current neural network models are not able to generalize well in the task of natural language inference. We find that using large pre-trained language models helps with transfer learning when the datasets are similar enough. Our results also highlight that current NLI datasets do not cover the different nuances of inference extensively enough.
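
The evaluation protocol described in the abstract (train on one corpus, then swap in the test set of another corpus built for the same task) can be sketched as a small accuracy matrix. This is only an illustrative sketch, not the authors' code: the corpus names `corpus_a` and `corpus_b`, the toy examples, and the majority-label "model" are all hypothetical stand-ins for real NLI benchmarks and the six neural models used in the paper.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str, str]          # (premise, hypothesis, label)
Model = Callable[[str, str], str]       # maps (premise, hypothesis) to a label


def train_majority_baseline(train: List[Example]) -> Model:
    """Hypothetical stand-in model: always predicts the most frequent training label."""
    majority = Counter(label for _, _, label in train).most_common(1)[0][0]
    return lambda premise, hypothesis: majority


def accuracy(model: Model, test: List[Example]) -> float:
    """Fraction of test pairs whose predicted label matches the gold label."""
    correct = sum(model(p, h) == y for p, h, y in test)
    return correct / len(test)


def cross_benchmark_matrix(
    benchmarks: Dict[str, Dict[str, List[Example]]],
    train_fn: Callable[[List[Example]], Model],
) -> Dict[Tuple[str, str], float]:
    """Train on each corpus and evaluate on every corpus's test set."""
    results = {}
    for train_name, splits in benchmarks.items():
        model = train_fn(splits["train"])
        for test_name, test_splits in benchmarks.items():
            results[(train_name, test_name)] = accuracy(model, test_splits["test"])
    return results


# Toy data (assumed for illustration only; not taken from any real benchmark).
toy = {
    "corpus_a": {
        "train": [("p1", "h1", "entailment"), ("p2", "h2", "entailment"),
                  ("p3", "h3", "neutral")],
        "test": [("p4", "h4", "entailment"), ("p5", "h5", "neutral")],
    },
    "corpus_b": {
        "train": [("q1", "r1", "contradiction"), ("q2", "r2", "contradiction"),
                  ("q3", "r3", "neutral")],
        "test": [("q4", "r4", "contradiction"), ("q5", "r5", "contradiction")],
    },
}

matrix = cross_benchmark_matrix(toy, train_majority_baseline)
for (train_name, test_name), acc in sorted(matrix.items()):
    print(f"train={train_name} test={test_name} accuracy={acc:.2f}")
```

The diagonal entries (same corpus for training and testing) correspond to the usual in-benchmark score, while the off-diagonal entries correspond to the swapped-test-set setting; a large gap between the two is the kind of generalization failure the abstract reports.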

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning
Natural Language Processing > Resources & Methods > Natural Language Inference
Machine Learning > Learning Types > Transfer Learning
Natural Language Processing > Applications > Natural Language Inference
Deep Learning > Learning Types > Transfer Learning
Machine Learning > Learning Types > Generalization

Keywords

benchmark evaluation; transfer learning; natural language inference; pre-trained language model; neural network
