2019
ACL
Probing Neural Network Comprehension of Natural Language Arguments
Abstract
We are surprised to find that BERT's peak performance of 77% on the Argument Reasoning Comprehension Task reaches just three points below the average untrained human baseline. However, we show that this result is entirely accounted for by exploitation of spurious statistical cues in the dataset. We analyze the nature of these cues and demonstrate that a range of models all exploit them. This analysis informs the construction of an adversarial dataset on which all models achieve random accuracy. Our adversarial dataset provides a more robust assessment of argument comprehension and should be adopted as the standard in future work.
Topics
Artificial Intelligence > Core AI > Interpretability
Machine Learning > Learning Types > Adversarial Learning
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Resources & Methods > Natural Language Inference
Natural Language Processing > Applications > Natural Language Inference
Deep Learning > Learning Types > Adversarial Learning