Investigating BERT’s Knowledge of Language: Five Analysis Methods with NPIs

Alex Warstadt; Yu Cao; Ioana Grosu; Wei Peng; Hagen Blix; Yining Nie; Anna Alsop; Shikha Bordia; Haokun Liu; Alicia Parrish; Sheng-Fu Wang; Jason Phang; Anhad Mohananey; Phu Mon Htut; Paloma Jeretic; Samuel R. Bowman

EMNLP 2019

Abstract

Though state-of-the-art sentence representation models can perform tasks requiring significant knowledge of grammar, it is an open question how best to evaluate their grammatical knowledge. We explore five experimental methods inspired by prior work evaluating pretrained sentence representation models. We use a single linguistic phenomenon, negative polarity item (NPI) licensing, as a case study for our experiments. NPIs like "any" are grammatical only if they appear in a licensing environment like negation ("Sue doesn’t have any cats" vs. "*Sue has any cats"). This phenomenon is challenging because of the variety of NPI licensing environments that exist. We introduce an artificially generated dataset that manipulates key features of NPI licensing for the experiments. We find that BERT has significant knowledge of these features, but its success varies widely across different experimental methods. We conclude that a variety of methods is necessary to reveal all relevant aspects of a model’s grammatical knowledge in a given domain.

Topics

Artificial Intelligence > Core AI > Foundation Models; Artificial Intelligence > Core AI > Interpretability; Natural Language Processing > Resources & Methods > Language Modeling; Deep Learning > Models > Language Models

Keywords

pre-trained model; sentence representation; linguistic analysis; grammatical knowledge; negative polarity item; experimental method
