Probing for Bridging Inference in Transformer Language Models

Onkar Pandit; Yufang Hou

2021 NAACL NAACL 2021

Probing for Bridging Inference in Transformer Language Models

Abstract

AbstractWe probe pre-trained transformer language models for bridging inference. We first investigate individual attention heads in BERT and observe that attention heads at higher layers prominently focus on bridging relations in-comparison with the lower and middle layers, also, few specific attention heads concentrate consistently on bridging. More importantly, we consider language models as a whole in our second approach where bridging anaphora resolution is formulated as a masked token prediction task (Of-Cloze test). Our formulation produces optimistic results without any fine-tuning, which indicates that pre-trained language models substantially capture bridging inference. Our further investigation shows that the distance between anaphor-antecedent and the context provided to language models play an important role in the inference.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Natural Language Processing

🧭 Keyword Pioneer — bridging inference

🐣 Hot Topic Early Bird — attention head

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio

Authors

Onkar Pandit , Yufang Hou

Topics

Artificial Intelligence > Core AI > Interpretability Natural Language Processing > Understanding > Semantic Analysis Natural Language Processing > Resources & Methods > Large Language Models

Keywords

attention head pre-trained language model masked token prediction anaphora resolution transformer model bridging inference

Download PDF

Related papers

Knowledge Router: Learning Disentangled Representations for Knowledge Graphs 2021

Cross-Task Instance Representation Interactions and Label Dependencies for Joint Information Extraction with Graph Convolutional Networks 2021

Abstract Meaning Representation Guided Graph Encoding and Decoding for Joint Information Extraction 2021

Beyond Fair Pay: Ethical Implications of NLP Crowdsourcing 2021

Probing Word Translations in the Transformer and Trading Decoder for Encoder Layers 2021