Asking without Telling: Exploring Latent Ontologies in Contextual Representations

Julian Michael; Jan A. Botha; Ian Tenney

2020 EMNLP EMNLP 2020

Asking without Telling: Exploring Latent Ontologies in Contextual Representations

Abstract

AbstractThe success of pretrained contextual encoders, such as ELMo and BERT, has brought a great deal of interest in what these models learn: do they, without explicit supervision, learn to encode meaningful notions of linguistic structure? If so, how is this structure encoded? To investigate this, we introduce latent subclass learning (LSL): a modification to classifier-based probing that induces a latent categorization (or ontology) of the probe’s inputs. Without access to fine-grained gold labels, LSL extracts emergent structure from input representations in an interpretable and quantifiable form. In experiments, we find strong evidence of familiar categories, such as a notion of personhood in ELMo, as well as novel ontological distinctions, such as a preference for fine-grained semantic roles on core arguments. Our results provide unique new evidence of emergent structure in pretrained encoders, including departures from existing annotations which are inaccessible to earlier methods.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Interdisciplinary and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — pretrained contextual encoder

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Julian Michael , Jan A. Botha , Ian Tenney

Topics

Artificial Intelligence > Core AI > Interpretability Machine Learning > Core Methods > Representation Learning Natural Language Processing > Resources & Methods > Text Representation Interdisciplinary > Linguistics > Computational Linguistics Deep Learning > Learning Types > Representation Learning Natural Language Processing > Understanding > Semantics

Keywords

representation learning contextual representation pretrained language model latent structure contextual embedding probe classifier semantic role probe analysis pretrained contextual encoder latent ontology

Download PDF

Related papers

Fast semantic parsing with well-typedness guarantees 2020

Detecting Objectifying Language in Online Professor Reviews 2020

Analogous Process Structure Induction for Sub-event Sequence Prediction 2020

Aspect Sentiment Classification with Aspect-Specific Opinion Spans 2020

Robust and Interpretable Grounding of Spatial References with Relation Networks 2020