Knowledgeable or Educated Guess? Revisiting Language Models as Knowledge Bases

Boxi Cao; Hongyu Lin; Xianpei Han; Le Sun; Lingyong Yan; Meng Liao; Tong Xue; Jin Xu

ACL 2021

Abstract

Previous literature shows that pre-trained masked language models (MLMs) such as BERT can achieve competitive factual knowledge extraction performance on some datasets, indicating that MLMs can potentially be a reliable knowledge source. In this paper, we conduct a rigorous study to explore the underlying predicting mechanisms of MLMs over different extraction paradigms. By investigating the behaviors of MLMs, we find that the previously reported decent performance is mainly owed to biased prompts that overfit dataset artifacts. Furthermore, incorporating illustrative cases and external contexts improves knowledge prediction mainly due to entity type guidance and golden answer leakage. Our findings shed light on the underlying predicting mechanisms of MLMs, and strongly question the previous conclusion that current MLMs can potentially serve as reliable factual knowledge bases.

Topics

Natural Language Processing > Generation > Language Modeling
Natural Language Processing > Resources & Methods > Knowledge Editing
Natural Language Processing > Resources & Methods > Large Language Models
Natural Language Processing > Resources & Methods > Natural Language Inference
Machine Learning > Learning Types > Transfer Learning
Artificial Intelligence > Core AI > Large Language Models
Machine Learning > Learning Types > Evaluation
Deep Learning > Models > Language Models

Keywords

representation learning, knowledge extraction, prompt engineering, masked language model, factual knowledge, knowledge base, prompt bias, dataset artifact

Related papers

Out-of-Scope Intent Detection with Self-Supervision and Discriminative Training 2021

A Non-Autoregressive Edit-Based Approach to Controllable Text Simplification 2021

How Did This Get Funded?! Automatically Identifying Quirky Scientific Achievements 2021

Exploring Discourse Structures for Argument Impact Classification 2021

Language Embeddings for Typology and Cross-lingual Transfer Learning 2021