BERT Has Uncommon Sense: Similarity Ranking for Word Sense BERTology

Luke Gessler; Nathan Schneider

2021 EMNLP EMNLP 2021

BERT Has Uncommon Sense: Similarity Ranking for Word Sense BERTology

Abstract

AbstractAn important question concerning contextualized word embedding (CWE) models like BERT is how well they can represent different word senses, especially those in the long tail of uncommon senses. Rather than build a WSD system as in previous work, we investigate contextualized embedding neighborhoods directly, formulating a query-by-example nearest neighbor retrieval task and examining ranking performance for words and senses in different frequency bands. In an evaluation on two English sense-annotated corpora, we find that several popular CWE models all outperform a random baseline even for proportionally rare senses, without explicit sense supervision. However, performance varies considerably even among models with similar architectures and pretraining regimes, with especially large differences for rare word senses, revealing that CWE models are not all created equal when it comes to approximating word senses in their native representations.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Interdisciplinary and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — uncommon sense

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Luke Gessler , Nathan Schneider

Topics

Machine Learning > Core Methods > Metric Learning Machine Learning > Core Methods > Embedding Learning Natural Language Processing > Resources & Methods > Large Language Models Interdisciplinary > Linguistics > Computational Linguistics Natural Language Processing > Applications > Natural Language Inference Artificial Intelligence > Core AI > Language Deep Learning > Learning Types > Representation Learning Deep Learning > Models > Language Models

Keywords

word sense disambiguation nearest neighbor retrieval word embedding contextualized embedding contextualized word embedding similarity ranking word sense uncommon sense sense frequency

Download PDF

Related papers

Continual Learning in Multilingual NMT via Language-Specific Embeddings 2021

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents 2021

Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity 2021

Neural Machine Translation with Heterogeneous Topic Knowledge Embeddings 2021

Semantics-Preserved Data Augmentation for Aspect-Based Sentiment Analysis 2021