BERTering RAMS: What and How Much does BERT Already Know About Event Arguments? - A Study on the RAMS Dataset

Varun Gangal; Eduard Hovy

EMNLP 2020

Abstract

Using the attention-map-based probing framework of Clark et al. (2019), we observe that, on the RAMS dataset (Ebner et al., 2020), BERT’s attention heads have a modest but well-above-chance ability to spot event arguments without any training or domain finetuning, with accuracy varying from a low of 17.77% for Place to a high of 51.61% for Artifact. Next, we find that linear combinations of these heads, estimated with approximately 11% of the total available event argument detection supervision, can push performance considerably higher for some roles, the highest two being Victim (68.29% accuracy) and Artifact (58.82% accuracy). Furthermore, we investigate how well our methods do for cross-sentence event arguments. We propose a procedure to isolate “best heads” for cross-sentence argument detection separately from those for intra-sentence arguments. The heads thus estimated have superior cross-sentence performance compared to their jointly estimated equivalents, albeit only under the unrealistic assumption that we already know the argument is present in another sentence. Lastly, we seek to isolate the extent to which our numbers stem from lexical-frequency-based associations between gold arguments and roles. We propose NONCE, a scheme to create adversarial test examples by replacing gold arguments with randomly generated “nonce” words. We find that learnt linear combinations are robust to NONCE, though individual best heads can be more sensitive.
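As a rough illustration of the kind of per-head probing the abstract describes, the sketch below scores a single attention head on toy data. It assumes, in the style of Clark et al. (2019), that a head's prediction is the token the event trigger attends to most, and that a prediction counts as correct when it falls inside the gold argument span. The names `predict_argument` and `head_accuracy`, the trigger-to-argument direction, and the toy attention maps are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# attention[i][j] is the weight one head assigns from token i to token j.
Attention = List[List[float]]
# (attention map, trigger index, (gold span start, gold span end inclusive))
Example = Tuple[Attention, int, Tuple[int, int]]


def predict_argument(attention: Attention, trigger_idx: int) -> int:
    """Return the token the trigger attends to most, excluding itself."""
    row = attention[trigger_idx]
    candidates = [j for j in range(len(row)) if j != trigger_idx]
    return max(candidates, key=lambda j: row[j])


def head_accuracy(examples: List[Example]) -> float:
    """Fraction of examples whose predicted token lies in the gold span."""
    if not examples:
        return 0.0
    hits = 0
    for attention, trigger_idx, (start, end) in examples:
        pred = predict_argument(attention, trigger_idx)
        hits += int(start <= pred <= end)
    return hits / len(examples)


# Toy attention maps for one head over two 4-token inputs (made-up numbers).
toy_examples: List[Example] = [
    (
        [[0.7, 0.1, 0.1, 0.1],
         [0.1, 0.6, 0.1, 0.2],   # trigger row: strongest non-self weight on token 3
         [0.2, 0.2, 0.5, 0.1],
         [0.1, 0.1, 0.1, 0.7]],
        1,
        (3, 3),                  # gold argument is token 3, so this is a hit
    ),
    (
        [[0.2, 0.5, 0.2, 0.1],   # trigger row: strongest non-self weight on token 1
         [0.3, 0.3, 0.2, 0.2],
         [0.1, 0.1, 0.7, 0.1],
         [0.2, 0.2, 0.2, 0.4]],
        0,
        (2, 3),                  # gold argument spans tokens 2-3, so this is a miss
    ),
]

print(head_accuracy(toy_examples))
```

Running the same scoring over every layer and head, separately per role, would give a table from which a best head per role could be picked; how the paper itself selects heads or estimates their linear combinations is not specified in the abstract.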

Authors

Varun Gangal; Eduard Hovy

Topics

Machine Learning > Learning Types > Self-Supervised Learning; Natural Language Processing > Applications > Information Extraction; Computer Science > Applications > Document Analysis; Deep Learning > Models > Transformers; Deep Learning > Techniques > Attention

Keywords

few-shot learning; attention head; pretrained language model; event argument detection; attention probing; cross-sentence argument
