BERT, are you paying attention? Attention regularization with human-annotated rationales
Abstract
Attention regularisation aims to supervise the attention patterns in language models like BERT. Various studies have shown that using human-annotated rationales, in the form of highlights that explain why a text has a specific label, can have positive effects on model generalisability. In this work, we ask to what extent attention regularisation with human-annotated rationales improves model performance and robustness, and reduces susceptibility to spurious correlations. We compare regularisation on human rationales with regularisation on randomly selected tokens, a baseline which has hitherto remained unexplored. Our results suggest that attention regularisation with randomly selected tokens often yields improvements similar to those obtained with human-annotated rationales. Nevertheless, we find that human-annotated rationales surpass randomly selected tokens when it comes to reducing model sensitivity to strong spurious correlations.