Self-Attention Guided Copy Mechanism for Abstractive Summarization

Song Xu; Haoran Li; Peng Yuan; Youzheng Wu; Xiaodong He; Bowen Zhou

ACL 2020

Abstract

Copy modules are widely used in recent abstractive summarization models, enabling the decoder to copy words from the source into the summary. Generally, the encoder-decoder attention serves as the copy distribution, but how to guarantee that important words in the source are copied remains a challenge. In this work, we propose a Transformer-based model to enhance the copy mechanism. Specifically, we identify the importance of each source word by its degree centrality in a directed graph built from the self-attention layer of the Transformer. We use the centrality of each source word to guide the copy process explicitly. Experimental results show that the self-attention graph provides useful guidance for the copy distribution. Our proposed models significantly outperform the baseline methods on the CNN/Daily Mail dataset and the Gigaword dataset.
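
The idea of scoring source words by degree centrality on a self-attention graph can be illustrated with a minimal sketch. It assumes the graph's edge weights are taken directly from one self-attention matrix (row i is word i's attention over all source words) and that centrality is the in-degree, i.e. the column sum. The multiplicative reweighting of the encoder-decoder attention is a simplification chosen here for illustration and is not necessarily the formulation used in the paper. The function names and toy numbers are hypothetical.

```python
def indegree_centrality(self_attn):
    """In-degree centrality of each source word.

    self_attn[i][j] is the attention weight from word i to word j,
    read as a directed edge i -> j. The centrality of word j is the
    total weight of its incoming edges (the column sum).
    """
    n = len(self_attn)
    return [sum(self_attn[i][j] for i in range(n)) for j in range(n)]


def guided_copy_distribution(enc_dec_attn, centrality):
    """Reweight encoder-decoder attention by centrality and renormalize.

    Assumption: a simple product of the two scores, used only to show
    how centrality could shift copy probability toward central words.
    """
    scores = [a * c for a, c in zip(enc_dec_attn, centrality)]
    total = sum(scores)
    if total == 0:
        raise ValueError("all guided scores are zero")
    return [s / total for s in scores]


# Toy example with three source words; each row sums to 1.
self_attn = [
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.6, 0.1],
]
enc_dec_attn = [0.4, 0.3, 0.3]  # plain copy distribution at one decoding step

centrality = indegree_centrality(self_attn)
copy_dist = guided_copy_distribution(enc_dec_attn, centrality)
print(centrality)
print(copy_dist)
```

In this toy case the second word receives most of the incoming attention, so the guided distribution favors it even though the unguided encoder-decoder attention slightly prefers the first word.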
