On the Robustness of Self-Attentive Models

Yu-Lun Hsieh; Minhao Cheng; Da-Cheng Juan; Wei Wei; Wen-Lian Hsu; Cho-jui Hsieh

2019 ACL ACL 2019

On the Robustness of Self-Attentive Models

Abstract

AbstractThis work examines the robustness of self-attentive neural networks against adversarial input perturbations. Specifically, we investigate the attention and feature extraction mechanisms of state-of-the-art recurrent neural networks and self-attentive architectures for sentiment analysis, entailment and machine translation under adversarial attacks. We also propose a novel attack algorithm for generating more natural adversarial examples that could mislead neural models but not humans. Experimental results show that, compared to recurrent neural models, self-attentive models are more robust against adversarial perturbation. In addition, we provide theoretical explanations for their superior robustness to support our claims.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

📈 Trend Setter — Adversarial Learning

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Yu-Lun Hsieh , Minhao Cheng , Da-Cheng Juan , Wei Wei , Wen-Lian Hsu , Cho-jui Hsieh

Topics

Machine Learning > Application Areas > Efficient Computing Deep Learning > Architectures > Transformers Deep Learning > Architectures > Neural Networks Artificial Intelligence > Core AI > Natural Language Processing Deep Learning > Techniques > Adversarial Learning

Keywords

sentiment analysis machine translation adversarial attack adversarial example neural network

Download PDF

Related papers

What do phone embeddings learn about Phonology? 2019

Unsupervised Morphological Segmentation for Low-Resource Polysynthetic Languages 2019

Understanding Undesirable Word Embedding Associations 2019

Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text 2019

Domain Adaptation of Neural Machine Translation by Lexicon Induction 2019