Generating Fluent Adversarial Examples for Natural Languages

Huangzhao Zhang; Hao Zhou; Ning Miao; Lei Li

ACL 2019

Abstract

Efficiently building an adversarial attacker for natural language processing (NLP) tasks is a real challenge. First, because the sentence space is discrete, it is difficult to make small perturbations along the direction of the gradients. Second, the fluency of the generated examples cannot be guaranteed. In this paper, we propose MHA, which addresses both problems by performing Metropolis-Hastings sampling with a proposal distribution designed under the guidance of gradients. Experiments on IMDB and SNLI show that MHA outperforms the baseline model in attacking capability. Adversarial training with MHA also leads to better robustness and performance.
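The sampling idea named in the abstract can be illustrated with a minimal Metropolis-Hastings sketch over word sequences. This is not the authors' implementation: the vocabulary, `toy_fluency`, `toy_victim_prob`, and the uniform word-replacement proposal are all hypothetical stand-ins chosen so the example runs without any model. In particular, the gradient-guided proposal described in the abstract is not reproduced here; a uniform, symmetric proposal is swapped in, which lets the acceptance ratio reduce to a ratio of target densities.

```python
import math
import random

# Hypothetical toy vocabulary (assumption, not from the paper).
VOCAB = ["good", "great", "fine", "bad", "awful", "movie", "film", "the", "a"]


def toy_fluency(sentence):
    """Stand-in for a language-model score (assumption).

    Penalizes immediate word repetitions; always strictly positive.
    """
    repeats = sum(1 for a, b in zip(sentence, sentence[1:]) if a == b)
    return math.exp(-2.0 * repeats)


def toy_victim_prob(sentence):
    """Stand-in for the victim classifier's probability of the
    attacker's desired label (assumption); always in (0, 1)."""
    negative = sum(1 for w in sentence if w in ("bad", "awful"))
    return (1 + negative) / (2 + len(sentence))


def target_density(sentence):
    """Unnormalized target: fluent AND likely to fool the victim."""
    return toy_fluency(sentence) * toy_victim_prob(sentence)


def mh_step(sentence, rng):
    """One Metropolis-Hastings step with a uniform replacement proposal.

    The proposal picks a position and a vocabulary word uniformly, so it
    is symmetric as long as every token of `sentence` is in VOCAB.
    """
    pos = rng.randrange(len(sentence))
    proposal = list(sentence)
    proposal[pos] = rng.choice(VOCAB)
    # Symmetric proposal: acceptance ratio is pi(x') / pi(x).
    alpha = min(1.0, target_density(proposal) / target_density(sentence))
    return proposal if rng.random() < alpha else list(sentence)


def run_chain(start, steps, seed=0):
    """Run the chain for `steps` iterations from `start`."""
    rng = random.Random(seed)
    current = list(start)
    for _ in range(steps):
        current = mh_step(current, rng)
    return current


if __name__ == "__main__":
    print(run_chain(["the", "movie", "good"], steps=200, seed=1))
```

With a non-symmetric proposal, such as one biased by gradient information, the acceptance ratio would also need the proposal correction term q(x | x') / q(x' | x).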

Topics

Machine Learning > Learning Types > Adversarial Learning
Natural Language Processing > Applications > Text Classification
Artificial Intelligence > Core AI > Adversarial Learning
Deep Learning > Learning Types > Adversarial Learning
Artificial Intelligence > Core AI > Language
Deep Learning > Learning Types > Robustness

Keywords

natural language processing, text classification, Metropolis-Hastings sampling, adversarial example, gradient-based attack, NLP robustness, gradient-guided perturbation, text perturbation, natural language robustness
