ARCH: Efficient Adversarial Regularized Training with Caching

Simiao Zuo; Chen Liang; Haoming Jiang; Pengcheng He; Xiaodong Liu; Jianfeng Gao; Weizhu Chen; Tuo Zhao

EMNLP 2021

Abstract

Adversarial regularization can improve model generalization in many natural language processing tasks. However, conventional approaches are computationally expensive because they must generate a perturbation for each sample in each epoch. We propose a new adversarial regularization method, ARCH (adversarial regularization with caching), in which perturbations are generated and cached once every several epochs. Because caching all the perturbations raises memory concerns, we adopt a K-nearest neighbors-based strategy to tackle this issue. The strategy requires caching only a small number of perturbations and introduces no additional training time. We evaluate the proposed method on a set of neural machine translation and natural language understanding tasks. We observe that ARCH significantly eases the computational burden, saving up to 70% of computational time compared with conventional approaches. More surprisingly, by reducing the variance of stochastic gradients, ARCH yields model generalization that is notably better on most tasks and comparable on the rest. Our code is publicly available.
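The caching idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `PerturbationCache` class, its `refresh` and `lookup` methods, the averaging over neighbors, and the `toy_perturbation` stand-in are all hypothetical names and choices made for illustration. The sketch assumes perturbations are stored for a small set of sample embeddings and that a new sample borrows the perturbations of its K nearest cached neighbors.

```python
import numpy as np


class PerturbationCache:
    """Hypothetical cache holding perturbations for a small subset of samples."""

    def __init__(self, k=3):
        self.k = k
        self.keys = None    # (m, d) embeddings of the cached samples
        self.values = None  # (m, d) perturbations for those samples

    def refresh(self, embeddings, make_perturbation):
        # Meant to be called once every several epochs, not every step.
        self.keys = np.asarray(embeddings, dtype=float)
        self.values = np.stack([make_perturbation(e) for e in self.keys])

    def lookup(self, query):
        # Assumption: average the perturbations of the k nearest cached samples.
        dists = np.linalg.norm(self.keys - query, axis=1)
        nearest = np.argsort(dists)[: self.k]
        return self.values[nearest].mean(axis=0)


def toy_perturbation(embedding, eps=0.1):
    # Stand-in for an expensive gradient-based perturbation (illustrative only).
    return eps * np.sign(embedding)


rng = np.random.default_rng(0)
cached_embeddings = rng.normal(size=(8, 4))

cache = PerturbationCache(k=3)
cache.refresh(cached_embeddings, toy_perturbation)

# A new sample reuses cached perturbations instead of computing its own.
delta = cache.lookup(rng.normal(size=4))
```

In this sketch the expensive step runs only inside `refresh`, so the per-sample cost between refreshes is a nearest-neighbor search over the cached keys. How ARCH actually selects the cached samples and combines neighbor perturbations is specified in the paper, not here.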

Authors

Simiao Zuo, Chen Liang, Haoming Jiang, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao

Topics

Machine Learning > Learning Types > Adversarial Learning

Machine Learning > Application Areas > Efficient Computing

Natural Language Processing > Generation > Machine Translation

Deep Learning > Optimization & Theory > Neural Network Optimization

Deep Learning > Learning Types > Adversarial Learning

Deep Learning > Optimization & Theory > Efficient Computing

Keywords

neural machine translation, adversarial training, natural language understanding, k-nearest neighbor, model generalization, gradient variance, adversarial regularization, gradient caching

Related papers

Continual Learning in Multilingual NMT via Language-Specific Embeddings 2021

MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents 2021

Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity 2021

Neural Machine Translation with Heterogeneous Topic Knowledge Embeddings 2021

Semantics-Preserved Data Augmentation for Aspect-Based Sentiment Analysis 2021