2022
EMNLP
EMNLP 2022
VarMAE: Pre-training of Variational Masked Autoencoder for Domain-adaptive Language Understanding
Abstract
AbstractPre-trained language models have been widely applied to standard benchmarks. Due to the flexibility of natural language, the available resources in a certain domain can be restricted to support obtaining precise representation. To address this issue, we propose a novel Transformer-based language model named VarMAE for domain-adaptive language understanding. Under the masked autoencoding objective, we design a context uncertainty learning module to encode the token’s context into a smooth latent distribution. The module can produce diverse and well-formed contextual representations. Experiments on science- and finance-domain NLU tasks demonstrate that VarMAE can be efficiently adapted to new domains with limited resources.
🌉
Interdisciplinary Bridge
— Deep Learning and Machine Learning and Natural Language Processing
🐣
Hot Topic Early Bird
— masked autoencoder
🐝
Cross-Pollinator
— Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio
Authors
Topics
Machine Learning > Core Methods > Representation Learning
Machine Learning > Application Areas > Domain Adaptation
Deep Learning > Models > Variational Inference
Natural Language Processing > Resources & Methods > Large Language Models
Natural Language Processing > Resources & Methods > Transfer Learning
Machine Learning > Learning Types > Domain Adaptation
Deep Learning > Learning Types > Self-Supervised Learning
Deep Learning > Models > Transformers
Deep Learning > Learning Types > Domain Adaptation