ACL 2020
Do you have the right scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods
Abstract
It has become a common approach to pre-train a language model on a large corpus and then fine-tune it on task-specific data. In practice, we observe that fine-tuning a pre-trained model on a small dataset may lead to over- and/or under-estimation problems. In this paper, we propose MC-Tailor, a novel method that alleviates this issue in text generation tasks by truncating the probability mass in over-estimated regions and transferring it to under-estimated ones. Experiments on a variety of text generation datasets show that MC-Tailor consistently and significantly outperforms the fine-tuning approach.
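To make the idea of truncating and transferring probability mass concrete, the following is a minimal toy sketch, not the paper's implementation. It assumes the ratio between a target distribution and the model distribution is known exactly, whereas in practice it would have to be estimated. The four-item vocabulary, the probabilities, and the function names `acceptance_prob`, `tailored_distribution`, and `sample_tailored` are all hypothetical. Samples drawn from the model are accepted with probability min(1, target/model), so only over-estimated items are ever rejected, and renormalizing over the accepted samples shifts the removed mass to the remaining items.

```python
import random
from collections import Counter

# Hypothetical toy distributions over four "texts" (numbers are illustrative only).
target = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}  # stand-in for the data distribution
model = {"a": 0.55, "b": 0.25, "c": 0.15, "d": 0.05}   # model that over-estimates "a"


def acceptance_prob(x):
    """Accept x with probability min(1, target/model).

    Items the model over-estimates (ratio < 1) are sometimes rejected;
    all other items are always accepted.
    """
    return min(1.0, target[x] / model[x])


def tailored_distribution():
    """Exact distribution of accepted samples, proportional to min(model, target)."""
    unnorm = {x: model[x] * acceptance_prob(x) for x in model}
    z = sum(unnorm.values())
    return {x: w / z for x, w in unnorm.items()}


def sample_tailored(n, seed=0):
    """Draw n accepted samples by rejection sampling from the model."""
    rng = random.Random(seed)
    items, weights = zip(*model.items())
    accepted = []
    while len(accepted) < n:
        x = rng.choices(items, weights=weights)[0]
        if rng.random() < acceptance_prob(x):
            accepted.append(x)
    return Counter(accepted)


if __name__ == "__main__":
    print("model:   ", model)
    print("tailored:", tailored_distribution())
    print("sampled: ", sample_tailored(2000))
```

In this toy setting the over-estimated item "a" loses mass while the under-estimated items "c" and "d" gain some. The result moves closer to the target but does not recover it exactly, since the truncation only removes mass and relies on renormalization to redistribute it.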
Topics
Machine Learning > Optimization & Theory > Optimization
Machine Learning > Optimization & Theory > Stochastic Processes
Natural Language Processing > Generation > Text Generation
Natural Language Processing > Resources & Methods > Large Language Models
Machine Learning > Learning Types > Transfer Learning
Artificial Intelligence > Core AI > Large Language Models
Deep Learning > Learning Types > Representation Learning
Deep Learning > Learning Types > Fine-Tuning