Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models

Tianxing He; Bryan McCann; Caiming Xiong; Ehsan Hosseini-Asl

EACL 2021

Abstract

In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach better calibration, competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder, and compare them in experiments. Due to the discreteness of text data, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
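For readers unfamiliar with NCE, the following is a minimal sketch of the standard binary NCE loss for one real sample and K noise samples. It is not taken from the paper: the function names (`log_sigmoid`, `nce_loss`), the argument layout, and the choice to pass precomputed energies and noise log-probabilities as plain floats are all assumptions made for illustration. In practice the energies would come from a head on top of the text encoder and the log-probabilities from the noise model.

```python
import math


def log_sigmoid(z: float) -> float:
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def nce_loss(data_energy, data_log_q, noise_energies, noise_log_qs):
    """Binary NCE loss for one real sample and K noise samples (illustrative).

    data_energy    : energy E(x+) the model assigns to the real sample
    data_log_q     : log q(x+) under the noise model
    noise_energies : energies E(x-_k) of the K noise samples
    noise_log_qs   : log q(x-_k) of the K noise samples
    """
    k = len(noise_energies)
    if k == 0 or k != len(noise_log_qs):
        raise ValueError("need K >= 1 noise samples with matching log-probs")

    def logit(energy, log_q):
        # Log-odds that a sample is real rather than noise:
        # -E(x) - log(K * q(x)), treating exp(-E(x)) as an unnormalized density.
        return -energy - math.log(k) - log_q

    # The real sample should be classified as "data".
    loss = -log_sigmoid(logit(data_energy, data_log_q))
    # Each noise sample should be classified as "noise";
    # log(1 - sigmoid(z)) equals log_sigmoid(-z).
    for energy, log_q in zip(noise_energies, noise_log_qs):
        loss -= log_sigmoid(-logit(energy, log_q))
    return loss
```

Under this sketch, the loss falls when the model assigns lower energy to real text and higher energy to noise samples, which is the behavior NCE training is meant to encourage.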

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Optimization
Natural Language Processing > Understanding > Semantic Analysis

Keywords

model calibration, masked language model, natural language understanding, energy-based model, text encoder, noise contrastive estimation
