DistillCSE: Distilled Contrastive Learning for Sentence Embeddings

Jiahao Xu; Wei Shao; Lihui Chen; Lemao Liu

EMNLP 2023

Abstract

This paper proposes the DistillCSE framework, which performs contrastive learning under the self-training paradigm with knowledge distillation. The potential advantage of DistillCSE is its self-enhancing feature: by using a base model to provide additional supervision signals, a stronger model may be learned through knowledge distillation. However, vanilla DistillCSE, which uses the standard implementation of knowledge distillation, achieves only marginal improvements. Quantitative analyses show why: under standard knowledge distillation, the teacher model’s logits exhibit relatively large variance, a consequence of the nature of contrastive learning. To mitigate the issues induced by this high variance, the paper proposes two simple yet effective solutions for knowledge distillation: a Group-P shuffling strategy that acts as an implicit regularizer, and averaging the logits from multiple teacher components. Experiments on standard benchmarks demonstrate that the proposed DistillCSE outperforms many strong baseline methods and achieves new state-of-the-art performance.
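
The second remedy, averaging the logits of several teachers before distilling, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the temperature value, the toy similarity scores, and the choice of a KL-divergence loss are all assumptions made for illustration, and the Group-P shuffling strategy is not shown because the abstract does not specify it.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def average_teacher_logits(teacher_logits):
    """Element-wise mean of the logits produced by several teachers."""
    n = len(teacher_logits)
    return [sum(column) / n for column in zip(*teacher_logits)]

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) over one anchor's similarity logits (assumed loss)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical similarity scores of one anchor sentence against three
# candidates, as scored by three teacher models (values are made up).
teachers = [
    [0.9, 0.2, 0.1],
    [0.7, 0.4, 0.1],
    [0.8, 0.3, 0.1],
]
avg = average_teacher_logits(teachers)   # smoother target than any single teacher
student = [0.6, 0.3, 0.2]
loss = distillation_loss(student, avg, temperature=0.05)
```

Under these assumptions, the student is trained toward the averaged target instead of a single teacher's noisier logits, which is the variance-reduction idea the abstract describes.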

Topics

Machine Learning > Core Methods > Embedding Learning
Machine Learning > Learning Types > Contrastive Learning
Machine Learning > Learning Types > Self-Supervised Learning
Machine Learning > Learning Types > Knowledge Distillation
Natural Language Processing > Resources & Methods > Text Representation
Deep Learning > Techniques > Contrastive Learning
Deep Learning > Techniques > Knowledge Distillation
Deep Learning > Learning Types > Self-Supervised Learning
Deep Learning > Learning Types > Contrastive Learning

Keywords

model compression, representation learning, contrastive learning, self-supervised learning, knowledge distillation, sentence embedding
