SumCSE: Summary as a transformation for Contrastive Learning

Raghuveer Thirukovalluru; Xiaolan Wang; Jun Chen; Shuyang Li; Jie Lei; Rong Jin; Bhuwan Dhingra

2024 NAACL NAACL 2024

SumCSE: Summary as a transformation for Contrastive Learning

Abstract

AbstractSentence embedding models are typically trained using contrastive learning (CL), either using human annotations directly or by repurposing other annotated datasets. In this work, we explore the recently introduced paradigm of generating CL data using generative language models (LM). In CL for computer vision (CV), compositional transformations (series of operations applied over an image. e.g. cropping + color distortion) which modify the input/image to retain minimal information were shown to be very effective. We show that composition of a ‘Summary’ transformation with diverse paraphrasing/contradicting transformations accomplishes the same and works very well in CL for sentence embeddings. Our final generated dataset (using Vicuna-13B) significantly outperforms the previous best unsupervised method (using ChatGPT) by 1.8 points, and SimCSE, a strong supervised baseline by 0.3 points on the semantic text similarity (STS) benchmark.

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Raghuveer Thirukovalluru , Xiaolan Wang , Jun Chen , Shuyang Li , Jie Lei , Rong Jin , Bhuwan Dhingra

Topics

Machine Learning > Core Methods > Embedding Learning Machine Learning > Learning Types > Contrastive Learning Machine Learning > Learning Types > Self-Supervised Learning

Keywords

unsupervised learning contrastive learning sentence embedding semantic text similarity generative language model

Download PDF

Related papers

Working Alliance Transformer for Psychotherapy Dialogue Classification 2024

Named Entity Recognition Under Domain Shift via Metric Learning for Life Sciences 2024

Assessing Logical Puzzle Solving in Large Language Models: Insights from a Minesweeper Case Study 2024

TelME: Teacher-leading Multimodal Fusion Network for Emotion Recognition in Conversation 2024

Extractive Summarization with Text Generator 2024