Whitening Not Recommended for Classification Tasks in LLMs

Ali Forooghi; Shaghayegh Sadeghi; Jianguo Lu

ACL 2024

Abstract

Sentence embedding is a cornerstone of NLP. Whitening has been claimed to be an effective method for improving sentence embeddings obtained from Large Language Models (LLMs). However, we find that the effectiveness of whitening is model-dependent and task-dependent. In particular, whitening degrades embeddings for classification tasks. This conclusion is supported by extensive experiments. A by-product of our research is an embedding evaluation platform for LLMs called SentEval+.
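For readers unfamiliar with the operation, the following is a minimal sketch of one common form of whitening (PCA whitening), which centers a set of embeddings and rescales them so that their covariance becomes approximately the identity. The abstract does not state which whitening variant the paper evaluates, so the choice of PCA whitening, the function name `whiten`, the `eps` parameter, and the synthetic data are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def whiten(embeddings, eps=1e-8):
    """PCA-whiten the rows of `embeddings` (shape: n_samples x dim).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # Center each dimension at zero.
    mu = embeddings.mean(axis=0, keepdims=True)
    centered = embeddings - mu
    # Sample covariance matrix (dim x dim).
    cov = centered.T @ centered / (embeddings.shape[0] - 1)
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Rotate onto the eigenvectors and scale each axis to unit variance.
    # eps guards against division by a near-zero eigenvalue.
    transform = eigvecs / np.sqrt(eigvals + eps)
    return centered @ transform

# Synthetic correlated vectors standing in for sentence embeddings (assumption).
rng = np.random.default_rng(0)
mix = np.eye(16) + 0.1 * rng.normal(size=(16, 16))
X = rng.normal(size=(2000, 16)) @ mix

Z = whiten(X)
cov_z = np.cov(Z, rowvar=False)
print("max deviation from identity:", np.abs(cov_z - np.eye(16)).max())
```

After this transform every direction in the embedding space has equal variance and the dimensions are decorrelated. The abstract reports that the effect on downstream quality depends on the model and the task.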

Authors

Ali Forooghi, Shaghayegh Sadeghi, Jianguo Lu

Topics

Machine Learning > Core Methods > Classification
Natural Language Processing > Resources & Methods > Text Representation
Natural Language Processing > Resources & Methods > Language Modeling
Deep Learning > Models > Large Language Models
Deep Learning > Learning Types > Representation Learning

Keywords

classification task, sentence embedding, whitening transformation, large language model, embedding evaluation
