HMCL: Task-Optimal Text Representation Adaptation through Hierarchical Contrastive Learning

Zhenyi Wang; Yapeng Jia; Haiyan Ning; Peng Wang; Dan Wang; Yitao Cao

EMNLP 2025

Abstract

As general-purpose large language models continue to advance, adapting them to real-world tasks through effective fine-tuning remains a significant challenge. We introduce Hierarchical Multilevel Contrastive Learning (HMCL), a new contrastive learning framework that improves task-specific text representations for general models. HMCL integrates three-level semantic differentiation (positive, weak-positive, and negative) and unifies contrastive learning, pair classification, and ranking objectives into a single optimization strategy. HMCL achieves strong results across multi-domain and multilingual benchmarks, including text similarity, retrieval, reranking, and Retrieval-Augmented Generation (RAG) tasks. It outperforms top unsupervised methods and supervised fine-tuning approaches while remaining compatible with architectures ranging from BERT to Qwen, at scales from 330M to 7B parameters. In real-world merchant consultation scenarios, HMCL shows a 0.70 to 6.24 point improvement over the original fine-tuning methods on large-scale base models. This establishes HMCL as a versatile solution that bridges the gap between general-purpose models and specialized industrial applications.
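To make the idea of three-level semantic differentiation concrete, the following is a minimal illustrative sketch, not the paper's actual objective, which the abstract does not specify. It assumes an InfoNCE-style contrastive term that pulls the positive above the negatives, plus a margin-based ranking term that encourages the ordering positive > weak-positive > negative. All function names, the temperature, the margin, and the weighting are hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(sim_pos, sim_negs, temperature=0.05):
    """InfoNCE-style loss: negative log-softmax of the positive over all candidates."""
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]

def hinge(higher, lower, margin):
    """Penalty when `higher` does not exceed `lower` by at least `margin`."""
    return max(0.0, margin - (higher - lower))

def three_level_loss(anchor, positive, weak_positive, negatives,
                     temperature=0.05, margin=0.1, rank_weight=1.0):
    """Hypothetical combined loss over positive, weak-positive, and negative texts.

    Assumption: contrastive term (positive vs. negatives) plus a ranking term
    enforcing sim(positive) > sim(weak_positive) > sim(negative).
    """
    s_pos = cosine(anchor, positive)
    s_weak = cosine(anchor, weak_positive)
    s_negs = [cosine(anchor, n) for n in negatives]

    contrastive = info_nce(s_pos, s_negs, temperature)
    ranking = hinge(s_pos, s_weak, margin)
    ranking += sum(hinge(s_weak, s, margin) for s in s_negs) / len(s_negs)
    return contrastive + rank_weight * ranking

# Toy 2-D "embeddings": a well-ordered triple versus one where the positive is far away.
anchor = [1.0, 0.0]
well_ordered = three_level_loss(anchor, [1.0, 0.0], [1.0, 1.0], [[0.0, 1.0], [-1.0, 0.0]])
mis_ordered = three_level_loss(anchor, [-1.0, 0.0], [1.0, 1.0], [[0.0, 1.0], [1.0, 0.0]])
```

In this sketch, the loss is near zero when the similarities already respect the three-level ordering and grows when the positive is ranked below the weak-positive or the negatives. A real implementation would operate on batched encoder outputs and might also include the pair-classification objective mentioned in the abstract, which is omitted here.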

Topics

Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Contrastive Learning
Machine Learning > Learning Types > Transfer Learning
Natural Language Processing > Resources & Methods > Transfer Learning
Deep Learning > Learning Types > Contrastive Learning

Keywords

contrastive learning; information retrieval; text representation; retrieval-augmented generation; semantic differentiation; hierarchical contrastive learning; representation adaptation
