2025 EMNLP EMNLP 2025

Exploring and Detecting Self-disclosure in Multi-modal posts on Chinese Social Media

Abstract

AbstractSelf-disclosure can provide psychological comfort and social support, but it also carries the risk of unintentionally revealing sensitive information, leading to serious privacy concerns. Research on self-disclosure in Chinese multimodal contexts remains limited, lacking high-quality corpora, analysis, and methods for detection. This work focuses on self-disclosure behaviors on Chinese multimodal social media platforms and constructs a high-quality text-image corpus to address this critical data gap. We systematically analyze the distribution of self-disclosure types, modality preferences, and their relationship with user intent, uncovering expressive patterns unique to the Chinese multimodal context. We also fine-tune five multimodal large language models to enhance self-disclosure detection in multimodal scenarios. Among these models, the Qwen2.5-omni-7B achieved a strong performance, with a partial span F1 score of 88.2%. This study provides a novel research perspective on multimodal self-disclosure in the Chinese context.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning
🧭 Keyword Pioneer — text-image corpus
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Speech & Audio