ACL 2025
mStyleDistance: Multilingual Style Embeddings and their Evaluation
Abstract
Style embeddings are useful for stylistic analysis and style transfer, yet they exist only for English. We introduce Multilingual StyleDistance (mStyleDistance), a method that can generate style embeddings in new languages using synthetic data and a contrastive loss. We create style embeddings in nine languages and a multilingual STEL-or-Content benchmark (Wegmann et al., 2022) that serves to assess their quality. We also employ our embeddings in an authorship verification task involving different languages. Our results show that mStyleDistance embeddings outperform existing style embeddings on these benchmarks and generalize well to unseen features and languages. We make our models and datasets publicly available.
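To make the idea of training style embeddings with a contrastive loss concrete, here is a minimal sketch. It is not the paper's implementation: the abstract does not specify the loss, so the triplet formulation, the cosine distance, the `margin` value, and the toy vectors are all assumptions chosen for illustration. The anchor and positive are assumed to share a style while differing in content, and the negative is assumed to share content while differing in style.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.1):
    """Hinge-style triplet loss (an assumed formulation, not the paper's).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin`; otherwise it grows linearly.
    """
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-d "embeddings", made up for illustration only.
anchor = [1.0, 0.0]
same_style = [1.0, 0.0]    # assumed: same style, different content
other_style = [0.0, 1.0]   # assumed: same content, different style

# Well-separated styles incur no loss; swapping the roles is penalized.
loss_good = triplet_loss(anchor, same_style, other_style)
loss_bad = triplet_loss(anchor, other_style, same_style)
```

In this toy setup the first triplet is already well separated and contributes no loss, while the swapped triplet is penalized. A real training loop would compute the same quantity on encoder outputs and backpropagate through the encoder.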
Topics
Machine Learning > Core Methods > Embedding Learning
Machine Learning > Learning Types > Contrastive Learning
Natural Language Processing > Resources & Methods > Multilingual NLP
Natural Language Processing > Resources & Methods > Text Representation
Interdisciplinary > Linguistics > Computational Linguistics
Deep Learning > Techniques > Contrastive Learning