Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization

Sahil Wadhwa; Chengtian Xu; Haoming Chen; Aakash Mahalingam; Akankshya Kar; Divya Chaudhary

2025 COLING COLING 2025

Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization

Abstract

AbstractThe automatic generation of counter-speech (CS) is a critical strategy for addressing hate speech by providing constructive and informed responses. However, existing methods often fail to generate high-quality, impactful, and scalable CS, particularly across diverse lin- guistic contexts. In this paper, we propose a novel methodology to enhance CS generation by aligning Large Language Models (LLMs) using Supervised Fine-Tuning (SFT) and Di- rect Preference Optimization (DPO). Our ap- proach leverages DPO to align LLM outputs with human preferences, ensuring contextu- ally appropriate and linguistically adaptable responses. Additionally, we incorporate knowl- edge grounding to enhance the factual accuracy and relevance of generated CS. Experimental results demonstrate that DPO-aligned models significantly outperform SFT baselines on CS benchmarks while scaling effectively to mul- tiple languages. These findings highlight the potential of preference-based alignment tech- niques to advance CS generation across var- ied linguistic settings. The model supervision and alignment is done in English and the same model is used for reporting metrics across other languages like Basque, Italian, and Spanish.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Machine Learning and Natural Language Processing

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Sahil Wadhwa , Chengtian Xu , Haoming Chen , Aakash Mahalingam , Akankshya Kar , Divya Chaudhary

Topics

Natural Language Processing > Generation > Text Generation Machine Learning > Learning Types > Reinforcement Learning Artificial Intelligence > Core AI > Large Language Models Machine Learning > Learning Types > Fine-Tuning

Keywords

direct preference optimization supervised fine-tuning human preference alignment knowledge grounding multilingual natural language processing counterspeech generation large language model

Download PDF

Related papers

Navigating Dialectal Bias and Ethical Complexities in Levantine Arabic Hate Speech Detection 2025

TaCIE: Enhancing Instruction Comprehension in Large Language Models through Task-Centred Instruction Evolution 2025

Positive Text Reframing under Multi-strategy Optimization 2025

RAM2C: A Liberal Arts Educational Chatbot based on Retrieval-augmented Multi-role Multi-expert Collaboration 2025

Two-stage Incomplete Utterance Rewriting on Editing Operation 2025