SemEval-2025 Task 4: Unlearning sensitive content from Large Language Models

Anil Ramakrishna; Yixin Wan; Xiaomeng Jin; Kai-Wei Chang; Zhiqi Bu; Bhanukiran Vinzamuri; Volkan Cevher; Mingyi Hong; Rahul Gupta

SemEval 2025

Abstract

We introduce SemEval-2025 Task 4: unlearning sensitive content from Large Language Models (LLMs). The task features 3 subtasks for LLM unlearning spanning different use cases: (1) unlearn long-form synthetic creative documents spanning different genres; (2) unlearn short-form synthetic biographies containing personally identifiable information (PII), including fake names, phone numbers, SSNs, email addresses, and home addresses; and (3) unlearn real documents sampled from the target model’s training dataset. We received over 100 submissions from over 30 institutions, and we summarize the key techniques and lessons in this paper.
