Privacy

626 directly classified papers

Papers

PBI-Attack: Prior-Guided Bimodal Interactive Black-Box Jailbreak Attack for Toxicity Maximization NAACL 2025

Break the Breakout: Reinventing LM Defense Against Jailbreak Attacks with Self-Refine NAACL 2025

VLA-Mark: A cross modal watermark for large vision-language alignment models EMNLP 2025

TrojanStego: Your Language Model Can Secretly Be A Steganographic Privacy Leaking Agent EMNLP 2025

RecordTwin: Towards Creating Safe Synthetic Clinical Corpora ACL 2025

EntropyMark: Towards More Harmless Backdoor Watermark via Entropy-based Constraint for Open-source Dataset Copyright Protection CVPR 2025

Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack ACL 2025

Variance-Based Membership Inference Attacks Against Large-Scale Image Captioning Models CVPR 2025

CLEAR: Character Unlearning in Textual and Visual Modalities ACL 2025

Attention to Neural Plagiarism: Diffusion Models Can Plagiarize Your Copyrighted Images! ICCV 2025

Mind the Cost of Scaffold! Benign Clients May Even Become Accomplices of Backdoor Attack ICCV 2025

Membership Inference Attacks with False Discovery Rate Control ICCV 2025

ALGEN: Few-shot Inversion Attacks on Textual Embeddings via Cross-Model Alignment and Generation ACL 2025

Merger-as-a-Stealer: Stealing Targeted PII from Aligned LLMs with Model Merging EMNLP 2025

Backdoor Mitigation by Distance-Driven Detoxification ICCV 2025

Stealthy Backdoor Attack in Federated Learning via Adaptive Layer-wise Gradient Alignment ICCV 2025

Geminio: Language-Guided Gradient Inversion Attacks in Federated Learning ICCV 2025

With Privacy, Size Matters: On the Importance of Dataset Size in Differentially Private Text Rewriting IJCNLP 2025

Resource-Efficient Anonymization of Textual Data via Knowledge Distillation from Large Language Models COLING 2025

MEraser: An Effective Fingerprint Erasure Approach for Large Language Models ACL 2025

Vulnerability of Large Language Models to Output Prefix Jailbreaks: Impact of Positions on Safety NAACL 2025

Improved Unbiased Watermark for Large Language Models ACL 2025

Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models NAACL 2025

CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor EMNLP 2025

AI Knows Where You Are: Exposure, Bias, and Inference in Multimodal Geolocation with KoreaGEO EMNLP 2025