Protecting Privacy in Classifiers by Token Manipulation

Re’em Harel; Yair Elboher; Yuval Pinter

ACL 2024

Abstract

Using language models as a remote service entails sending private information to an untrusted provider. In addition, potential eavesdroppers can intercept the messages, thereby exposing the information. In this work, we explore the prospects of avoiding such data exposure at the level of text manipulation. We focus on text classification models, examining various token mapping and contextualized manipulation functions in order to see whether classifier accuracy may be maintained while keeping the original text unrecoverable. We find that although some token mapping functions are easy and straightforward to implement, they heavily influence performance on the downstream task, and the original text can be reconstructed by a sophisticated attacker. In comparison, the contextualized manipulation provides an improvement in performance.
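To make the idea of a token mapping function concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the functions evaluated in the paper: it assumes the mapping is a secret one-to-one substitution over integer token IDs, and the names `build_token_mapping`, `manipulate`, and `invert_mapping` are hypothetical.

```python
import random


def build_token_mapping(vocab_size: int, seed: int) -> list[int]:
    """Build a secret random permutation of token IDs.

    Assumption: a one-to-one substitution, which is only one possible
    kind of token mapping and not necessarily one used in the paper.
    """
    rng = random.Random(seed)  # the seed plays the role of the client's secret
    mapping = list(range(vocab_size))
    rng.shuffle(mapping)
    return mapping


def manipulate(token_ids: list[int], mapping: list[int]) -> list[int]:
    """Replace every token ID with its mapped counterpart before sending."""
    return [mapping[t] for t in token_ids]


def invert_mapping(mapping: list[int]) -> list[int]:
    """Compute the inverse permutation, available only to the key holder."""
    inverse = [0] * len(mapping)
    for src, dst in enumerate(mapping):
        inverse[dst] = src
    return inverse
```

In this sketch, a client would call `manipulate` on its token IDs and send only the result to the remote classifier. Because the substitution is deterministic, repeated tokens stay repeated in the output, which hints at why an attacker with knowledge of token frequencies might recover the original text from a mapping this simple.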

Topics

Machine Learning > Application Areas > Privacy; Natural Language Processing > Applications > Text Classification; Artificial Intelligence > Core AI > Privacy; Artificial Intelligence > Core AI > Large Language Models; Machine Learning > Learning Types > Privacy; Deep Learning > Learning Types > Privacy

Keywords

text classification; privacy preservation; language model; privacy protection; classifier accuracy; adversarial attacker; token manipulation; contextualized manipulation
