A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios

Samuel Ackerman; Ella Rabinovich; Eitan Farchi; Ateret Anaby Tavor

2024 EMNLP EMNLP 2024

A Novel Metric for Measuring the Robustness of Large Language Models in Non-adversarial Scenarios

Abstract

AbstractWe evaluate the robustness of several large language models on multiple datasets. Robustness here refers to the relative insensitivity of the model’s answers to meaning-preserving variants of their input. Benchmark datasets are constructed by introducing naturally-occurring, non-malicious perturbations, or by generating semantically equivalent paraphrases of input questions or statements. We further propose a novel metric for assessing a model robustness, and demonstrate its benefits in the non-adversarial scenario by empirical evaluation of several models on the created datasets.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning

🧭 Keyword Pioneer — non-adversarial robustness

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Samuel Ackerman , Ella Rabinovich , Eitan Farchi , Ateret Anaby Tavor

Topics

Artificial Intelligence > Core AI > Interpretability Artificial Intelligence > Core AI > Responsible AI Machine Learning > Optimization & Theory > Theory Artificial Intelligence > Core AI > Large Language Models Deep Learning > Models > Large Language Models Machine Learning > Learning Types > Robustness

Keywords

paraphrase generation robustness evaluation semantic equivalence meaning preservation large language model non-adversarial robustness

Download PDF

Related papers

EmbodiedBERT: Cognitively Informed Metaphor Detection Incorporating Sensorimotor Information 2024

Mitigating Matthew Effect: Multi-Hypergraph Boosted Multi-Interest Self-Supervised Learning for Conversational Recommendation 2024

Learning to Extract Structured Entities Using Language Models 2024

Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis 2024

CSSL: Contrastive Self-Supervised Learning for Dependency Parsing on Relatively Free Word Ordered and Morphologically Rich Low Resource Languages 2024