Multi Document Summarization Evaluation in the Presence of Damaging Content

Avshalom Manevich; David Carmel; Nachshon Cohen; Elad Kravi; Ori Shapira

2023 EMNLP EMNLP 2023

Multi Document Summarization Evaluation in the Presence of Damaging Content

Abstract

AbstractIn the Multi-document summarization (MDS) task, a summary is produced for a given set of documents. A recent line of research introduced the concept of damaging documents, denoting documents that should not be exposed to readers due to various reasons. In the presence of damaging documents, a summarizer is ideally expected to exclude damaging content in its output. Existing metrics evaluate a summary based on aspects such as relevance and consistency with the source documents. We propose to additionally measure the ability of MDS systems to properly handle damaging documents in their input set. To that end, we offer two novel metrics based on lexical similarity and language model likelihood. A set of experiments demonstrates the effectiveness of our metrics in measuring the ability of MDS systems to summarize a set of documents while eliminating damaging content from their summaries.

🧭 Keyword Pioneer — damaging content

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Avshalom Manevich , David Carmel , Nachshon Cohen , Elad Kravi , Ori Shapira

Topics

Natural Language Processing > Generation > Summarization

Keywords

language model evaluation metric multi-document summarization lexical similarity damaging content

Download PDF

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023