GRUEN for Evaluating Linguistic Quality of Generated Text

Wanzheng Zhu; Suma Bhat

EMNLP 2020

Abstract

Automatic evaluation metrics are indispensable for evaluating generated text. To date, these metrics have focused almost exclusively on the content selection aspect of the system output, ignoring the linguistic quality aspect altogether. We bridge this gap by proposing GRUEN for evaluating Grammaticality, non-Redundancy, focUs, structure and coherENce of generated text. GRUEN utilizes a BERT-based model and a class of syntactic, semantic, and contextual features to examine the system output. Unlike most existing evaluation metrics which require human references as an input, GRUEN is reference-less and requires only the system output. Besides, it has the advantage of being unsupervised, deterministic, and adaptable to various tasks. Experiments on seven datasets over four language generation tasks show that the proposed metric correlates highly with human judgments.
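
To make the idea of a reference-less, deterministic score concrete, here is a minimal sketch of a toy non-redundancy check that looks only at the system output. It is not GRUEN's implementation: the abstract does not specify how non-redundancy is computed, so the function names (`repeated_trigram_ratio`, `toy_non_redundancy_score`) and the trigram-repetition heuristic are assumptions chosen purely for illustration.

```python
import re


def repeated_trigram_ratio(text: str) -> float:
    """Fraction of word trigrams that repeat an earlier trigram in the text.

    Hypothetical helper for illustration only; not part of GRUEN.
    """
    # Lowercase and keep simple word tokens; punctuation is ignored.
    words = re.findall(r"[a-z0-9']+", text.lower())
    trigrams = [tuple(words[i:i + 3]) for i in range(len(words) - 2)]
    if not trigrams:
        return 0.0
    seen = set()
    repeats = 0
    for gram in trigrams:
        if gram in seen:
            repeats += 1
        seen.add(gram)
    return repeats / len(trigrams)


def toy_non_redundancy_score(text: str) -> float:
    """Score in [0, 1]; higher means less repeated material.

    Uses only the system output (no human reference) and contains no
    randomness, so the same input always yields the same score.
    """
    return 1.0 - repeated_trigram_ratio(text)


if __name__ == "__main__":
    print(toy_non_redundancy_score("The cat sat on the mat."))
    print(toy_non_redundancy_score(
        "The cat sat on the mat. The cat sat on the mat."))
```

A text that repeats a sentence verbatim scores lower than the same text without the repetition. A real linguistic-quality metric would need far richer signals, such as the BERT-based model and the syntactic, semantic, and contextual features the abstract mentions.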

Topics

Artificial Intelligence > Core AI > Interpretability
Machine Learning > Application Areas > Efficient Computing
Natural Language Processing > Applications > Text Classification
Natural Language Processing > Applications > Text Generation
Deep Learning > Models > Transformers

Keywords

text generation, syntactic parsing, semantic coherence, text generation evaluation, BERT-based model, reference-less evaluation, linguistic quality, linguistic quality evaluation
