Trainable Ranking Models to Evaluate the Semantic Accuracy of Data-to-Text Neural Generator

Nicolas Garneau; Luc Lamontagne

EMNLP 2021

Abstract

In this paper, we introduce a new embedding-based metric relying on trainable ranking models to evaluate the semantic accuracy of neural data-to-text generators. This metric is especially well suited to semantically and factually assess the performance of a text generator when tables can be associated with multiple references and table values contain textual utterances. We first present how one can implement and further specialize the metric by training the underlying ranking models on a legal Data-to-Text dataset. We show how it may provide a more robust evaluation than other evaluation schemes in challenging settings using a dataset comprising paraphrases between the table values and their respective references. Finally, we evaluate its generalization capabilities on a well-known dataset, WebNLG, by comparing it with human evaluation and a recently introduced metric based on natural language inference. We then illustrate how it naturally characterizes, both quantitatively and qualitatively, omissions and hallucinations.

Topics

Machine Learning > Core Methods > Metric Learning
Natural Language Processing > Generation > Text Generation
Natural Language Processing > Applications > Information Extraction
Natural Language Processing > Applications > Text Generation
Machine Learning > Learning Types > Evaluation
Deep Learning > Learning Types > Representation Learning
Deep Learning > Learning Types > Evaluation

Keywords

embedding learning; natural language inference; text generation; ranking model; hallucination detection; data-to-text generation; text generation evaluation; semantic accuracy; trainable ranking model; neural generator
