Self-Ensemble of N-best Generation Hypotheses by Lexically Constrained Decoding

Ryota Miyano; Tomoyuki Kajiwara; Yuki Arase

2023 EMNLP EMNLP 2023

Self-Ensemble of N-best Generation Hypotheses by Lexically Constrained Decoding

Abstract

AbstractWe propose a method that ensembles N-best hypotheses to improve natural language generation. Previous studies have achieved notable improvements in generation quality by explicitly reranking N-best candidates. These studies assume that there exists a hypothesis of higher quality. We expand the assumption to be more practical as there exist partly higher quality hypotheses in the N-best yet they may be imperfect as the entire sentences. By merging these high-quality fragments, we can obtain a higher-quality output than the single-best sentence. Specifically, we first obtain N-best hypotheses and conduct token-level quality estimation. We then apply tokens that should or should not be present in the final output as lexical constraints in decoding. Empirical experiments on paraphrase generation, summarisation, and constrained text generation confirm that our method outperforms the strong N-best reranking methods.

🌉 Interdisciplinary Bridge — Deep Learning and Machine Learning and Natural Language Processing

🧭 Keyword Pioneer — hypothesis ensemble

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio

Authors

Ryota Miyano , Tomoyuki Kajiwara , Yuki Arase

Topics

Machine Learning > Learning Types > Self-Supervised Learning Natural Language Processing > Generation > Text Generation Machine Learning > Core Methods > Ranking Deep Learning > Learning Types > Ensemble Learning

Keywords

natural language generation text generation quality estimation ensemble method lexically constrained decoding lexical constraint n-best reranking hypothesis ensemble sentence quality token-level quality estimation

Download PDF

Related papers

Exploring Linguistic Probes for Morphological Generalization 2023

NameGuess: Column Name Expansion for Tabular Data 2023

Vision-Enhanced Semantic Entity Recognition in Document Images via Visually-Asymmetric Consistency Learning 2023

Improving Conversational Recommendation Systems via Bias Analysis and Language-Model-Enhanced Data Augmentation 2023

On the Calibration of Large Language Models and Alignment 2023