Why are Sequence-to-Sequence Models So Dull? Understanding the Low-Diversity Problem of Chatbots

Shaojie Jiang; Maarten de Rijke

EMNLP 2018

Abstract

Diversity is a long-studied topic in information retrieval that usually refers to the requirement that retrieved results should be non-repetitive and cover different aspects. In a conversational setting, an additional dimension of diversity matters: an engaging response generation system should be able to output responses that are diverse and interesting. Sequence-to-sequence (Seq2Seq) models have been shown to be very effective for response generation. However, dialogue responses generated by Seq2Seq models tend to have low diversity. In this paper, we review known sources and existing approaches to this low-diversity problem. We also identify a source of low diversity that has been little studied so far, namely model over-confidence. We sketch several directions for tackling model over-confidence and, hence, the low-diversity problem, including confidence penalties and label smoothing.
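The abstract names confidence penalties and label smoothing without giving formulas. The sketch below shows how these two regularizers are commonly formulated for a single output token, so the reader can see how each discourages an over-confident (very peaked) output distribution. It is an illustration under stated assumptions, not the authors' implementation: the function names and the `epsilon` and `beta` parameters are hypothetical, and the paper may define or weight the terms differently.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy of the predicted distribution (in nats).
    return -sum(p * math.log(p) for p in probs if p > 0)

def smoothed_targets(target_index, vocab_size, epsilon=0.1):
    # Assumed uniform label smoothing: (1 - epsilon) * one_hot + epsilon / V.
    dist = [epsilon / vocab_size] * vocab_size
    dist[target_index] += 1.0 - epsilon
    return dist

def label_smoothing_loss(logits, target_index, epsilon=0.1):
    # Cross-entropy against the smoothed target distribution.
    probs = softmax(logits)
    targets = smoothed_targets(target_index, len(logits), epsilon)
    return -sum(t * math.log(p) for p, t in zip(probs, targets) if t > 0)

def confidence_penalty_loss(logits, target_index, beta=0.1):
    # Negative log-likelihood minus beta times the output entropy,
    # so low-entropy (over-confident) predictions receive a smaller bonus.
    probs = softmax(logits)
    nll = -math.log(probs[target_index])
    return nll - beta * entropy(probs)
```

With `epsilon=0` or `beta=0`, both losses reduce to the ordinary negative log-likelihood. For a peaked prediction such as `[10.0, 0.0, 0.0]` with target index 0, label smoothing raises the loss because the model assigns almost no mass to the other tokens, which is the pressure against over-confidence that the abstract alludes to.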

Authors

Shaojie Jiang, Maarten de Rijke

Topics

Machine Learning > Optimization & Theory > Loss Functions
Deep Learning > Architectures > Transformers
Natural Language Processing > Generation > Dialogue Systems
Natural Language Processing > Generation > Text Generation
Natural Language Processing > Applications > Dialogue Systems
Artificial Intelligence > Core AI > Language
Artificial Intelligence > Core AI > Natural Language Generation
Artificial Intelligence > Core AI > Dialogue Systems

Keywords

response generation, label smoothing, sequence-to-sequence model, dialogue system, confidence penalty, dialogue diversity
