Using Semantic Similarity as Reward for Reinforcement Learning in Sentence Generation

Go Yasui; Yoshimasa Tsuruoka; Masaaki Nagata

ACL 2019

Abstract

Traditional model training for sentence generation uses cross-entropy as the loss function. While cross-entropy loss has convenient properties for supervised learning, it cannot evaluate a sentence as a whole and lacks flexibility. We present an approach that trains the generation model on the estimated semantic similarity between the output and reference sentences, to alleviate these problems of cross-entropy training. For semantic similarity estimation we use a BERT-based scorer fine-tuned on the Semantic Textual Similarity (STS) task, and we train the model on the estimated scores through reinforcement learning (RL). Our experiments show that reinforcement learning with a semantic similarity reward improves BLEU scores over the baseline LSTM NMT model.
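
The training signal described in the abstract can be illustrated with a minimal sketch. It assumes a REINFORCE-style policy-gradient loss, which is one common way to train on a sentence-level reward; the abstract itself does not name the RL algorithm. The function `toy_similarity` is a hypothetical placeholder (token-set overlap) standing in for the BERT-based STS scorer, and `reinforce_loss` and `baseline` are likewise illustrative names that do not come from the paper.

```python
import math
from typing import Sequence


def toy_similarity(hypothesis: Sequence[str], reference: Sequence[str]) -> float:
    """Hypothetical placeholder reward in [0, 1]: Jaccard overlap of token sets.

    Assumption: this only stands in for the BERT-based STS scorer,
    which is not reproduced here.
    """
    hyp, ref = set(hypothesis), set(reference)
    if not hyp and not ref:
        return 1.0
    return len(hyp & ref) / len(hyp | ref)


def reinforce_loss(token_log_probs: Sequence[float],
                   reward: float,
                   baseline: float = 0.0) -> float:
    """REINFORCE-style loss for one sampled sentence.

    loss = -(reward - baseline) * sum_t log p(y_t | y_<t, x)

    The reward scores the whole sentence, unlike token-level cross-entropy.
    The baseline is an optional variance-reduction term (an assumption here).
    """
    return -(reward - baseline) * sum(token_log_probs)


# Usage: one sampled output, its per-token log-probabilities, and a reference.
sampled = ["a", "cat", "sat"]
reference = ["the", "cat", "sat"]
log_probs = [math.log(0.5)] * len(sampled)  # made-up model probabilities

reward = toy_similarity(sampled, reference)
loss = reinforce_loss(log_probs, reward)
print(f"reward={reward:.2f} loss={loss:.4f}")
```

In a real setup the log-probabilities would come from the generation model and the loss would be differentiated with respect to its parameters; plain floats are used here only to keep the sketch self-contained.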

Topics

Machine Learning > Learning Types > Self-Supervised Learning
Natural Language Processing > Generation > Text Generation
Reinforcement Learning > Methods > Deep RL
Deep Learning > Learning Types > Reinforcement Learning
Deep Learning > Architectures > Recurrent Neural Networks

Keywords

reinforcement learning, neural machine translation, semantic similarity, reward shaping, sentence generation, BERT-based scoring

Related papers

What do phone embeddings learn about Phonology? 2019

Unsupervised Morphological Segmentation for Low-Resource Polysynthetic Languages 2019

Understanding Undesirable Word Embedding Associations 2019

Inferential Machine Comprehension: Answering Questions by Recursively Deducing the Evidence Chain from Text 2019

Domain Adaptation of Neural Machine Translation by Lexicon Induction 2019