CaLcs: Continuously Approximating Longest Common Subsequence for Sequence Level Optimization

Semih Yavuz; Chung-Cheng Chiu; Patrick Nguyen; Yonghui Wu

EMNLP 2018

Abstract

Maximum-likelihood estimation (MLE) is one of the most widely used approaches for training structured prediction models for text-generation based natural language processing applications. However, besides exposure bias, models trained with MLE suffer from the wrong-objective problem: they are trained to maximize word-level correct next-step prediction, but are evaluated with respect to sequence-level discrete metrics such as ROUGE and BLEU. Several variants of policy-gradient methods address some of these problems by optimizing for the final discrete evaluation metrics, and show improvements over MLE training on downstream tasks such as text summarization and machine translation. However, policy-gradient methods suffer from high sample variance, making the training process very difficult and unstable. In this paper, we present an alternative direction towards mitigating this problem by introducing a new objective (CaLcs) based on a differentiable surrogate of the longest common subsequence (LCS) measure that captures sequence-level structure similarity. Experimental results on abstractive summarization and machine translation validate the effectiveness of the proposed approach.
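To make the idea of a continuous LCS surrogate concrete, the sketch below places the classic LCS dynamic program next to one possible relaxation of it. The abstract does not give the exact CaLcs formulation, so this is an illustrative assumption and not the authors' definition. The function names (`lcs_length`, `soft_lcs`, `one_hot_match`) and the `match_prob` input are hypothetical. The relaxation replaces the hard token-equality test with a match probability, and it reduces to the exact LCS length when those probabilities are 0 or 1.

```python
from typing import List, Sequence


def lcs_length(a: Sequence, b: Sequence) -> int:
    """Classic dynamic program for the length of the longest common subsequence."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


def soft_lcs(match_prob: List[List[float]]) -> float:
    """Relaxed LCS (illustrative assumption, not the paper's exact objective).

    match_prob[i][j] is the probability that predicted position i emits the
    token at reference position j. The hard equality test of the classic
    recurrence is replaced by an expectation over "match" and "no match".
    """
    n = len(match_prob)
    m = len(match_prob[0]) if n else 0
    table = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            p = match_prob[i - 1][j - 1]
            matched = table[i - 1][j - 1] + 1.0
            skipped = max(table[i - 1][j], table[i][j - 1])
            table[i][j] = p * matched + (1.0 - p) * skipped
    return table[n][m]


def one_hot_match(pred: Sequence, ref: Sequence) -> List[List[float]]:
    """Degenerate match probabilities (0 or 1) built from two discrete sequences."""
    return [[1.0 if x == y else 0.0 for y in ref] for x in pred]
```

In an automatic-differentiation framework the same recurrence would be written over tensors, with `match_prob` read from the model's output distribution at each decoding step, so that the score can be back-propagated to the model parameters. Note that `max` is only piecewise differentiable; a smoother choice is possible and is left open here.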

Authors

Semih Yavuz, Chung-Cheng Chiu, Patrick Nguyen, Yonghui Wu

Topics

Machine Learning > Optimization & Theory > Optimization
Natural Language Processing > Generation > Summarization
Natural Language Processing > Generation > Text Generation
Natural Language Processing > Applications > Machine Translation
Natural Language Processing > Applications > Summarization
Deep Learning > Learning Types > Reinforcement Learning

Keywords

policy gradient, machine translation, text generation, text summarization, abstractive summarization, sequence-level optimization, longest common subsequence

Related papers

Speeding Up Neural Machine Translation Decoding by Cube Pruning 2018

Limitations in learning an interpreted language with recurrent models 2018

Results of the sixth edition of the BioASQ Challenge 2018

Neural Segmental Hypergraphs for Overlapping Mention Recognition 2018

Hybrid Neural Attention for Agreement/Disagreement Inference in Online Debates 2018