Split and Rephrase: Better Evaluation and Stronger Baselines

Roee Aharoni; Yoav Goldberg

ACL 2018

Abstract

Splitting and rephrasing a complex sentence into several shorter sentences that convey the same meaning is a challenging problem in NLP. We show that while vanilla seq2seq models can reach high scores on the proposed benchmark (Narayan et al., 2017), they suffer from memorization of the training set, which contains more than 89% of the unique simple sentences from the validation and test sets. To address this, we present a new train-development-test data split and neural models augmented with a copy mechanism, outperforming the best reported baseline by 8.68 BLEU and fostering further progress on the task.
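The leakage described in the abstract can be illustrated with a small overlap check. The following is a minimal sketch, not the authors' code: the function names `unique_sentences` and `overlap_fraction`, the whitespace normalization, and the toy sentences are all assumptions made for illustration, and the real benchmark files would need their own loader.

```python
from typing import Iterable, Set


def unique_sentences(sentences: Iterable[str]) -> Set[str]:
    """Collapse runs of whitespace and return the distinct non-empty sentences."""
    return {" ".join(s.split()) for s in sentences if s.strip()}


def overlap_fraction(train: Iterable[str], heldout: Iterable[str]) -> float:
    """Fraction of unique held-out sentences that also occur in the training set."""
    train_set = unique_sentences(train)
    heldout_set = unique_sentences(heldout)
    if not heldout_set:
        return 0.0
    return len(heldout_set & train_set) / len(heldout_set)


# Toy data invented for illustration; not taken from the benchmark.
train_simple = ["A is in B .", "C was born in D .", "E plays for F ."]
heldout_simple = ["A is in B .", "C was born in D .", "G wrote H .", "A is in B ."]

fraction = overlap_fraction(train_simple, heldout_simple)
print(f"{fraction:.2%} of unique held-out simple sentences also appear in train")
```

A high value from a check like this suggests that a model can score well by recalling training sentences rather than by learning to split and rephrase, which is the motivation the abstract gives for a new data split.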

Topics

Artificial Intelligence > Core AI > Procedural Generation; Machine Learning > Core Methods > Representation Learning; Artificial Intelligence > Core AI > Natural Language Processing; Deep Learning > Learning Types > Sequence Modeling

Keywords

text simplification; sequence-to-sequence model; sentence splitting; neural model; copy mechanism; sentence simplification
