Split or Merge: Which is Better for Unsupervised RST Parsing?

Naoki Kobayashi; Tsutomu Hirao; Kengo Nakamura; Hidetaka Kamigaito; Manabu Okumura; Masaaki Nagata

EMNLP 2019

Abstract

Rhetorical Structure Theory (RST) parsing is crucial for many downstream NLP tasks that require the discourse structure of a text. Most previous RST parsers are based on supervised learning. That is, they require an annotated corpus of sufficient size and quality, and they rely heavily on language- and domain-dependent corpora. In this paper, we present two language-independent unsupervised RST parsing methods based on dynamic programming. The first builds the optimal tree in terms of a dissimilarity score function that is defined for splitting a text span into smaller ones. The second builds the optimal tree in terms of a similarity score function that is defined for merging two adjacent spans into a larger one. Experimental results on English and German RST treebanks showed that our parser based on span merging achieved the best score, an F1 of around 0.8, which is close to the scores of previous supervised parsers.
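The merge-based idea in the abstract can be illustrated with a small CKY-style dynamic program. This is only a sketch under stated assumptions, not the authors' implementation: the names `build_tree` and `make_toy_similarity` are hypothetical, and the toy similarity (negative gap between the mean values of two adjacent spans) stands in for whatever similarity score function the paper actually defines. Each discourse unit is represented by a single made-up number.

```python
from typing import Callable, Dict, List, Tuple, Union

# A tree is either a leaf index or a pair of subtrees.
Tree = Union[int, Tuple["Tree", "Tree"]]
Span = Tuple[int, int]


def build_tree(n: int, score: Callable[[int, int, int], float]) -> Tuple[Tree, float]:
    """Return the binary tree over units 0..n-1 with the highest total score.

    score(i, k, j) is the (assumed) score for combining the adjacent spans
    [i, k] and [k+1, j]; the total is the sum over all internal nodes.
    """
    if n < 1:
        raise ValueError("need at least one unit")
    best: Dict[Span, float] = {(i, i): 0.0 for i in range(n)}
    back: Dict[Span, int] = {}
    # Fill the table from short spans to long spans.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            candidates = [
                (best[(i, k)] + best[(k + 1, j)] + score(i, k, j), k)
                for k in range(i, j)
            ]
            best[(i, j)], back[(i, j)] = max(candidates)

    def rebuild(i: int, j: int) -> Tree:
        # Follow the stored boundaries back down to the leaves.
        if i == j:
            return i
        k = back[(i, j)]
        return (rebuild(i, k), rebuild(k + 1, j))

    return rebuild(0, n - 1), best[(0, n - 1)]


def make_toy_similarity(values: List[float]) -> Callable[[int, int, int], float]:
    """Hypothetical similarity: spans with closer mean values score higher."""
    def score(i: int, k: int, j: int) -> float:
        left = sum(values[i:k + 1]) / (k - i + 1)
        right = sum(values[k + 1:j + 1]) / (j - k)
        return -abs(left - right)
    return score


tree, total = build_tree(4, make_toy_similarity([0.0, 0.1, 1.0, 1.1]))
print(tree)  # ((0, 1), (2, 3))
```

In this toy run, units 0 and 1 are close to each other, as are units 2 and 3, so the highest-scoring tree groups each pair first and joins the two groups last. A split-based variant could reuse the same table with a dissimilarity function passed as `score`; whether that matches the paper's formulation is an assumption.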

Topics

Artificial Intelligence > Core AI > Interpretability
Machine Learning > Core Methods > Clustering
Machine Learning > Core Methods > Metric Learning
Machine Learning > Learning Types > Unsupervised Learning
Machine Learning > Learning Paradigms > Unsupervised Learning
Natural Language Processing > Applications > Text Processing

Keywords

unsupervised learning, unsupervised parsing, dynamic programming, discourse parsing, rhetorical structure theory, document structure, span merging, RST parsing, tree construction
