Constituent Parsing as Sequence Labeling

Carlos Gómez-Rodríguez; David Vilares

2018 EMNLP EMNLP 2018

Constituent Parsing as Sequence Labeling

Abstract

AbstractWe introduce a method to reduce constituent parsing to sequence labeling. For each word wt, it generates a label that encodes: (1) the number of ancestors in the tree that the words wt and wt+1 have in common, and (2) the nonterminal symbol at the lowest common ancestor. We first prove that the proposed encoding function is injective for any tree without unary branches. In practice, the approach is made extensible to all constituency trees by collapsing unary branches. We then use the PTB and CTB treebanks as testbeds and propose a set of fast baselines. We achieve 90% F-score on the PTB test set, outperforming the Vinyals et al. (2015) sequence-to-sequence parser. In addition, sacrificing some accuracy, our approach achieves the fastest constituent parsing speeds reported to date on PTB by a wide margin.

🌉 Interdisciplinary Bridge — Interdisciplinary and Machine Learning and Natural Language Processing

📈 Trend Setter — Syntax

🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Security & Privacy, Speech & Audio

Authors

Carlos Gómez-Rodríguez , David Vilares

Topics

Natural Language Processing > Understanding > Parsing Machine Learning > Learning Types > Classification Interdisciplinary > Linguistics > Syntax Machine Learning > Core Methods > Sequence Labeling

Keywords

sequence labeling syntactic parsing span prediction tree parsing constituent parsing

Download PDF

Related papers

Speeding Up Neural Machine Translation Decoding by Cube Pruning 2018

Limitations in learning an interpreted language with recurrent models 2018

Results of the sixth edition of the BioASQ Challenge 2018

Neural Segmental Hypergraphs for Overlapping Mention Recognition 2018

Hybrid Neural Attention for Agreement/Disagreement Inference in Online Debates 2018