Depth-bounding is effective: Improvements and evaluation of unsupervised PCFG induction

Lifeng Jin; Finale Doshi-Velez; Timothy Miller; William Schuler; Lane Schwartz

EMNLP 2018

Abstract

There have been several recent attempts to improve the accuracy of grammar induction systems by bounding the recursive complexity of the induction model. Modern depth-bounded grammar inducers have been shown to be more accurate than early unbounded PCFG inducers, but this technique has never been compared against unbounded induction within the same system, in part because most previous depth-bounding models are built around sequence models, the complexity of which grows exponentially with the maximum allowed depth. The present work instead applies depth bounds within a chart-based Bayesian PCFG inducer, where bounding can be switched on and off, and then samples trees with or without bounding. Results show that depth-bounding is indeed significantly effective in limiting the search space of the inducer and thereby increasing the accuracy of the resulting parsing model, independent of the contribution of modern Bayesian induction techniques. Moreover, parsing results on English, Chinese and German show that this bounded model is able to produce parse trees more accurately than or competitively with state-of-the-art constituency grammar induction models.
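To make the idea of a switchable depth bound concrete, the following is a minimal sketch and not the paper's inducer. It assumes a simplified notion of center-embedding depth that the abstract does not spell out: depth starts at 1 and grows by one whenever a non-leaf left child sits under a right child. The names `embedding_depth` and `within_bound` and the tuple encoding of binary trees are hypothetical, chosen only for illustration.

```python
from typing import Optional, Tuple, Union

# A binary tree is either a word (str) or a pair (left, right).
Tree = Union[str, Tuple["Tree", "Tree"]]


def embedding_depth(tree: Tree, is_right_child: bool = False, depth: int = 1) -> int:
    """Return the maximum depth reached anywhere in the tree.

    ASSUMPTION (illustrative definition, not taken from the source):
    depth increases by one when a non-leaf left child hangs under a
    right child, which is where center embedding arises.
    """
    if isinstance(tree, str):
        return depth
    left, right = tree
    opens_new_level = is_right_child and not isinstance(left, str)
    left_depth = embedding_depth(left, False, depth + 1 if opens_new_level else depth)
    right_depth = embedding_depth(right, True, depth)
    return max(left_depth, right_depth)


def within_bound(tree: Tree, max_depth: Optional[int] = None) -> bool:
    """Accept every tree when max_depth is None (bounding switched off)."""
    return max_depth is None or embedding_depth(tree) <= max_depth


right_branching: Tree = ("a", ("b", ("c", "d")))
center_embedded: Tree = ("a", (("b", "c"), "d"))

print(embedding_depth(right_branching))   # purely right-branching tree
print(embedding_depth(center_embedded))   # one level of center embedding
print(within_bound(center_embedded, 1))   # bound on
print(within_bound(center_embedded))      # bound off
```

Under this assumed definition, a bound of 1 rejects the center-embedded tree while leaving purely left- or right-branching trees untouched, which is the sense in which a depth bound shrinks the space of candidate trees.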

Topics

Machine Learning > Learning Types > Unsupervised Learning
Natural Language Processing > Understanding > Parsing
Interdisciplinary > Linguistics > Computational Linguistics
Machine Learning > Optimization & Theory > Stochastic Methods
Artificial Intelligence > Bayesian & Probabilistic > Bayesian Inference
Machine Learning > Bayesian & Probabilistic > Bayesian Inference
Natural Language Processing > Applications > Natural Language Understanding

Keywords

unsupervised learning, Bayesian learning, grammar induction, constituency parsing, probabilistic context-free grammar, PCFG induction, Bayesian induction, Bayesian PCFG
