Dependency Grammar Induction with Neural Lexicalization and Big Training Data

Wenjuan Han; Yong Jiang; Kewei Tu

EMNLP 2017

Abstract

We study the impact of big models (in terms of the degree of lexicalization) and big data (in terms of the training corpus size) on dependency grammar induction. We experiment with L-DMV, a lexicalized version of the Dependency Model with Valence (Klein and Manning, 2004), and L-NDMV, our lexicalized extension of the Neural Dependency Model with Valence (Jiang et al., 2016). We find that L-DMV benefits only from very small degrees of lexicalization and moderate training corpus sizes. L-NDMV, by contrast, can benefit from big training data and greater degrees of lexicalization, especially when enhanced with good model initialization, and it achieves a result that is competitive with the current state of the art.
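To make "degree of lexicalization" concrete, the following is a minimal illustrative sketch, not the authors' code. It assumes the degree is controlled by a frequency cutoff: words seen at least `cutoff` times keep their lexical identity as a word/POS token, and rarer words back off to the POS tag alone. The function name `lexicalize`, the `word/POS` token format, and the toy corpus are all hypothetical; the abstract does not specify the mechanism.

```python
from collections import Counter


def lexicalize(corpus, cutoff):
    """Return the corpus as token sequences for a grammar induction model.

    corpus: list of sentences, each a list of (word, pos) pairs.
    cutoff: minimum corpus frequency for a word to stay lexicalized
            (assumed mechanism; a lower cutoff means a higher degree
            of lexicalization).
    """
    counts = Counter(word.lower() for sent in corpus for word, _ in sent)
    return [
        [
            # Frequent words become "word/POS"; rare words fall back to POS.
            f"{word.lower()}/{pos}" if counts[word.lower()] >= cutoff else pos
            for word, pos in sent
        ]
        for sent in corpus
    ]


# Toy corpus (hypothetical data, for illustration only).
corpus = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

unlexicalized = lexicalize(corpus, cutoff=3)  # no word reaches the cutoff
partial = lexicalize(corpus, cutoff=2)        # only "the" is lexicalized
full = lexicalize(corpus, cutoff=1)           # every word is lexicalized
```

Under this assumption, a very large cutoff recovers an unlexicalized model over POS tags, while lowering the cutoff enlarges the token vocabulary and hence the number of grammar parameters to estimate.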

Topics

Machine Learning > Core Methods > Representation Learning
Machine Learning > Learning Types > Unsupervised Learning
Machine Learning > Bayesian & Probabilistic > Probabilistic Modeling
Natural Language Processing > Applications > Natural Language Understanding

Keywords

unsupervised learning, representation learning, grammar induction, probabilistic model, training data scaling, neural model, dependency grammar, neural network, lexicalized model
