2018
ACL
Parser Training with Heterogeneous Treebanks
Abstract
How to make the most of multiple heterogeneous treebanks when training a monolingual dependency parser is an open question. We start by investigating previously suggested, but little evaluated, strategies for exploiting multiple treebanks based on concatenating training sets, with or without fine-tuning. We go on to propose a new method based on treebank embeddings. We perform experiments for several languages and show that in many cases fine-tuning and treebank embeddings lead to substantial improvements over single treebanks or concatenation, with average gains of 2.0–3.5 LAS points. We argue that treebank embeddings should be preferred due to their conceptual simplicity, flexibility and extensibility.
Topics
Machine Learning > Core Methods > Representation Learning
Machine Learning > Core Methods > Embedding Learning
Machine Learning > Optimization & Theory > Optimization
Natural Language Processing > Understanding > Parsing
Machine Learning > Learning Types > Transfer Learning
Machine Learning > Learning Types > Multi-Lingual Learning