Leveraging Pre-Trained Embeddings for Welsh Taggers

Ignatius Ezeani; Scott Piao; Steven Neale; Paul Rayson; Dawn Knight

ACL 2019

Abstract

While the application of word embedding models to downstream Natural Language Processing (NLP) tasks has been shown to be successful, the benefits for low-resource languages are somewhat limited due to the lack of adequate data for training the models. NLP research on low-resource languages has therefore focused on finding ways to harness pre-trained models to improve the performance of NLP systems built for these languages, without the need to re-invent the wheel. Welsh is one such language. In this paper, we present the results of our experiments on learning a simple multi-task neural network model for part-of-speech and semantic tagging of Welsh using a pre-trained embedding model from FastText. Our model's performance was compared with that of the existing rule-based stand-alone part-of-speech and semantic taggers. Despite its simplicity, and although it performs both tasks simultaneously, our tagger compared very well with the existing taggers.
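To make the idea of a multi-task tagger concrete, the following is a minimal sketch of a forward pass in which one shared hidden layer feeds two output heads, one for part-of-speech tags and one for semantic tags. It is an illustration only, not the authors' architecture: the class name `MultiTaskTagger`, the layer sizes, the tag labels, and the toy three-dimensional vectors are all assumptions, and a plain dictionary stands in for the pre-trained FastText vectors. Unknown tokens fall back to a zero vector here, which is a simplification. The weights are untrained, so the predicted tags are arbitrary.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskTagger:
    """Hypothetical tagger: frozen embeddings, shared layer, two heads."""

    def __init__(self, embeddings, hidden_size, pos_tags, sem_tags, seed=0):
        rng = random.Random(seed)
        self.embeddings = embeddings  # stand-in for pre-trained vectors
        self.dim = len(next(iter(embeddings.values())))
        self.pos_tags = pos_tags
        self.sem_tags = sem_tags
        self.shared = init_layer(rng, hidden_size, self.dim)
        self.pos_head = init_layer(rng, len(pos_tags), hidden_size)
        self.sem_head = init_layer(rng, len(sem_tags), hidden_size)

    def embed(self, token):
        # Assumption: unknown tokens map to a zero vector.
        return self.embeddings.get(token, [0.0] * self.dim)

    def forward(self, token):
        # The shared representation is computed once and used by both heads.
        hidden = [math.tanh(h)
                  for h in linear(*self.shared, self.embed(token))]
        pos_probs = softmax(linear(*self.pos_head, hidden))
        sem_probs = softmax(linear(*self.sem_head, hidden))
        return pos_probs, sem_probs

    def tag(self, tokens):
        """Return (token, pos_tag, sem_tag) for each token."""
        tagged = []
        for token in tokens:
            pos_probs, sem_probs = self.forward(token)
            pos = self.pos_tags[pos_probs.index(max(pos_probs))]
            sem = self.sem_tags[sem_probs.index(max(sem_probs))]
            tagged.append((token, pos, sem))
        return tagged


if __name__ == "__main__":
    # Toy vectors and placeholder tag labels, invented for illustration.
    toy_vectors = {"mae": [0.1, -0.2, 0.3], "cath": [0.4, 0.0, -0.1]}
    model = MultiTaskTagger(toy_vectors, hidden_size=4,
                            pos_tags=["POS_A", "POS_B"],
                            sem_tags=["SEM_A", "SEM_B", "SEM_C"])
    print(model.tag(["mae", "cath"]))
```

In a trained model of this kind, the losses of the two heads would typically be summed so that both tasks update the shared layer; that training step is omitted from the sketch.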

Topics

Machine Learning > Learning Types > Semi-Supervised Learning
Natural Language Processing > Understanding > Part-of-Speech Tagging
Natural Language Processing > Resources & Methods > Text Representation

Keywords

part-of-speech tagging; low-resource language; pre-trained embedding; semantic tagging; multi-task neural network
