ACL 2018
Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data
Abstract
Manually labeled corpora are expensive to create and are often unavailable for low-resource languages or domains. Automatic labeling approaches offer a quicker and cheaper way to obtain labeled data. However, automatically generated labels often contain more errors, which can degrade a classifier's performance when it is trained on this data. We propose a noise layer that is added to a neural network architecture. This layer allows us to model the noise and to train on a combination of clean and noisy data. We show that in a low-resource NER task we can improve performance by up to 35% by using additional, noisy data and handling the noise.
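The abstract does not specify the form of the noise layer. As a minimal sketch, assuming it is a learned label-transition (confusion) matrix applied on top of the base model's softmax output, training on clean and noisy data together might look like the following. The names `noise_layer` and `noise_logits`, the toy data, and the use of NumPy in place of a real training framework are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def noise_layer(clean_probs, noise_logits):
    # Assumption: row i of the transition matrix is
    # P(noisy label = j | clean label = i), obtained by a row-wise softmax.
    transition = softmax(noise_logits)
    # Predicted distribution over noisy labels for each example.
    return clean_probs @ transition

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the given integer labels.
    picked = probs[np.arange(len(labels)), labels]
    return -np.mean(np.log(picked + 1e-12))

num_classes = 3
rng = np.random.default_rng(0)

# Stand-in for the base network's output on a batch of 4 tokens.
logits = rng.normal(size=(4, num_classes))
clean_probs = softmax(logits)

# Start close to the identity so the layer initially changes little.
noise_logits = 5.0 * np.eye(num_classes)
noisy_probs = noise_layer(clean_probs, noise_logits)

# Hypothetical batch: the first two tokens have clean labels,
# the last two only have automatically produced (noisy) labels.
clean_labels = np.array([0, 1])
noisy_labels = np.array([1, 0])

# Clean examples bypass the noise layer; noisy examples pass through it.
loss_clean = cross_entropy(clean_probs[:2], clean_labels)
loss_noisy = cross_entropy(noisy_probs[2:], noisy_labels)
total_loss = loss_clean + loss_noisy
```

In this sketch the base model is shared between both data sources, and only the noisy examples are scored through the transition matrix, so the matrix can absorb systematic labeling errors instead of the base model.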
Topics
Artificial Intelligence > Learning Paradigms > Transfer Learning
Machine Learning > Learning Types > Weakly Supervised Learning
Machine Learning > Application Areas > Domain Adaptation
Deep Learning > Architectures > Neural Networks
Natural Language Processing > Understanding > Named Entity Recognition
Natural Language Processing > Applications > Named Entity Recognition
Deep Learning > Learning Types > Deep Learning
Deep Learning > Learning Types > Weakly Supervised Learning
Machine Learning > Learning Types > Robustness