Learning from Measurements in Crowdsourcing Models: Inferring Ground Truth from Diverse Annotation Types

Paul Felt; Eric Ringger; Jordan Boyd-Graber; Kevin Seppi

COLING 2018

Abstract

Annotated corpora enable supervised machine learning and data analysis. To reduce the cost of manual annotation, tasks are often assigned to internet workers whose judgments are reconciled by crowdsourcing models. We approach the problem of crowdsourcing using a framework for learning from rich prior knowledge, and we identify a family of crowdsourcing models with the novel ability to combine annotations with differing structures: e.g., document labels and word labels. Annotator judgments are given in the form of the predicted expected value of measurement functions computed over annotations and the data, unifying annotation models. Our model, a specific instance of this framework, compares favorably with previous work. Furthermore, it enables active sample selection, jointly selecting annotator, data item, and annotation structure to reduce annotation effort.

Topics

Artificial Intelligence > Bayesian & Probabilistic > Probabilistic Modeling
Natural Language Processing > Resources & Methods > Text Representation

Keywords

active learning, probabilistic modeling, annotation aggregation
