Scoring Workers in Crowdsourcing: How Many Control Questions are Enough?

Qiang Liu; Alex Ihler; Mark Steyvers

NIPS (NeurIPS) 2013

Abstract

We study the problem of estimating continuous quantities, such as prices, probabilities, and point spreads, using a crowdsourcing approach. A challenging aspect of combining the crowd's answers is that workers' reliabilities and biases are usually unknown and highly diverse. Control items with known answers can be used to evaluate workers' performance, and hence improve the combined results on the target items with unknown answers. This raises the problem of how many control items to use when the total number of items each worker can answer is limited: more control items evaluate the workers better, but leave fewer resources for the target items that are of direct interest, and vice versa. We give theoretical results for this problem under different scenarios, and provide a simple rule of thumb for crowdsourcing practitioners. As a byproduct, we also provide a theoretical analysis of the accuracy of different consensus methods.
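
The trade-off described in the abstract can be illustrated with a small simulation. This is a toy sketch under stated assumptions, not the paper's analysis or method: it assumes each answer equals the true value plus a constant per-worker bias plus Gaussian noise, estimates each worker's bias as the mean error on control items, and averages the bias-corrected answers on target items. The function name `simulate_mse` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_mse(n_control, items_per_worker=20, n_workers=50, n_targets=100,
                 bias_sd=2.0, noise_sd=1.0, n_trials=100, seed=0):
    """Average squared error of bias-corrected mean estimates on target items.

    Toy model (an assumption, not taken from the paper):
        answer = truth + worker_bias + Gaussian noise.
    Each worker answers `items_per_worker` items in total, of which
    `n_control` are control items with known answers.
    """
    if not 0 <= n_control < items_per_worker:
        raise ValueError("n_control must leave at least one target item per worker")
    rng = np.random.default_rng(seed)
    n_target_slots = items_per_worker - n_control  # target answers per worker
    trial_errors = []
    for _ in range(n_trials):
        truth = rng.normal(0.0, 5.0, size=n_targets)
        bias = rng.normal(0.0, bias_sd, size=n_workers)
        if n_control > 0:
            # On control items the truth is known, so the observed error is bias + noise.
            control_err = bias[:, None] + rng.normal(0.0, noise_sd, (n_workers, n_control))
            bias_hat = control_err.mean(axis=1)
        else:
            bias_hat = np.zeros(n_workers)  # no control items: no correction possible
        sums = np.zeros(n_targets)
        counts = np.zeros(n_targets)
        for i in range(n_workers):
            # Each worker answers a random subset of distinct target items.
            items = rng.choice(n_targets, size=n_target_slots, replace=False)
            answers = truth[items] + bias[i] + rng.normal(0.0, noise_sd, n_target_slots)
            sums[items] += answers - bias_hat[i]
            counts[items] += 1
        answered = counts > 0  # score only targets that received an answer
        estimate = sums[answered] / counts[answered]
        trial_errors.append(np.mean((estimate - truth[answered]) ** 2))
    return float(np.mean(trial_errors))


if __name__ == "__main__":
    for k in (0, 2, 5, 10, 19):
        print(f"control items per worker = {k:2d}  MSE = {simulate_mse(k):.3f}")
```

In this toy setup, one would expect the error to be larger at both extremes (no control items, or almost all control items) than at an intermediate value, though the exact numbers and the location of the minimum depend on the assumed bias and noise levels.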

Topics

Machine Learning > Core Methods > Clustering
Machine Learning > Learning Types > Active Learning
Machine Learning > Optimization & Theory > Statistical Learning
Machine Learning > Optimization & Theory > Theory
Data Science & Analytics > Applications > Clustering
Data Science & Analytics > Applications > Recommender Systems
Machine Learning > Learning Types > Supervised Learning
Machine Learning > Optimization & Theory > Statistics
Machine Learning > Learning Types > Crowdsourcing
Machine Learning > Core Methods > Estimation

Keywords

crowdsourcing, statistical estimation, worker reliability, consensus methods, control questions, quality control, worker evaluation
