From ImageNet to Image Classification: Contextualizing Progress on Benchmarks

Dimitris Tsipras; Shibani Santurkar; Logan Engstrom; Andrew Ilyas; Aleksander Madry

ICML 2020

Abstract

Building rich machine learning datasets in a scalable manner often necessitates a crowd-sourced data collection pipeline. In this work, we use human studies to investigate the consequences of employing such a pipeline, focusing on the popular ImageNet dataset. We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset—including the introduction of biases that state-of-the-art models exploit. Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for. Finally, our findings emphasize the need to augment our current model training and evaluation toolkit to take such misalignment into account.

Topics

Machine Learning > Core Methods > Classification
Machine Learning > Optimization & Theory > Statistical Learning
Machine Learning > Application Areas > Fairness

Keywords

benchmark evaluation, dataset bias, data collection
