2024 AAAI AAAI 2024

Inconsistency-Based Data-Centric Active Open-Set Annotation

Abstract

Abstract Active learning, a method to reduce labeling effort for training deep neural networks, is often limited by the assumption that all unlabeled data belong to known classes. This closed-world assumption fails in practical scenarios with unknown classes in the data, leading to active open-set annotation challenges. Existing methods struggle with this uncertainty. We introduce NEAT, a novel, computationally efficient, data-centric active learning approach for open-set data. NEAT differentiates and labels known classes from a mix of known and unknown classes, using a clusterability criterion and a consistency mea- sure that detects inconsistencies between model predictions and feature distribution. In contrast to recent learning-centric solutions, NEAT shows superior performance in active open- set annotation, as our experiments confirm. Additional details on the further evaluation metrics, implementation, and archi- tecture of our method can be found in the public document at https://arxiv.org/pdf/2401.04923.pdf.

🌉 Interdisciplinary Bridge — Artificial Intelligence and Deep Learning and Machine Learning
🧭 Keyword Pioneer — data-centric learning
🐝 Cross-Pollinator — Artificial Intelligence, Computer Science, Computer Vision, Data Science & Analytics, Deep Learning, Healthcare & Medicine, Interdisciplinary, Knowledge & Reasoning, Machine Learning, Mathematics & Optimization, Natural Language Processing, Reinforcement Learning, Robotics, Security & Privacy, Speech & Audio