When AI Difficulty Is Easy: The Explanatory Power of Predicting IRT Difficulty

Fernando Martínez-Plumed; David Castellano; Carlos Monserrat-Aranda; José Hernández-Orallo

AAAI 2022

Abstract

One of the challenges of artificial intelligence as a whole is robustness. Many issues, such as adversarial examples, out-of-distribution performance, and Clever Hans phenomena, as well as the wider areas of AI evaluation and explainable AI, have to do with the following question: did the system fail because the instance is hard, or because of something else? In this paper we address this question with a generic method for estimating instance difficulty based on item response theory (IRT) for a wide range of AI domains covering several areas, from supervised feature-based classification to automated reasoning. We show how to estimate difficulty systematically using off-the-shelf machine learning regression models. We illustrate the usefulness of this estimation for a range of applications.
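The idea of predicting difficulty with a regression model can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy response matrix, the single `features` value per instance, the log-odds proxy used in place of a fitted IRT difficulty, and the one-variable least-squares fit used in place of an off-the-shelf regressor are all assumptions made for illustration.

```python
import math

def irt_probability(ability, difficulty, discrimination=1.0):
    """Standard logistic IRT curve: chance that a system of a given ability solves an item."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

def proxy_difficulty(responses, eps=0.5):
    """Assumed stand-in for an IRT difficulty: negative log-odds of the smoothed success rate."""
    p = (sum(responses) + eps) / (len(responses) + 2 * eps)
    return -math.log(p / (1.0 - p))

def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical toy data: one row per instance, one column per AI system (1 = solved).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
# Hypothetical instance feature, e.g. the amount of noise in each instance.
features = [0.1, 0.4, 0.6, 0.9]

# Step 1: derive a difficulty value per instance from the systems' responses.
difficulties = [proxy_difficulty(row) for row in responses]

# Step 2: learn to predict difficulty from instance features alone.
slope, intercept = fit_line(features, difficulties)

# Step 3: estimate the difficulty of an unseen instance from its feature.
predicted = slope * 0.75 + intercept
```

With an estimate like `predicted` in hand, a failure on an instance predicted to be easy points to "something else" rather than to intrinsic difficulty, which is the distinction the abstract raises.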

Topics

Artificial Intelligence > Core AI > Interpretability
Machine Learning > Core Methods > Regression
Machine Learning > Learning Types > Evaluation
Machine Learning > Learning Types > Robustness

Keywords

model evaluation, item response theory, explainable AI, adversarial example, instance difficulty, AI evaluation
