A Simulation-Based Evaluation Framework for Interactive AI Systems and Its Application

Maeda F. Hanafi; Yannis Katsis; Martín Santillán Cooper; Yunyao Li

AAAI 2022

Abstract

Interactive AI (IAI) systems are increasingly popular as the human-centered AI design paradigm gains traction. However, evaluating IAI systems, a key step in building them, is particularly challenging because their output depends heavily on the actions users perform. Developers often have to rely on limited and mostly qualitative data from ad-hoc user testing to assess and improve their systems. In this paper, we present InteractEva, a systematic evaluation framework for IAI systems. We also describe how we have applied InteractEva to evaluate a commercial IAI system, leading to both quality improvements and better data-driven design decisions.

Topics

Artificial Intelligence > Core AI > Agent Systems
Artificial Intelligence > Core AI > Human-AI Interaction
Machine Learning > Application Areas > Efficient Computing
Machine Learning > Optimization & Theory > Evaluation
Machine Learning > Learning Types > Evaluation

Keywords

evaluation framework, human-centered AI, interactive AI, user simulation, quantitative evaluation, simulation-based evaluation, user testing
