AgentSeer: Visualizing and Evaluating Temporal Actions in Agentic AI Systems

Ilham Wicaksono; Zekun Wu; Rahul Patel; Theo King; Adriano Koshiyama; Philip Colin Treleaven

AAAI 2026

Abstract

We present AgentSeer, an interactive observability framework for agentic AI systems. Unlike conventional tracing tools that expose raw spans or model-centric metrics, AgentSeer introduces a dual-graph decomposition constructed through a deterministic rule-based parser: a temporal action graph, where each prompt or tool invocation is represented as a distinct action, and a component graph capturing architectural relations among agents, tools, and memory modules. Beyond visualization, AgentSeer enables action-level red teaming, where jailbreak payloads are systematically attached to every action node (including agent messages, tool calls, and memory retrievals) to uncover vulnerabilities invisible to model-level testing. Our demonstration features a six-agent hierarchical testbed with interactive visualization and deployment-oriented safety evaluation applied directly on the same prompts and contexts, systematically revealing high-risk interactions, context-dependent vulnerabilities, and emergent behaviors. By combining structured decomposition, automated red teaming, and rule-based reliability, AgentSeer establishes a safety-first methodology for observability in multi-agent AI.
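
The dual-graph decomposition and action-level red teaming described above can be illustrated with a minimal Python sketch. This is not AgentSeer's implementation: the trace format, the `Action` record, and the functions `build_graphs` and `red_team_cases` are all hypothetical names introduced here, and linking actions purely in sequence is a simplifying assumption, since the abstract does not specify the parser's rules.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    """One node of the temporal action graph (hypothetical schema)."""
    index: int
    kind: str              # "agent_message", "tool_call", or "memory_retrieval"
    agent: str             # component that performed the action
    target: Optional[str]  # agent, tool, or memory module acted upon, if any
    content: str


def build_graphs(trace):
    """Deterministically parse a trace (a list of dicts) into two graphs.

    Returns (actions, temporal_edges, component_edges). Temporal edges link
    consecutive actions (an assumption); component edges record which
    component interacted with which other component.
    """
    actions, temporal_edges, component_edges = [], [], set()
    for i, event in enumerate(trace):
        action = Action(i, event["kind"], event["agent"],
                        event.get("target"), event["content"])
        actions.append(action)
        if i > 0:
            temporal_edges.append((i - 1, i))
        if action.target is not None:
            component_edges.add((action.agent, action.target))
    return actions, temporal_edges, component_edges


def red_team_cases(actions, payload):
    """Attach the same payload to every action node, one test case per node."""
    return [(a.index, a.content + "\n" + payload) for a in actions]


# Illustrative trace; the agent and tool names are made up for this example.
trace = [
    {"kind": "agent_message", "agent": "planner", "target": "researcher",
     "content": "Find recent papers."},
    {"kind": "tool_call", "agent": "researcher", "target": "web_search",
     "content": "query: agent observability"},
    {"kind": "memory_retrieval", "agent": "researcher", "target": "notes_store",
     "content": "fetch prior notes"},
]

actions, temporal_edges, component_edges = build_graphs(trace)
cases = red_team_cases(actions, "<PAYLOAD>")
```

Because each test case is keyed by an action index, a result can be traced back to the exact step and context in which the payload was injected, which is the property that distinguishes action-level testing from testing the model in isolation.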

Topics

Artificial Intelligence > Core AI > Agent Systems
Artificial Intelligence > Core AI > AI Safety
Artificial Intelligence > Core AI > Interpretability

Keywords

red teaming, multi-agent system, agentic AI, temporal action graph
