AAAI 2026

Hallucinations at the Firewall

Abstract

Generative AI shows strong capabilities in language, reasoning, and code but remains prone to hallucinations: outputs that are fluent yet incorrect. In cybersecurity, such errors pose serious risks, from misleading analysts to enabling adversarial exploitation. This project investigates hallucinations along three directions: (1) creating benchmarks and interpretability tools to characterize them in security contexts; (2) developing mitigation strategies such as retrieval-augmented generation, symbolic-neural hybrids, and uncertainty-aware decoding; and (3) integrating these methods into real-world workflows such as vulnerability assessment, malware analysis, and penetration testing, while exploring how attackers might exploit hallucinations. Evaluation will combine accuracy metrics, human-in-the-loop studies, and red-team simulations. By bridging theory and applied system design, the work aims to advance the understanding of hallucinations and improve the reliability of AI in cybersecurity, with broader implications for other high-stakes areas such as healthcare and law.
