Towards Robust and Interpretable Event–Frame Fusion for Autonomous Driving

Dongyue Lu

AAAI 2026

Abstract

Autonomous driving must handle motion blur, low light, and fast-changing scenes, conditions in which RGB frames and event cameras offer complementary strengths. This thesis explores how to fuse the two modalities across the perception–reasoning–planning pipeline. It introduces FlexEvent, a frequency-robust detector with adaptive fusion and label-efficient training; Talk2Event, the first benchmark for event–language grounding with attribute-aware modeling; and EventDrive, an event–frame VLM covering the full driving loop. Together, these contributions advance robust perception, interpretable reasoning, and reliable planning for safety-critical driving through event–frame fusion.

Authors

Dongyue Lu

Topics

Artificial Intelligence > Core AI > Multimodal Learning

Computer Vision > Domain-Specific > Autonomous Driving

Keywords

event camera, autonomous driving, vision language model, rgb fusion, event-frame fusion
