WACV 2026

PSA-MIL: A Probabilistic Spatial Attention-Based Multiple Instance Learning for Whole Slide Image Classification

Abstract

Whole Slide Images (WSIs) are high-resolution digital scans widely used in medical diagnostics. Due to their immense size, WSI classification is typically approached using Multiple Instance Learning (MIL), where a slide is partitioned into individual tiles, disrupting its spatial structure. Recent MIL methods often incorporate spatial context through rigid spatial assumptions (e.g., fixed kernels), which limit their ability to capture the intricate tissue structures crucial for an accurate diagnosis. To address this limitation, we propose Probabilistic Spatial Attention MIL (PSA-MIL), a novel attention-based MIL framework that integrates spatial context into the attention mechanism through learnable distance-decayed priors, formulated within a probabilistic interpretation of self-attention as a posterior distribution. This formulation enables dynamic inference of spatial relationships during training, eliminating the need for the predefined assumptions often imposed by previous approaches. Furthermore, we introduce a diversity loss that promotes complementary spatial representations across attention heads and a spatial posterior-pruning strategy that reduces computational cost for long WSI sequences while preserving performance. Extensive experiments across multiple datasets and tasks show that PSA-MIL outperforms current baselines and achieves state-of-the-art results with substantially lower computational overhead. Our code is available at https://github.com/SharonPeled/PSA-MIL.
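The idea of combining attention with a distance-decayed prior can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian form of the decay, the single attention head, and the names `spatial_prior_attention` and `sigma` are assumptions made here for illustration only. The sketch treats the scaled dot-product score as a log-likelihood, adds the log of a spatial prior that decays with the distance between tile coordinates, and normalizes the sum to obtain posterior-style attention weights. In a trainable model `sigma` would be a learnable parameter; here it is a plain number.

```python
import math


def spatial_prior_attention(q, k, v, coords, sigma):
    """Single-head attention reweighted by a distance-decayed prior.

    q, k, v : lists of n feature vectors (one per tile)
    coords  : list of n tile positions, e.g. [x, y]
    sigma   : decay scale of the assumed Gaussian prior (hypothetical name)
    """
    n, d = len(q), len(q[0])
    outputs, weights = [], []
    for i in range(n):
        log_post = []
        for j in range(n):
            # Content term: scaled dot product between query i and key j.
            score = sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            # Spatial term: log of an assumed Gaussian decay over tile distance.
            dist2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            log_prior = -dist2 / (2.0 * sigma ** 2)
            # Unnormalized log-posterior: log-likelihood plus log-prior.
            log_post.append(score + log_prior)
        # Numerically stable softmax over the log-posterior.
        m = max(log_post)
        unnorm = [math.exp(x - m) for x in log_post]
        z = sum(unnorm)
        w = [u / z for u in unnorm]
        weights.append(w)
        # Weighted sum of value vectors for tile i.
        outputs.append(
            [sum(w[j] * v[j][c] for j in range(n)) for c in range(len(v[0]))]
        )
    return outputs, weights


# Toy example: three tiles with identical features at different positions,
# so the attention weights depend only on the spatial prior.
feats = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
values = [[1.0], [2.0], [3.0]]
positions = [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]]
out, attn = spatial_prior_attention(feats, feats, values, positions, sigma=1.0)
```

With identical tile features, tile 0 attends more strongly to the nearby tile 1 than to the distant tile 2. As `sigma` grows the prior flattens and the weights approach ordinary softmax attention, whereas a small `sigma` concentrates attention on each tile's immediate neighborhood.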
