Transformer-Based Video-Structure Multi-Instance Learning for Whole Slide Image Classification

Yingfan Ma; Xiaoyuan Luo; Kexue Fu; Manning Wang

AAAI 2024

Abstract

Pathological images play a vital role in clinical cancer diagnosis, and computer-aided diagnosis on digital Whole Slide Images (WSIs) has been widely studied. The major challenge in applying deep learning models to WSI analysis is the huge size of WSIs, and existing methods struggle to combine end-to-end learning with proper modeling of contextual information. Most state-of-the-art (SOTA) methods use a two-stage strategy: a pre-trained model extracts features from small patches cut from a WSI, and these features are then fed into a classification model. Such methods cannot perform end-to-end learning and consider contextual information at the same time. To solve this problem, we propose a framework that models a WSI as a video of a pathologist's observation and uses a Transformer to process video clips with a divide-and-conquer strategy, which achieves both context-awareness and end-to-end learning. Extensive experiments on three public WSI datasets show that our proposed method outperforms existing SOTA methods in both WSI classification and positive region detection.
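
For readers unfamiliar with the two-stage strategy the abstract refers to, the following is a minimal sketch under stated assumptions, not the method proposed in the paper. It assumes a frozen patch encoder (stage one) and attention pooling with a logistic classifier (stage two), a common multi-instance learning baseline chosen here only for illustration. All function names, the toy encoder, and the weights are hypothetical. The helper `split_into_clips` only illustrates the general idea of chunking an ordered patch sequence into fixed-length clips; the paper's actual video construction is not described in the abstract.

```python
import math
from typing import Callable, List, Sequence

Feature = List[float]


def extract_features(patches: Sequence[Sequence[float]],
                     encoder: Callable[[Sequence[float]], Feature]) -> List[Feature]:
    """Stage 1 (assumed): apply a frozen, pre-trained encoder to every patch."""
    return [encoder(p) for p in patches]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features: Sequence[Feature], w_attn: Sequence[float]) -> Feature:
    """Stage 2a (assumed): weight patch features by softmax(w_attn . feature)."""
    scores = [sum(w * x for w, x in zip(w_attn, f)) for f in features]
    alphas = softmax(scores)
    dim = len(features[0])
    return [sum(a * f[d] for a, f in zip(alphas, features)) for d in range(dim)]


def classify_slide(features: Sequence[Feature], w_attn: Sequence[float],
                   w_cls: Sequence[float], bias: float = 0.0) -> float:
    """Stage 2b (assumed): logistic classifier on the pooled slide-level feature."""
    pooled = attention_pool(features, w_attn)
    logit = sum(w * x for w, x in zip(w_cls, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def split_into_clips(items: Sequence, clip_len: int) -> List[list]:
    """Chunk an ordered sequence into consecutive clips of at most clip_len items."""
    return [list(items[i:i + clip_len]) for i in range(0, len(items), clip_len)]


def toy_encoder(patch: Sequence[float]) -> Feature:
    """Hypothetical stand-in for a pre-trained network: mean and max intensity."""
    return [sum(patch) / len(patch), max(patch)]


if __name__ == "__main__":
    patches = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.4]]  # made-up toy patches
    feats = extract_features(patches, toy_encoder)
    print(classify_slide(feats, w_attn=[1.0, 1.0], w_cls=[2.0, 2.0], bias=-2.0))
    print(split_into_clips(list(range(5)), clip_len=2))
```

Because the encoder in this sketch is never updated by the slide-level loss, the pipeline is not end-to-end, which is the limitation the abstract attributes to two-stage methods.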

Authors

Yingfan Ma, Xiaoyuan Luo, Kexue Fu, Manning Wang

Topics

Machine Learning > Core Methods > Classification
Deep Learning > Architectures > Transformers
Computer Vision > Domain-Specific > Medical Imaging
Machine Learning > Learning Types > Multi-Instance Learning
Deep Learning > Models > Transformers

Keywords

multi-instance learning, medical image classification, whole slide image, digital pathology, pathological image, video-structure transformer, cancer diagnosis
