Spatiotemporal Transformers with Multiple Instance Learning for Label-Efficient Behavioral Analysis in Autism (Student Abstract)
Abstract
Identifying unique traits and behaviors is essential to providing personalized intervention for individuals with Autism Spectrum Disorder. However, the scarcity of personalized quantitative data with expert annotations in autism research poses a fundamental challenge to training AI models for unique behavioral pattern discovery. Multiple Instance Learning (MIL) has demonstrated promising results in medical domains, where annotations are needed only at the group level (i.e., a whole sequence) rather than for individual data instances. It therefore provides a cost-effective way to train statistical models with limited labeled data. Additionally, pre-trained models have shown great success in improving performance in few-shot learning scenarios. In this proof-of-concept study, we propose a novel framework that integrates a transformer encoder pre-trained on large-scale spatiotemporal data with MIL to detect unique behavioral patterns in autistic individuals. Our results demonstrate the discrimination of individual-level autistic behavioral differences and the accurate classification of behaviors across two distinct groups: typically developing (TD) and autistic (ASD). Beyond aggregate performance metrics, we highlight visual insights from temporal instance scores, which reveal interpretable differences between individuals within their respective groups. These results show promising progress towards tools for personalized intervention for autistic individuals and towards more interpretable AI diagnostics.
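As a rough illustration of the bag-level supervision that MIL relies on, the sketch below scores each instance in a sequence (for example, one time window) and pools those scores into a single sequence-level prediction. The linear scorer, the max-pooling rule, and all names and values are assumptions made for illustration only; the abstract does not specify how the proposed framework scores or aggregates instances.

```python
import math


def sigmoid(x):
    """Map a real-valued logit to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def score_instances(embeddings, weights, bias):
    """Score every instance in a bag with a hypothetical linear scorer.

    Each embedding stands in for the encoder output of one time window.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(weights, emb)) + bias)
        for emb in embeddings
    ]


def bag_score(instance_scores):
    """Max pooling (an assumed aggregation rule): the bag is as positive
    as its most positive instance."""
    return max(instance_scores)


# Toy bag of three instances with 2-dimensional embeddings (made-up values).
embeddings = [[0.1, -0.2], [2.0, 1.5], [-0.3, 0.0]]
weights = [1.0, 1.0]  # hypothetical scorer parameters
bias = -1.0

scores = score_instances(embeddings, weights, bias)
bag = bag_score(scores)          # compared against the single bag-level label
peak = scores.index(bag)         # index of the instance driving the prediction
```

Only `bag` would be compared with a label during training, yet the per-instance `scores` remain available afterwards, which is what makes temporal instance scores usable for inspection.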