From Cognitive Priors to Instance Semantics: A Unified Framework for Multi-task Affective Computing
Abstract
Understanding human affect via Valence-Arousal, Expressions, and Action Units is essential for human-machine interaction. While recent multi-task learning (MTL) methods seek to unify these tasks, they overlook three key challenges: (i) the absence of unified modeling of all three affective task types (regression, detection, and classification); (ii) reliance on complete annotations for all tasks, which leaves disjoint single-task datasets underutilized; and (iii) task conflicts caused by noisy gradients, Negative Transfer (NT), and Task-specific Performance Misalignment (TPM). We introduce COIN, a novel two-stage MTL framework that bridges Cognitive Priors and Instance Semantics for robust training. In the first stage, we design a cognitively guided cross-task label induction strategy that propagates supervision under sparse annotations and mitigates NT, yielding strong task-specific CogXperts. In the second stage, we introduce two complementary branches to address TPM: a Task-Specific Branch, which transfers cognitive knowledge from task-optimal CogXperts to jointly optimize objectives under partial supervision, and a Semantic Alignment Branch, which enhances instance-level semantic representations via Class-Conditioned and Instance-Adaptive Prompts. Experiments across six diverse datasets demonstrate COIN's robustness and generalization.