FLoMo-Net: A Novel Task-Adaptive Mixture of Experts Routing Framework with Frequency and Uncertainty Correction for Medical Image Segmentation
Abstract
Medical image segmentation (MIS) is challenged by anatomical variability, ambiguous boundaries, and subtle textures, and therefore requires models that balance fine local detail against global context efficiently. Existing architectures often fuse spatial and frequency-domain features suboptimally, which limits their ability to capture rich structural and textural representations. To address these challenges, we introduce FLoMo-Net, a modular MIS architecture organized around a principled route -> select -> refine -> correct pipeline: (1) the Local-Global Mixture of Experts encoder adaptively routes features across specialized convolutional branches to capture scale-appropriate context; (2) the Dual-Attention Selective Aggregator jointly selects informative channels and spatial regions using frequency-guided modulation; (3) at the bridge, the Frequency-Aware Multi-Scale Refinement module refines edges and textures through explicit low/high-frequency decomposition; and (4) the False Positive/Negative Corrective Attention Module uses uncertainty, derived from entropy and cosine dissimilarity, to produce a residual corrective mask that suppresses semantic drift and improves boundary delineation in the decoder stages. Across four MIS benchmarks, FLoMo-Net achieves better boundary-aware segmentation performance, faster inference, and a lower parameter count than prior state-of-the-art 2D MIS models. Code is publicly available at https://github.com/rayhan-ahmed91/FLoMo-Net.
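
To make the "correct" stage more concrete, the following is a minimal per-pixel sketch of how an uncertainty score could be built from entropy and cosine dissimilarity and then used to gate a residual correction. It is an illustration under stated assumptions, not the released implementation: the function names (`binary_entropy`, `cosine_dissimilarity`, `uncertainty`, `corrected_logit`), the equal weighting of the two cues, and the choice of which two feature vectors are compared are all hypothetical.

```python
import math


def binary_entropy(p, eps=1e-12):
    """Shannon entropy (in bits) of a foreground probability p; range 0..1."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def cosine_dissimilarity(u, v, eps=1e-12):
    """1 - cosine similarity between two feature vectors; range 0..2."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + eps)


def uncertainty(p, feat_a, feat_b):
    """ASSUMPTION: equal-weight average of the two cues, clipped to [0, 1].

    feat_a and feat_b stand for two feature vectors at the same pixel
    (for example decoder and skip features); this pairing is hypothetical.
    """
    h = binary_entropy(p)
    d = 0.5 * cosine_dissimilarity(feat_a, feat_b)  # rescale 0..2 to 0..1
    return min(max(0.5 * (h + d), 0.0), 1.0)


def corrected_logit(logit, correction, u):
    """Residual correction gated by uncertainty u.

    Confident pixels (u near 0) keep their logit almost unchanged;
    uncertain pixels (u near 1) receive the full correction.
    """
    return logit + u * correction
```

In this sketch, a pixel with a foreground probability near 0.5 and disagreeing features receives an uncertainty close to 1 and is corrected strongly, while a confidently classified pixel with consistent features is left nearly untouched, which is the behavior a residual corrective mask is meant to provide.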