Vision Mamba UNet+: an Improved Multi-Organ Segmentation Method Based on State-Space Model

Authors: Song Shen, Haohan Ding, Xiaohui Cui, Yicheng Di, Long Wang, and Wancheng He
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 315-326
Keywords: medical image segmentation, Vision Mamba, state space models

Abstract

In the domain of multi-organ segmentation for medical imaging, considerable advancements have been achieved through the application of Convolutional Neural Networks (CNNs) and Transformer-based architectures. While CNNs excel at local feature extraction, their inherently limited receptive fields restrict their capacity to capture global context. Conversely, Transformers, with their ability to model global dependencies, offer superior performance in this regard, but their computational demands, particularly for high-resolution medical images, present significant challenges. To address these limitations, this study proposes Vision Mamba UNet+, an optimized architecture rooted in the Mamba framework. Vision Mamba UNet+ effectively balances the extraction of both local and global information while substantially reducing computational overhead. The model leverages components from the VMamba and Vision Mamba encoders, structured around a 'U'-shaped encoder-decoder framework that incorporates skip connections and multi-scale feature fusion to maximize performance. Experimental evaluations on the Synapse dataset demonstrate that Vision Mamba UNet+ achieves superior computational efficiency and segmentation accuracy, underscoring its promise for complex medical image segmentation tasks.
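To make the architectural description concrete, the sketch below shows a minimal 'U'-shaped encoder-decoder with skip connections and multi-scale feature fusion, assuming PyTorch. It is not the authors' implementation: the `MambaLikeBlock`, the channel widths, and `num_classes` are hypothetical placeholders, and the real Vision Mamba UNet+ would use VMamba / Vision Mamba selective-scan (state-space) blocks in place of the simple gated convolutional block used here.

```python
# Minimal sketch of a U-shaped encoder-decoder with skip connections (assumption: PyTorch).
# MambaLikeBlock is a hypothetical stand-in for the paper's VMamba / Vision Mamba blocks.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MambaLikeBlock(nn.Module):
    """Placeholder block: depthwise conv for local context plus a gated pointwise
    mixing layer. The actual model would use selective-scan state-space layers."""

    def __init__(self, channels: int):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.gate = nn.Conv2d(channels, channels, 1)
        self.mix = nn.Conv2d(channels, channels, 1)
        self.norm = nn.GroupNorm(1, channels)

    def forward(self, x):
        h = self.norm(x)
        h = self.mix(F.silu(self.dw(h)) * torch.sigmoid(self.gate(h)))
        return x + h  # residual connection


class VisionMambaUNetSketch(nn.Module):
    """U-shaped encoder-decoder with skip connections and multi-scale fusion."""

    def __init__(self, in_ch=1, num_classes=9, widths=(32, 64, 128, 256)):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, widths[0], 3, padding=1)
        # Encoder: one block per scale, followed by strided-conv downsampling.
        self.enc = nn.ModuleList([MambaLikeBlock(w) for w in widths])
        self.down = nn.ModuleList(
            [nn.Conv2d(widths[i], widths[i + 1], 2, stride=2) for i in range(len(widths) - 1)]
        )
        # Decoder: upsample, fuse with the matching encoder skip, then refine.
        self.up = nn.ModuleList(
            [nn.ConvTranspose2d(widths[i + 1], widths[i], 2, stride=2)
             for i in reversed(range(len(widths) - 1))]
        )
        self.dec = nn.ModuleList(
            [MambaLikeBlock(widths[i]) for i in reversed(range(len(widths) - 1))]
        )
        self.head = nn.Conv2d(widths[0], num_classes, 1)

    def forward(self, x):
        x = self.stem(x)
        skips = []
        for i, block in enumerate(self.enc):
            x = block(x)
            if i < len(self.down):
                skips.append(x)          # keep the pre-downsampling feature map
                x = self.down[i](x)
        for up, dec, skip in zip(self.up, self.dec, reversed(skips)):
            x = dec(up(x) + skip)        # multi-scale fusion via the skip path
        return self.head(x)


if __name__ == "__main__":
    model = VisionMambaUNetSketch()
    logits = model(torch.randn(1, 1, 224, 224))
    print(logits.shape)  # torch.Size([1, 9, 224, 224])
```

The additive fusion of upsampled decoder features with encoder skips is only one simple choice; concatenation followed by a projection, as in the original U-Net, would be an equally plausible reading of "multi-scale feature fusion" and the paper's exact mechanism is not specified in the abstract.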