SegMAE: A Dual Decoder Framework with Patch Wise Constraint for Skin Lesion Segmentation

Authors: Jiacheng Huang, Haozhe Li, Gexian Liu, Gao Wang, and Keming Mao
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 299-314
Keywords: Skin Lesion Segmentation, Patch-wise Loss, Hybrid Training Strategy, Dual Decoder Architecture.

Abstract

Skin lesion segmentation remains a challenging task in medical image analysis. Although Transformer-based segmentation models have achieved notable progress in recent years, they still suffer from limitations such as an imbalance between local and global modeling, single-task architectural design, and insufficient attention to critical regions. These issues hinder their segmentation performance on complex skin lesion images. To address these challenges, we propose SegMAE, a dual-decoder segmentation framework that integrates image reconstruction and segmentation tasks to jointly enhance the model’s understanding of both global context and local details. The model adopts a CNN-Transformer hybrid encoder, with an MAE decoder for reconstruction and a Cascaded Upsampler for segmentation. To enhance the model’s performance and generalization, we design a two-stage training strategy that begins with pretraining and then proceeds to hybrid multi-task training. In addition, we introduce a Patch-wise Loss function that adaptively emphasizes training on critical regions, thereby improving segmentation accuracy and robustness. Experimental results on ISIC2017, ISIC2018 and PH2 demonstrate that SegMAE consistently outperforms existing mainstream methods across multiple evaluation metrics, showcasing superior segmentation performance and strong generalization capability.
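
The abstract does not give the formula for the Patch-wise Loss, so the following is only an illustrative sketch of one way a loss could "adaptively emphasize" critical regions: it splits a per-pixel binary cross-entropy map into non-overlapping patches and weights each patch by a softmax over the patch losses, so poorly segmented patches contribute more. The function name `patchwise_bce`, the `patch` and `temperature` parameters, and the softmax weighting itself are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def patchwise_bce(pred, target, patch=4, temperature=1.0, eps=1e-7):
    """Hypothetical patch-weighted binary cross-entropy (not the paper's formula)."""
    pred = np.clip(pred, eps, 1.0 - eps)
    # Per-pixel binary cross-entropy, shape (H, W).
    bce = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    h, w = bce.shape
    assert h % patch == 0 and w % patch == 0, "H and W must be multiples of patch"
    # Mean loss of each non-overlapping patch, shape (H/patch, W/patch).
    per_patch = bce.reshape(h // patch, patch, w // patch, patch).mean(axis=(1, 3))
    # Softmax over patches: patches with higher loss receive larger weights.
    logits = per_patch / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return float((weights * per_patch).sum()), weights

# Usage: an 8x8 mask whose top-left 4x4 patch is lesion, with a prediction
# that calls everything background, so only that patch is badly wrong.
target = np.zeros((8, 8))
target[:4, :4] = 1.0
pred = np.full((8, 8), 0.1)
loss, weights = patchwise_bce(pred, target)
```

In this toy case the top-left patch receives the largest weight, so the weighted loss is higher than the plain mean of the per-patch losses. A larger `temperature` flattens the weights toward a uniform average.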