DMKAN-Net: A Dual-Modal Fusion and KAN Decoder Network for RGB-D Salient Object Detection

Authors: Qingbo Xue, Panpan Zheng, and Liejun Wang
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 529-545
Keywords: RGB-D, SOD, Dual-modal, KAN.

Abstract

At present, most RGB-D salient object detection (SOD) algorithms invest heavy computation in the encoding and feature-fusion stages. Admittedly, this strategy improves performance, but it neglects feature recovery and fitting ability in the decoding stage. We therefore design a dual-modal fusion and KAN decoder network, called DMKAN-Net, to better address the RGB-D SOD task. The network has only three parts: a dual-stream encoder, a dual-modal converter, and a KAN decoder. The dual-stream encoder adopts the Swin Transformer and is mainly used to extract multilevel and global features from the RGB and depth images. In the dual-modal fusion section, we design a dual-modal feature fusion module to capture the channel and spatial information of the two modalities and fuse them. The KAN decoder is mainly composed of KAN modules, which use nonlinear, learnable activation functions to better recover and predict salient objects. Moreover, experiments on five benchmark datasets show that our method achieves competitive results.
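The abstract does not specify the internal form of the KAN modules, but the general idea of a Kolmogorov-Arnold layer is that each edge carries its own learnable 1-D activation function rather than a fixed nonlinearity. A minimal sketch of such a layer is below, using a Gaussian radial-basis expansion as the learnable function family; the basis choice (the original KAN work uses B-splines), the function `kan_layer`, and all parameter shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def kan_layer(x, coeffs, centers, width=1.0):
    """One KAN-style layer: each edge (i, j) applies its own learnable
    1-D function phi_ij to input x_i, then the j-th output sums over i.

    x:       (in_dim,) input vector
    coeffs:  (out_dim, in_dim, n_basis) learnable coefficients
    centers: (n_basis,) fixed grid of basis-function centers

    NOTE: the Gaussian RBF basis here is an illustrative stand-in for
    whatever learnable activation family DMKAN-Net actually uses.
    """
    # basis[i, k] = exp(-((x_i - c_k) / width)^2): expand each input
    # coordinate over the fixed basis grid.
    basis = np.exp(-((x[:, None] - centers[None, :]) / width) ** 2)
    # phi_ij(x_i) = sum_k coeffs[j, i, k] * basis[i, k];
    # output_j    = sum_i phi_ij(x_i)
    return np.einsum('jik,ik->j', coeffs, basis)

# Tiny usage example with hand-set (not trained) coefficients.
x = np.array([0.0, 1.0])
centers = np.linspace(-1.0, 1.0, 3)        # grid at -1, 0, 1
coeffs = np.ones((1, 2, 3))                # one output, two inputs
y = kan_layer(x, coeffs, centers)
```

Because the coefficients are learned per edge, the layer can shape a different activation for every input-output pair, which is the property the abstract credits for better recovery of salient objects in the decoder.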