TMN: Bridging Modality Gap via Transition Modality Network for Visible-Infrared Person Re-Identification

Authors: Mengzhe Wang and Yuhao Wang
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 1943-1959
Keywords: Visible-infrared person re-identification, Transition Modality, Feature interaction and fusion

Abstract

Visible-infrared person re-identification (VI-ReID) aims to match visible and infrared pedestrian images. However, due to the modality gap, VI-ReID faces serious technical challenges. Existing methods have made significant progress but still suffer from two limitations: unsupervised image generation methods are computationally intensive and may introduce additional noise, and feature-level alignment struggles to design effective loss functions for complex and abstract feature outputs, resulting in insufficient learning and constraint of the model. To address these issues, we propose a Transition Modality Network (TMN), which constructs a transitional modality between the two modalities, enabling early-stage cross-modality interaction at shallow network layers and thereby avoiding heavy computation and complex loss function design. First, the processed visible and infrared features are fed into the Visible-infrared Transition Modality Fusion (VI-TMF) module to construct the transition modality. Second, we embed the Grouped Spatial-Channel Excitation (GSCE) block into ResNet-50 for deep feature processing and extraction. Finally, we design a cross-modality bridging loss function to align the features of the three modalities. Through experiments on two benchmark datasets, TMN achieves Rank-1/mAP accuracy of 71.42%/65.91% on the SYSU-MM01 dataset and 92.14%/83.25% on the RegDB dataset, demonstrating that transition modality construction effectively bridges cross-modality discrepancies and establishes a novel paradigm for addressing the fundamental challenges in VI-ReID.
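The abstract names three components: the VI-TMF fusion module that builds the transition modality from shallow features, the GSCE block embedded in ResNet-50, and a cross-modality bridging loss aligning the three modalities. The PyTorch snippet below is a minimal sketch of how the first and third pieces could be wired together; the gated channel-wise fusion rule and the MSE-based bridging loss are illustrative assumptions, since the abstract does not specify the paper's actual formulations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VITMFSketch(nn.Module):
    """Hypothetical stand-in for the VI-TMF module: fuses shallow
    visible and infrared feature maps into a transition modality.
    The real fusion rule is not given in the abstract; a learned
    channel-wise gate is assumed here purely for illustration."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                        # pool both modalities' stats
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, f_vis: torch.Tensor, f_ir: torch.Tensor) -> torch.Tensor:
        # Per-channel gate deciding how much each modality contributes.
        g = self.gate(torch.cat([f_vis, f_ir], dim=1))
        return g * f_vis + (1.0 - g) * f_ir                 # transition-modality features


def bridging_loss_sketch(z_vis, z_trans, z_ir):
    """Hypothetical cross-modality bridging loss: pulls the visible and
    infrared embeddings toward the transition-modality embedding."""
    return F.mse_loss(z_vis, z_trans) + F.mse_loss(z_ir, z_trans)


if __name__ == "__main__":
    # Dummy shallow feature maps from the two modality branches.
    f_vis = torch.randn(4, 64, 72, 36)
    f_ir = torch.randn(4, 64, 72, 36)
    fusion = VITMFSketch(channels=64)
    f_trans = fusion(f_vis, f_ir)
    print(f_trans.shape)  # torch.Size([4, 64, 72, 36])
```

Fusing at shallow layers, as the abstract describes, lets cross-modality interaction happen before the deep (ResNet-50 + GSCE) stages, so no generative model or pixel-level translation is needed.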