FSSNet: Frequency-Spatial Synergy Network for Universal Deepfake Detection

Authors: Zepeng Su and Yin Chen
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 3013-3024
Keywords: Deep learning, Deepfake detection, Multi-modal fusion.

Abstract

In this paper, we aim to develop a detector that can effectively identify previously unseen deepfake images, even with limited training data. Existing deepfake detection methods predominantly focus on a single modality: frequency-domain approaches leverage Fourier transforms to capture spectral information, while spatial-domain methods use convolutional networks to extract visual features. Relying on a single modality, however, limits the diversity of features that can be captured and results in poor generalization. To overcome this limitation, we propose a dual-stream network, FSSNet, which integrates a Scale-aware Bidirectional Cross-Attention (SBCA) module and an Adaptive Feature Fusion (AFF) module for comprehensive, dynamic multi-modal feature fusion. Experiments on deepfake images generated by eight unseen GAN models and ten unseen diffusion models demonstrate the superior performance of FSSNet and its robust generalization capability.
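The abstract's core idea is combining a frequency stream (Fourier-based) with a spatial stream and fusing them adaptively rather than with fixed weights. The paper's actual SBCA and AFF modules are not specified here, so the following is only a minimal stdlib-Python sketch of the general pattern: a naive DFT stands in for the frequency stream, raw magnitudes for the spatial stream, and a per-dimension sigmoid gate (the `gate_w` weights are hypothetical placeholders for learned parameters) performs a data-dependent fusion in the spirit of an adaptive fusion module.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Frequency-stream stand-in: naive DFT magnitudes of a 1-D signal."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n)]

def adaptive_fuse(freq_feat, spat_feat, gate_w):
    """Toy adaptive fusion: a sigmoid gate, conditioned on the features
    themselves, mixes the two streams per dimension. In a real network
    gate_w would be learned; here it is a fixed hypothetical parameter."""
    fused = []
    for f, s, w in zip(freq_feat, spat_feat, gate_w):
        g = 1.0 / (1.0 + math.exp(-w * (f - s)))  # data-dependent gate in (0, 1)
        fused.append(g * f + (1.0 - g) * s)       # convex combination of streams
    return fused
```

Because the gate output lies in (0, 1), each fused value is a convex combination of the two streams, so the fusion can interpolate smoothly between frequency-dominant and spatial-dominant evidence per feature dimension.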