CLUE: A High-Performance, Efficient, and Robust APT Detection Framework via Fine-Tuning Pretrained Transformer and Contrastive Learning

Authors: Wenzhuo Cui, Maihao Guo, Jingjing Feng, Shuyi Zhang, Zheng Liu, and Yu Wen
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 994-1008
Keywords: APT detection, Provenance graph, Contrastive learning, Pretrained Transformer

Abstract

In recent years, provenance graph-based approaches have become the standard for Advanced Persistent Threat (APT) detection and investigation. However, existing studies face several challenges: (1) the high computational cost of the training process makes it difficult to update the model in a timely manner, leading to delayed attack detection; (2) the imbalance in training data results in a scarcity of attack samples, which degrades model performance; and (3) high false positive rates hinder practical deployment in real-world settings. To address these challenges, we propose CLUE, a novel APT detection framework that enables high-quality, multi-granular detection. CLUE applies lemmatization to normalize sequence data extracted from provenance graphs and then directly fine-tunes a pretrained Transformer model, significantly reducing both training time and dependence on scarce attack data. Furthermore, CLUE incorporates contrastive learning to improve generalization in data-scarce scenarios by optimizing inter-sample distances while accelerating model convergence. Our evaluation of CLUE on 10 real-world APT attack scenarios shows that, compared to state-of-the-art methods, CLUE maintains superior detection performance while achieving a 7.4× reduction in average training time and requiring 45.2% less training data (particularly attack samples). These results validate CLUE's efficiency, robustness, and practical value in APT detection.
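To make the pipeline described above concrete, here is a minimal sketch (not the authors' code) of the two core steps: lemmatizing event-sequence tokens before encoding them with a pretrained Transformer, and fine-tuning with a supervised contrastive loss that pulls same-label samples together in embedding space. The encoder choice (bert-base-uncased), the normalize and supcon_loss helpers, and the toy event strings are all illustrative assumptions; the paper does not specify these details.

```python
# Illustrative sketch only, not CLUE's actual implementation.
# Assumptions: provenance events are serialized as whitespace-separated
# token strings, a generic pretrained Transformer stands in as the
# encoder, and a SupCon-style supervised contrastive loss is used.
import torch
import torch.nn.functional as F
from nltk.stem import WordNetLemmatizer  # needs nltk.download("wordnet")
from transformers import AutoModel, AutoTokenizer

_lemmatizer = WordNetLemmatizer()

def normalize(sequence: str) -> str:
    """Collapse inflected forms ('opened', 'opens' -> 'open') so that
    semantically identical events map to the same token sequence."""
    return " ".join(_lemmatizer.lemmatize(t.lower(), pos="v")
                    for t in sequence.split())

def supcon_loss(z: torch.Tensor, labels: torch.Tensor,
                temperature: float = 0.1) -> torch.Tensor:
    """Supervised contrastive loss: for each anchor, maximize the
    log-probability of its same-label samples under a softmax over
    cosine similarities with all other samples in the batch."""
    z = F.normalize(z, dim=1)
    sim = z @ z.T / temperature
    eye = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(eye, float("-inf"))  # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    per_anchor = (-(log_prob.masked_fill(~pos, 0.0)).sum(1)
                  / pos.sum(1).clamp(min=1))
    return per_anchor[pos.any(1)].mean()  # anchors with >=1 positive

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)

def embed(sequences):
    batch = tokenizer([normalize(s) for s in sequences],
                      padding=True, truncation=True, return_tensors="pt")
    return encoder(**batch).last_hidden_state[:, 0]  # [CLS] embedding

# One fine-tuning step on toy benign (0) / attack (1) event sequences.
seqs = ["process firefox opened /tmp/report.log",
        "process firefox reads /tmp/cache.db",
        "process sshd forked bash",
        "process sshd forks bash"]
labels = torch.tensor([0, 0, 1, 1])
loss = supcon_loss(embed(seqs), labels)
loss.backward()
optimizer.step()
```

Note how normalization makes "opened"/"reads" and "forked"/"forks" fall into the same lemma, so the encoder sees consistent token sequences regardless of surface form; the contrastive objective then only has to separate behavior classes, which is one plausible reason such a setup can tolerate fewer attack samples.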