AdvDetectGPT: Detecting Adversarial Examples Using Large Vision-Language Models

Authors: Ming Zhang, Huayang Cao, and Cheng Qian
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 802-812
Keywords: Adversarial detection, Adversarial examples, Deep neural networks, LVLMs.

Abstract

Adversarial examples have proven to be a substantial threat to the secure application of deep neural networks, and adversarial detection plays a pivotal role in defending against adversarial attacks. While the underlying concept is straightforward, the practical realization of adversarial detection is non-trivial, frequently encountering challenges of universality and effectiveness. In this study, we leverage the powerful capabilities of large vision-language models (LVLMs) and develop AdvDetectGPT, a novel LVLM-based adversarial detector. AdvDetectGPT learns to identify adversarial examples directly from clean and adversarial instances, independent of the victim model's outputs or internal responses. Extensive experiments show that AdvDetectGPT significantly outperforms state-of-the-art baselines. AdvDetectGPT also exhibits robust generalization: it can detect adversarial examples crafted by novel attacks on new models, as well as those with customized perturbations distinct from the training set. Code is available at https://github.com/mingcheung/AdvDetectGPT.
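The abstract describes fine-tuning an LVLM to classify inputs as clean or adversarial from the images alone, without access to the victim model. A minimal sketch of how such instruction-tuning data and verdict parsing might be structured is below; the field names, prompt wording, and file paths are illustrative assumptions, not the paper's actual format.

```python
# Hypothetical sketch: formatting instruction-tuning samples for an
# LVLM used as a binary adversarial-example detector. All names and
# paths here are assumptions for illustration.

QUESTION = "Is this image a clean example or an adversarial example?"

def build_detection_sample(image_path: str, is_adversarial: bool) -> dict:
    """Pair one image with a binary clean/adversarial instruction-response."""
    return {
        "image": image_path,
        "conversations": [
            {"role": "user", "content": QUESTION},
            {"role": "assistant",
             "content": "adversarial" if is_adversarial else "clean"},
        ],
    }

def parse_detection_response(response: str) -> bool:
    """Map a free-form LVLM response to a detection verdict.

    Returns True if the model flags the input as adversarial.
    """
    return "adversarial" in response.lower()

# A tiny mixed dataset of clean images and attacked counterparts
# (hypothetical paths; real data would come from attacks such as PGD).
dataset = [
    build_detection_sample("clean/cat_001.png", is_adversarial=False),
    build_detection_sample("pgd/cat_001.png", is_adversarial=True),
]
```

At inference time, the detector would feed a suspect image with the same question to the fine-tuned LVLM and apply `parse_detection_response` to its answer; because the decision depends only on the input image, the scheme does not require the victim model's logits or internal activations.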