User Privacy Leakage in Text-based Recommendation

Authors: Zhiyong Wu, Hongguang Chen, and Yin Chen
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 853-865
Keywords: Information Security, Recommendation System, Data Mining

Abstract

With the breakthrough development of large language models, recommendation systems are undergoing a transformation from traditional models based on the unique identity IDs of users and items (pure ID-based model, IDRec) to models integrated with pre-trained modality encoders (modality-based recommendation model, MoRec). This paradigm overcomes the cold-start problem and enables the recommendation model to achieve cross-platform migration through pre-training. However, the vectors produced by the modality encoder inherently contain more information. Given this, a natural question arises: does MoRec suffer from more severe security issues than IDRec, potentially causing serious leakage of user historical behavior data? We aim to explore this question by studying modality-based attack models in the textual domain. Specifically, we study several sub-questions: (i) which recommendation paradigm, T-MoRec (Textual MoRec) or IDRec, performs worse at protecting user privacy against the attack model? (ii) can the latest technical advances from NLP translate into attack improvements against T-MoRec? (iii) are there other factors that affect a textual recommendation attack model, and what is the proper setting for conducting the attack? To answer these questions, we propose the Text-based Recommendation Attack Model (TRAM) and conduct rigorous experiments on the textual modality. We provide the first empirical evidence, on two public datasets, MIND and EB-NeRD, that T-MoRec leads to serious leakage of user historical behavior data compared with IDRec under the same conditions. Additionally, we show that the leakage is consistently influenced by the hyperparameters and training cost of the textual recommendation attack model.
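The core contrast the abstract draws, an opaque learned ID embedding table (IDRec) versus a semantically rich pretrained text encoder (T-MoRec), can be made concrete with a short sketch. The code below is illustrative only and does not reproduce the paper's TRAM attack model; the class names, dimensions, and the bert-base-uncased encoder are assumptions, not details from the paper.

```python
# A minimal sketch (NOT the paper's TRAM implementation) contrasting how
# the two paradigms represent an item. All names here are hypothetical;
# T-MoRec is approximated with a generic pretrained text encoder.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class IDRecItemTower(nn.Module):
    """IDRec: each item is only a row in a learned embedding table,
    so the vector itself carries no semantic content about the item."""
    def __init__(self, num_items: int, dim: int = 64):
        super().__init__()
        self.table = nn.Embedding(num_items, dim)

    def forward(self, item_ids: torch.Tensor) -> torch.Tensor:
        return self.table(item_ids)  # (batch, dim)

class TMoRecItemTower(nn.Module):
    """T-MoRec-style: items are encoded from their text (e.g., a news
    title), so the vector carries semantics an attacker may exploit."""
    def __init__(self, encoder_name: str = "bert-base-uncased"):
        super().__init__()
        self.tok = AutoTokenizer.from_pretrained(encoder_name)
        self.enc = AutoModel.from_pretrained(encoder_name)

    def forward(self, titles: list[str]) -> torch.Tensor:
        batch = self.tok(titles, padding=True, truncation=True,
                         return_tensors="pt")
        out = self.enc(**batch)
        return out.last_hidden_state[:, 0]  # [CLS] vector, (batch, hidden)

id_tower = IDRecItemTower(num_items=100_000)
text_tower = TMoRecItemTower()
v_id = id_tower(torch.tensor([42]))  # meaning lives only in the private table
v_text = text_tower(["Stocks rally after rate cut"])  # semantically loaded
```

The design point this illustrates is the one the abstract argues: an attacker who observes `v_id` learns little without access to the embedding table, whereas `v_text` is grounded in a public pretrained encoder, which is what makes reconstructing user historical behavior from textual representations plausible.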