pLFL: A Lightweight Federated Learning Framework for Credit Risk Prediction

Authors: Yu Songsen, Yan Songlian, Qiu Miaosheng, and Pan Ming
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 3447-3464
Keywords: Federated Learning, Credit Risk, Lightweight, Risk Prediction

Abstract

Small- and medium-sized banks face three major obstacles in credit risk prediction: limited available data, high non-performing loan ratios, and stringent data privacy regulations. Federated learning (FL) offers a promising solution by enabling collaborative model training across multiple institutions without sharing sensitive data, thus safeguarding privacy while improving the accuracy of credit risk predictions. This study takes borrower default prediction as a practical application scenario for small- and medium-sized banks and introduces pLFL, a lightweight federated learning framework designed to optimize model performance. The framework integrates an enhanced tADA data preprocessing technique with an improved pFed aggregation algorithm to address these challenges. To evaluate pLFL, experiments were conducted on two real-world datasets. The results demonstrate substantial performance improvements: on the credit card dataset, the model's F1 score increased to 81.5%, with Precision reaching 91.5%. On the Lending Club Loan Data dataset, communication overhead was significantly reduced, and the global model's convergence rate increased to 1.8 times its original speed. Furthermore, pLFL incorporates parameter quantization and asynchronous communication strategies to minimize system resource consumption, underscoring its practicality for small- and medium-sized financial institutions. This research presents an efficient and privacy-preserving solution for credit risk prediction in the financial sector, particularly in scenarios requiring cross-institutional collaboration with heterogeneous data distributions.
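
The abstract names parameter quantization as one of the ways pLFL reduces communication cost but does not describe the scheme. As a rough illustration of the general idea only, the sketch below applies generic uniform min-max quantization to a list of model weights before transmission. The function names `quantize` and `dequantize`, the 8-bit default, and the min-max scaling are all assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple


def quantize(weights: List[float], bits: int = 8) -> Tuple[List[int], float, float]:
    """Map floats to integer codes in [0, 2**bits - 1] by uniform min-max scaling.

    Hypothetical helper: a generic scheme, not the paper's method.
    """
    levels = (1 << bits) - 1
    lo, hi = min(weights), max(weights)
    # Guard against a zero range when every weight is identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, lo, scale


def dequantize(codes: List[int], lo: float, scale: float) -> List[float]:
    """Recover approximate float weights from the integer codes."""
    return [lo + c * scale for c in codes]


if __name__ == "__main__":
    local_update = [-1.0, 0.0, 0.5, 1.0]  # toy values standing in for a client update
    codes, lo, scale = quantize(local_update)
    restored = dequantize(codes, lo, scale)
    worst = max(abs(a - b) for a, b in zip(local_update, restored))
    print(codes, f"max error = {worst:.6f}")
```

Under these assumptions a client would send the integer codes plus the two floats `lo` and `scale` instead of full-precision weights, trading a reconstruction error of at most half a quantization step per parameter for a smaller payload.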