Prototype-based Bilevel Knowledge Distillation for Online Continual Learning

Authors: Xiaochen Yang
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 635-649
Keywords: Continual Learning, Knowledge Distillation, Prototype Learning.

Abstract

In Online Continual Learning (OCL), all samples arrive sequentially and are seen only once, posing the challenge of balancing the learning of new tasks with the retention of old-task knowledge. Traditional methods often neglect the protection of previously learned knowledge while learning new tasks, leading to catastrophic forgetting. Conversely, methods that focus on minimizing forgetting of previous knowledge tend to hinder the model's ability to learn new knowledge effectively. To balance learning new tasks and preserving old knowledge, we propose a new framework, Prototype-based Bilevel Knowledge Distillation (PBKD). By incorporating hierarchical prototypes and a bilevel distillation mechanism, PBKD enhances the model's ability to distinguish between classes through personalized feature representations and dynamically adjusts the knowledge transfer between teacher and student models. This approach enables effective retention of old-task knowledge while improving the model's capacity to learn new tasks. Extensive experimental results demonstrate that PBKD achieves a more favorable combination of accuracy and forgetting rate on three benchmark datasets, validating its effectiveness in balancing learning and forgetting in OCL.
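To make the idea of prototype-based distillation concrete, the sketch below shows a generic prototype-matching distillation loss in PyTorch. This is an illustrative assumption, not the paper's actual implementation: the hierarchical prototype structure and the bilevel weighting described in the abstract are simplified here to running-mean class prototypes and a fixed balance weight `alpha`; all function names and parameters are hypothetical.

```python
# Illustrative sketch only (assumed formulation, not the PBKD implementation):
# per-class prototypes kept as running means, combined with a feature-level
# distillation term between a frozen teacher and a trainable student.
import torch
import torch.nn.functional as F


def update_prototypes(prototypes, counts, features, labels):
    """Update per-class prototypes with a running mean of feature vectors."""
    for f, y in zip(features, labels):
        y = int(y)
        counts[y] += 1
        prototypes[y] += (f.detach() - prototypes[y]) / counts[y]
    return prototypes, counts


def prototype_distillation_loss(student_feats, teacher_feats, labels,
                                prototypes, alpha=0.5, tau=2.0):
    """Combine teacher-student distillation with a prototype-matching term.

    - The first term pulls student features toward the frozen teacher's.
    - The second term pulls student features toward their class prototypes,
      which helps preserve the geometry of previously learned classes.
    `alpha` stands in for the (here fixed) balance between the two terms.
    """
    # Feature-level distillation on temperature-softened representations.
    kd = F.kl_div(
        F.log_softmax(student_feats / tau, dim=1),
        F.softmax(teacher_feats.detach() / tau, dim=1),
        reduction="batchmean",
    ) * tau ** 2

    # Prototype term: cosine distance to each sample's class prototype.
    proto = prototypes[labels]
    proto_loss = (1 - F.cosine_similarity(student_feats, proto, dim=1)).mean()

    return alpha * kd + (1 - alpha) * proto_loss


if __name__ == "__main__":
    num_classes, feat_dim, batch = 10, 64, 32
    prototypes = torch.zeros(num_classes, feat_dim)
    counts = torch.zeros(num_classes)

    feats_s = torch.randn(batch, feat_dim, requires_grad=True)  # student features
    feats_t = torch.randn(batch, feat_dim)                       # teacher features
    labels = torch.randint(0, num_classes, (batch,))

    prototypes, counts = update_prototypes(prototypes, counts, feats_t, labels)
    loss = prototype_distillation_loss(feats_s, feats_t, labels, prototypes)
    loss.backward()
    print(f"loss = {loss.item():.4f}")
```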