TUPL: Text-guided Unknown Pseudo-labeling for Open World Object Detection

Authors: Xuefei Wang; Guangtai Ding
Conference: ICIC 2024 Posters, Tianjin, China, August 5-8, 2024
Pages: 127-144
Keywords: Open world object detection, Cross-modal learning, Pseudo-labeling.

Abstract

Open-world object detection (OWOD) aims to improve the performance of a trained model in real-world environments. A crucial requirement is to detect unknown objects even when only a limited set of known-class labels is available. Existing OWOD models that rely on objectness scores to select unknown pseudo-label candidates often perform sub-optimally because of a bias toward known classes. Text naturally carries high-level semantic information, so we incorporate textual data into the process of generating pseudo-labels for unknown-class objects. Combining this with random selection to further mitigate the bias problem, we present a concise yet powerful text-guided pseudo-labeling approach for OWOD, named TUPL. To further help the model distinguish all foreground objects in an image, we design an ROI feature refinement module that assists the model in learning distinctive foreground features. Experiments conducted on the PASCAL VOC and MS-COCO evaluation benchmarks demonstrate TUPL’s exceptional open-world detection capability. Specifically, under the OWOD SPLIT setting, TUPL achieves an Unknown Recall (UR) of 23.1, which is at least twice as high as that of existing pseudo-labeling methods based on objectness scores.