Mixed feature processing model for Few-Shot Object Detection

Authors: Qian Jiang, Shuting Li, Dakai Sun, Biaohua Liu, Shengfa Miao, Yu Wang, Xin Jin
Conference: ICIC 2024 Posters, Tianjin, China, August 5-8, 2024
Pages: 147-160
Keywords: Few-Shot Object Detection, Feature Pyramid Network, Variational Autoencoder

Abstract

Traditional object detection methods typically require large-scale annotated training data. In some domains, however, acquiring a large amount of annotated data can be extremely challenging. To address Few-Shot Object Detection (FSOD), researchers have introduced meta-learning, which is now widely applied in two-stage object detection. We identify several key issues that affect FSOD accuracy: limited data, insufficient feature extraction capability, and the method used to aggregate different features. To extract features at a finer level and aggregate them more effectively, we separate the support branch and the query branch of Meta-RCNN into two parallel branches, forming a mixed feature processing model for few-shot object detection. We add a Feature Pyramid Network (FPN) only to the backbone network of the query branch, creating a strong baseline that enhances feature extraction for images of different dimensions. Additionally, for the first time in FSOD, we use a Variational Autoencoder (VAE) to extract features: adding the VAE to the support branch obtains more useful information from the support set, which achieves data augmentation and improves the generalisation ability of the network. We also design a module $R$ that aggregates the output support image features with the query image features on the query branch; the aggregated results are fed into the detection head. Following the standard experimental settings for FSOD, we conducted extensive experiments on the PASCAL VOC dataset, which show that our method outperforms other currently available methods.
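The abstract does not specify how the VAE or the module $R$ are implemented, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' method. It shows the standard VAE reparameterization step applied to a support feature vector, followed by a channel-wise sigmoid reweighting of a query feature vector, used here as a hypothetical stand-in for $R$. All function names (`reparameterize`, `kl_to_standard_normal`, `aggregate`) and the plain-list feature representation are assumptions made for illustration.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), per dimension.

    This is the standard VAE reparameterization trick; mu and log_var
    would come from an encoder applied to support features (not shown).
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL divergence between N(mu, sigma^2) and N(0, 1), summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def aggregate(query_feat, support_latent):
    """Hypothetical stand-in for module R (an assumption, not from the paper).

    Reweights each query channel by a sigmoid gate computed from the
    corresponding support latent channel.
    """
    if len(query_feat) != len(support_latent):
        raise ValueError("query and support vectors must have the same length")
    return [q * sigmoid(s) for q, s in zip(query_feat, support_latent)]


if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed so the sketch is repeatable
    mu = [0.5, -1.0, 2.0]           # toy encoder mean for one support class
    log_var = [0.0, 0.0, 0.0]       # toy encoder log-variance (sigma = 1)
    z = reparameterize(mu, log_var, rng)
    fused = aggregate([1.0, 2.0, 3.0], z)
    print(len(fused), kl_to_standard_normal(mu, log_var))
```

In a real detector the vectors would be feature maps produced by the support and query backbones, and the fused result would be passed to the detection head; sampling several latents per support image is one way a VAE can act as feature-level data augmentation.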