Enhancing Container Damage Detection with improved YOLOv5 Model: Integrating Swin Transformer

Abstract

Container damage detection is critical to the safety of port logistics transportation. Container damage is diverse and includes small-scale damage such as holes, dents, and scratches. Traditional object detection algorithms used for container damage detection suffer from low accuracy and high miss rates on small-scale objects. This paper proposes an improved YOLOv5 model for container damage detection based on the Transformer self-attention mechanism. To capture global and long-range relationships in damage images, two layers of Swin Transformer blocks are added to the YOLOv5 backbone network. The PANet in the YOLOv5 neck is replaced with BiFPN, which enhances the fusion of multi-scale features in damage images while reducing computational complexity and information loss. Furthermore, the Focaler-IoU loss function is used to improve the balance of features extracted from different samples in the dataset. The training set is clustered with the KMeans algorithm to obtain 9 initial anchor boxes better suited to the container damage dataset. Experimental results on the COCO dataset and the official Tianjin Port container damage dataset show that the improved model achieves an mAP of 95.4%, outperforming common object detection algorithms such as Fast-RCNN and YOLOv5.
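
The anchor-clustering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract only states that KMeans is applied to the training set to obtain 9 anchors; the distance measure used below (1 - IoU between width/height pairs, a common choice in the YOLO family), the function names `wh_iou` and `kmeans_anchors`, and the `iters` and `seed` parameters are assumptions made for illustration, not the authors' implementation.

```python
import random


def wh_iou(box, centroid):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors.

    Assumption: boxes are assigned to the centroid with the highest IoU
    (equivalently, the smallest 1 - IoU distance); the paper does not
    state which distance it uses.
    """
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # pick k distinct boxes as initial centroids
    for _ in range(iters):
        # Assignment step: each box joins the centroid it overlaps most.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, centroids[i]))
            clusters[best].append(box)
        # Update step: each centroid becomes the mean width/height of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    # Sort by area so the anchors run from small to large.
    return sorted(centroids, key=lambda c: c[0] * c[1])


if __name__ == "__main__":
    # Synthetic (width, height) pairs stand in for labelled damage boxes.
    rng = random.Random(1)
    boxes = [(rng.uniform(5, 300), rng.uniform(5, 300)) for _ in range(500)]
    for w, h in kmeans_anchors(boxes, k=9):
        print(round(w), round(h))
```

In a YOLOv5-style setup, the nine sorted anchors would typically be split into three groups of three, one group per detection scale, with the smallest anchors assigned to the highest-resolution feature map.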

BibTeX Citation:

@inproceedings{ICIC2024,
    author = {Jiahao Chen and Chen Dong and Yuxuan Wan},
    title = {Enhancing Container Damage Detection with improved YOLOv5 Model: Integrating Swin Transformer},
    booktitle = {Proceedings of the 20th International Conference on Intelligent Computing (ICIC 2024)},
    month = {August},
    date = {5-8},
    year = {2024},
    address = {Tianjin, China},
    pages = {670--685},
    note = {Poster Volume Ⅰ},
    doi = {10.65286/icic.v20i1.41531}
}