MDTH: A Multi-Scale Deep Learning Network for Steel Surface Defect Detection with Trans-Ham Feature Fusion

Authors: Yihong Wu, Yuhao Guo, Chen Yang, and Chao Zhang
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 2749-2764
Keywords: Defect detection, Hybrid attention mechanism, MPDIoU loss function, Neural network.

Abstract

In steel surface defect detection, accurately identifying the various types of defects is essential. However, the diverse morphologies of defects and the complex backgrounds encountered in real-world industrial production pose significant challenges for existing object detection networks. To overcome these issues, this paper presents MDTH, a deep learning network based on YOLOv10 that integrates multi-scale deep convolutional feature extraction with Swin Transformer encoding through an enhanced hybrid attention mechanism (Trans-HAM). First, the Multi-Angle Perception and Depth-wise separable convolution module (MAPD) is employed to capture the edges and texture details of steel surfaces, effectively identifying minor defects. Second, the Trans-HAM module extracts more comprehensive and fine-grained feature information, enabling the model to focus simultaneously on both local details and global structures. Finally, MPDIoU is employed to optimize the overlap and shape matching of bounding boxes, improving the accuracy of defect localization. Experimental results on the NEU-DET and PKU-Market-PCB datasets show that the proposed MDTH model achieves an mAP@0.5 of 81.2 and 95.3, respectively, improving detection accuracy and outperforming commonly used models.
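The abstract names MPDIoU as the bounding-box regression loss but does not give its formula. The sketch below follows the commonly cited MPDIoU definition, in which IoU is penalized by the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image. This is an illustration under that assumption, not the authors' implementation: the function names `mpdiou` and `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the plain-Python form (rather than batched tensors) are all hypothetical choices made here.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2, and that img_w and img_h are the width
    and height of the input image (the normalizer in the usual definition).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (bx1 - ax1) ** 2 + (by1 - ay1) ** 2  # top-left corners
    d2_sq = (bx2 - ax2) ** 2 + (by2 - ay2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_pred, box_gt, img_w, img_h):
    """Loss form: zero for a perfect match, larger as the boxes diverge."""
    return 1.0 - mpdiou(box_pred, box_gt, img_w, img_h)
```

Because the corner-distance terms stay nonzero even when the boxes do not overlap, the loss still distinguishes near misses from far misses when IoU alone is zero. This matters for small defects, where a predicted box can easily miss the ground truth entirely.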