DTS-YOLO: Enhancing Object Detection via Dynamic Routing, Texture Encoding, and Semantic Fusion

Authors: Shipeng Zheng, Sen Zhang, and Yulin Chen
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 3307-3323
Keywords: Object detection, multi-scale fusion, texture encoding, semantic attention, bounding box regression.

Abstract

To address limited feature representation, insufficient cross-scale fusion, and localization inaccuracies in complex scenes, we propose DTS-YOLO, a lightweight single-stage detector. It improves detection through dynamic feature aggregation, fine-grained texture encoding, and precise bounding box regression. Specifically, the Dynamic Route Enhanced Aggregation Module (DREAM) integrates multi-branch depthwise convolutions and lightweight Transformers to enrich multi-scale representations. To mitigate semantic inconsistency during fusion, Dynamic Cross-scale Feature Fusion (DCFF) combines Scale-aware Channel Attention Fusion (SCAF) and Intra-layer Feature Fusion Attention (IFFA) for enhanced semantic alignment. Additionally, edge and texture perception is reinforced via Sobel and Laplacian Pyramid modules. For robust localization, a novel Closed Complete IoU (CCIoU) loss introduces morphological closure operations to refine bounding box alignment under occlusion. Experiments on VisDrone2019 and DOTA-v1.5 (HBB) demonstrate consistent performance gains over the YOLO11 baseline, especially for small and dense objects in complex environments.
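For context on the regression loss, the sketch below computes the standard Complete IoU (CIoU) loss that the CCIoU name builds on. It is not the CCIoU loss itself: the abstract does not specify how the morphological closure term enters the formulation, so that part is deliberately omitted. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(box_a, box_b, eps=1e-9):
    """Standard CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Illustrative baseline only; the closure-based refinement of CCIoU
    is not described in the abstract and is not reproduced here.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Widths and heights of both boxes.
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter + eps
    iou = inter / union

    # Squared diagonal of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2) - (bx1 + bx2)) ** 2 / 4 + ((ay1 + ay2) - (by1 + by2)) ** 2 / 4

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    # Identical boxes give a loss near zero; disjoint boxes give a loss above one.
    print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

The center-distance term keeps the loss informative when the boxes do not overlap, which is the property that any occlusion-oriented extension of CIoU would be expected to retain.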