BGE-YOLO: An Improved YOLOv8 for Chinese Handwritten Text Detection

Authors: Rui Xiong, Shiyu Li, Xiuwei Yang, and Zhiwu Liao
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 2966-2983
Keywords: handwritten text, text detection, BGE-YOLO, BiFPN, Global Attention Module, EMA, self-constructed dataset.

Abstract

Handwritten text detection is a crucial step in converting handwritten text images into editable text. In practical applications, however, text detection still faces numerous challenges, including complex environmental backgrounds, diverse target scales, and contact between complex characters. To address these challenges, this paper proposes BGE-YOLO, a model for handwritten text detection. First, a new feature fusion module is designed to achieve bidirectional information flow through cross-scale connections and rapid planning, ensuring effective integration of features across multiple scales. On this basis, a Global Attention Mechanism (GAM) is incorporated, which reduces information loss and amplifies the interaction of global dimensional features, enabling the model to extract meaningful information from complex backgrounds. Additionally, the incorporated Multi-Scale Attention (EMA) module uses a novel cross-spatial learning approach, enhancing the interaction of local features and further improving feature fusion efficiency. Furthermore, a data augmentation strategy enriches the self-constructed handwritten text image dataset, improving the model's generalization ability. Experimental results indicate that, compared to the YOLOv8 model, the mAP50 and precision (P) of this model increase by 2.8 and 3.9, respectively. This validates the advantages of the BGE-YOLO model in handwritten text detection and facilitates more convenient information extraction from handwritten text.
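
The cross-scale fusion idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the fusion module follows the weighted fusion commonly associated with BiFPN (listed in the keywords), where each input feature map gets a learnable non-negative weight and the weights are normalized by their sum. The function name, the flat-list representation of feature maps, and the `eps` value are illustrative assumptions, not details taken from the paper.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-sized feature maps with normalized non-negative weights.

    features: list of feature maps, each a flat list of floats (assumed
              already resized to a common resolution).
    weights:  one scalar per feature map (learnable in a real network).
    eps:      small constant to avoid division by zero (assumed value).
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")

    # Clip negative weights to zero (a ReLU) so every contribution is non-negative.
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped) + eps

    # Weighted sum at each position, divided by the sum of the weights.
    fused = []
    for values in zip(*features):
        fused.append(sum(w * v for w, v in zip(clipped, values)) / total)
    return fused


# Two toy "feature maps" from different scales, fused with equal weights.
shallow = [1.0, 2.0]
deep = [3.0, 4.0]
print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

With equal weights the result is close to the element-wise mean of the inputs; a weight that training drives to zero or below simply removes that scale from the fused output.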