Chinese Medical Named Entity Recognition Based on Pre-trained Models and Multi-channel Feature Fusion

Authors: Xueliang Geng, Shihua Wang, Tianle Gao, Li Zhang, Ming Jing, Jiguo Yu
Conference: ICIC 2024 Posters, Tianjin, China, August 5-8, 2024
Pages: -
Keywords: Named entity recognition · Pre-trained language model · Multi-feature fusion · Medical data mining.

Abstract

Chinese medical named entity recognition (CMNER) is the basis for processing Chinese medical data. However, Chinese medical text has a complex structure, with easily confused entity types, blurred entity boundaries, and irregular grammar, all of which make CMNER challenging. Current mainstream CMNER models usually extract features through a single feature channel, which cannot make full use of the characteristics of text vectors and leads to poor entity recognition. Therefore, this paper proposes a Chinese medical named entity recognition method based on a pre-trained model and multi-channel feature fusion. The RoBERTa-wwm pre-trained model is used to obtain word vectors of medical texts and enhance their semantic features. The word vectors are then passed through a BiGRU module, which extracts global features, and a CNN module, which extracts multi-scale local features, and the outputs are fused. Finally, the fused features, comprising enhanced semantic features, contextual global features, and multi-scale local features, are input into a Conditional Random Field (CRF) for entity recognition. Experiments on the Yidu-S4K dataset, the Chinese diabetes annotation dataset, and the cMedQANER dataset show that the proposed method outperforms many current models.
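
The final CRF step can be illustrated with a small Viterbi decoder. This is a minimal sketch, not the authors' implementation: it assumes the fused features have already been projected to per-token tag scores (emissions), and the tag set, the scores, the transition weights, and the name `viterbi_decode` are all hypothetical. Start and end transitions are omitted for brevity.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF score.

    emissions[t][tag] is the score of `tag` at position t (assumed to come
    from the fused features); transitions[(prev, cur)] is the score of moving
    from `prev` to `cur` (missing pairs default to 0.0).
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t - 1][tag]: best previous tag for `tag` at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical three-character disease mention, one score dict per character.
TAGS = ["O", "B-DISEASE", "I-DISEASE"]
EMISSIONS = [
    {"O": 0.1, "B-DISEASE": 1.0, "I-DISEASE": 0.9},
    {"O": 0.2, "B-DISEASE": 0.1, "I-DISEASE": 1.0},
    {"O": 0.6, "B-DISEASE": 0.0, "I-DISEASE": 0.5},
]
TRANSITIONS = {
    ("O", "I-DISEASE"): -10.0,         # an entity cannot start with I-
    ("B-DISEASE", "I-DISEASE"): 0.5,   # continuing an entity is favoured
    ("I-DISEASE", "I-DISEASE"): 0.5,
}

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
```

With these toy scores, picking the best tag for each character independently would label the last character `O`, whereas the transition scores lead the decoder to keep it inside the entity. This is the kind of boundary decision a CRF layer is meant to help with.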