Academic Institution Name Recognition Based on Representation Learning and Semantic Matching

Authors: Jinyu Wang and Zhijie Ban
Conference: ICIC 2024 Posters, Tianjin, China, August 5-8, 2024
Pages: 516-528
Keywords: Academic Institution Name Recognition, Representation Learning, Semantic Matching.

Abstract

Recognizing scholars’ institution names has been extensively studied as a step toward accurately parsing academic papers. Existing rule-based methods apply mainly when institution names are written in a regular style. Other approaches require a pre-built knowledge base for mapping institution names, which demands considerable human effort. This paper presents a method based on representation learning and semantic matching that primarily leverages the textual information of institutions and the structure of the academic network. We first construct an author-institution heterogeneous graph, on which maximal random walk and Word2Vec are used to obtain representation vectors for institution nodes. Then, we convert institution names into semantic vectors with the SimCSE model and generate institution candidate sets with the locality-sensitive hashing algorithm. Finally, to avoid having to set the number of clusters in advance, we propose a connected subgraph partitioning method to divide institutions into clusters. Experimental results on two real-world datasets demonstrate that our method significantly outperforms existing state-of-the-art recognition methods.
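
The final step, grouping institution names without fixing the number of clusters, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each name already has an embedding vector, links two names when their cosine similarity reaches a hypothetical threshold, and treats each connected component of the resulting graph as one institution. The function names, the threshold value, and the brute-force pair comparison (standing in for the candidate sets that locality-sensitive hashing would supply) are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_by_components(vectors, threshold=0.9):
    """Group names into connected components of a similarity graph.

    `vectors` maps each name to its embedding. Two names are linked when
    their cosine similarity is at least `threshold` (a hypothetical value).
    The number of clusters is never specified; it falls out of the graph.
    """
    names = list(vectors)
    parent = {name: name for name in names}  # union-find forest

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Assumption: all pairs are compared here; a real pipeline would only
    # compare pairs inside each candidate set.
    for a, b in combinations(names, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in clusters.values())
```

With toy two-dimensional vectors in which two spellings of one institution point in nearly the same direction, the two spellings end up in the same component while unrelated names stay apart. Because linkage is transitive, one spurious edge can merge two institutions, so the threshold would need tuning on real data.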