High Utility Pattern Fusion by Pretrained Language Models for Text Classification

Authors: Yujia Wu, Hong Ren
Conference: ICIC 2024 Posters, Tianjin, China, August 5-8, 2024
Pages: 339-350
Keywords: Text Classification, Transformer Encoders, High Utility Pattern, Pre-trained Language Models

Abstract

In text classification, identifying correlation patterns among semantics remains a persistent challenge. To tackle this issue, we propose High Utility Pattern (HUP) fusion by Pretrained Language Models for Text Classification, which aims to improve text classification by learning correlation patterns among semantics within a shared embedding space. Specifically, HUP employs a Triplet Network architecture, which uses three distinct encoders to extract sample semantics, correlation-pattern information, and label semantics, respectively. We employ a high-utility itemset mining algorithm to extract correlation patterns of high utility, and by incorporating prompt templates into labels, the model can fully leverage the semantic knowledge embedded in pre-trained models. Finally, through joint training, the distance between a sample and its corresponding label is minimized, while the distance between the sample and labels not associated with it is maximized. Experiments on six standard text classification datasets show that HUP notably improves classification accuracy, with average accuracy increases ranging from 1.52% to 89.08%.
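To make the joint training objective concrete, here is a minimal PyTorch sketch of the triplet-style setup the abstract describes: the sample and pattern views are fused into the same space as the prompted-label embeddings, then a margin loss pulls each sample toward its own label and pushes it away from a non-associated one. This is an illustrative reconstruction, not the authors' released code; the `HUPFusion` module, the linear fusion layer, the cosine-distance margin loss, the 768-dimensional embeddings, and the margin of 0.5 are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HUPFusion(nn.Module):
    """Fuses the sample embedding with the mined-pattern embedding into the
    shared space where label embeddings live. In the paper each view comes
    from its own Transformer encoder; here the embeddings are taken as
    precomputed inputs."""

    def __init__(self, dim: int = 768):
        super().__init__()
        self.fuse = nn.Linear(2 * dim, dim)  # [sample; pattern] -> shared space

    def forward(self, sample_emb: torch.Tensor, pattern_emb: torch.Tensor) -> torch.Tensor:
        fused = self.fuse(torch.cat([sample_emb, pattern_emb], dim=-1))
        return F.normalize(fused, dim=-1)


def triplet_label_loss(fused: torch.Tensor,
                       pos_label: torch.Tensor,
                       neg_label: torch.Tensor,
                       margin: float = 0.5) -> torch.Tensor:
    """Minimize the distance between a sample and its own (prompted) label
    embedding while maximizing the distance to a non-associated label."""
    d_pos = 1.0 - F.cosine_similarity(fused, pos_label, dim=-1)
    d_neg = 1.0 - F.cosine_similarity(fused, neg_label, dim=-1)
    return F.relu(d_pos - d_neg + margin).mean()


if __name__ == "__main__":
    model = HUPFusion(dim=768)
    # Stand-ins for encoder outputs: sample text, HUP patterns, labels.
    sample, pattern = torch.randn(8, 768), torch.randn(8, 768)
    pos, neg = torch.randn(8, 768), torch.randn(8, 768)
    fused = model(sample, pattern)
    loss = triplet_label_loss(fused, F.normalize(pos, dim=-1), F.normalize(neg, dim=-1))
    print(f"triplet loss: {loss.item():.4f}")
```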