Cause-based Supervised Contrastive Learning with Adversarial Sample-label for Multimodal Emotion Recognition in Conversation

Authors: Yujie Guan, Yu Wang, Weijie Feng, Tingyi Li, and Xiao Sun
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 882-897
Keywords: Class Imbalance, Supervised Contrastive Learning, Emotion Recognition in Conversation.

Abstract

Multimodal Emotion Recognition in Conversation (MERC) holds significant importance in Natural Language Processing due to its broad range of applications. However, existing methods still struggle with highly imbalanced classes and with extracting robust representations for complex conversational scenarios, which reduces the generalization ability of the model and prevents it from effectively recognizing minority emotion classes. To address these challenges, a cause-based supervised contrastive learning framework with adversarial sample-label (CaSCLA) is proposed in this paper. Specifically, we employ a modality balancing technique to fuse the multimodal features, which are then fed into a novel causal-aware network to effectively capture the underlying causal relationships within dialogues. In addition, a supervised contrastive learning method with adversarial sample-label pairs is proposed to alleviate the class imbalance problem by learning label representations and optimizing the similarity between sample features and label embeddings. Furthermore, CaSCLA applies an adversarial sample training strategy, constructing additional positive sample-label pairs to enhance the diversity of the data and increase the robustness of the model. Extensive experiments on the IEMOCAP and MELD benchmark datasets demonstrate that CaSCLA achieves competitive performance.
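
The sample-label idea described above can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not give the loss, so the dot-product similarity, the temperature `tau`, the FGSM-style perturbation with step size `eps`, and all function names below are assumptions chosen for illustration. The sketch pulls a sample feature toward the embedding of its own label and away from the other label embeddings, then builds a perturbed copy of the feature that is paired with the same label as an extra positive.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sample_label_loss(z, label, label_embs, tau=0.5):
    # Assumed loss: -log softmax of the similarity between the sample
    # feature z and its own label embedding, against all label embeddings.
    probs = softmax([dot(z, e) / tau for e in label_embs])
    return -math.log(probs[label])

def loss_grad(z, label, label_embs, tau=0.5):
    # Analytic gradient of the loss above with respect to z:
    # (sum_c p_c * e_c - e_label) / tau.
    probs = softmax([dot(z, e) / tau for e in label_embs])
    grad = []
    for d in range(len(z)):
        expected = sum(p * e[d] for p, e in zip(probs, label_embs))
        grad.append((expected - label_embs[label][d]) / tau)
    return grad

def adversarial_sample(z, label, label_embs, eps=0.1, tau=0.5):
    # FGSM-style step (an assumption): move z along the sign of the
    # loss gradient to obtain a harder sample with the same label.
    g = loss_grad(z, label, label_embs, tau)
    return [x + eps * ((gx > 0) - (gx < 0)) for x, gx in zip(z, g)]

def total_loss(z, label, label_embs, eps=0.1, tau=0.5):
    # Clean pair plus the additional adversarial positive sample-label pair.
    z_adv = adversarial_sample(z, label, label_embs, eps, tau)
    return (sample_label_loss(z, label, label_embs, tau)
            + sample_label_loss(z_adv, label, label_embs, tau))

# Hypothetical 2-D label embeddings for three emotion classes.
label_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
z = [0.8, 0.1]  # a fused utterance feature, assumed to belong to class 0
clean = sample_label_loss(z, 0, label_embs)
adv = sample_label_loss(adversarial_sample(z, 0, label_embs), 0, label_embs)
print(clean, adv, total_loss(z, 0, label_embs))
```

Because each class contributes one label embedding regardless of how many samples it has, a loss of this shape gives minority classes a fixed anchor to align with, which is one plausible reading of how learning label representations helps with class imbalance. In a real model the label embeddings and features would be learned jointly and the perturbation would typically be applied through automatic differentiation.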