Adversarial data augmentation algorithm for imbalanced node classification
Indexing: OA; Peking University Core Journals (北大核心); CSTPCD
Graph Neural Networks (GNNs) have achieved notable success in node classification tasks. However, current GNN models tend to focus on majority classes with abundant labeled data and pay little attention to minority classes with fewer labels. Traditional methods often address this imbalance through oversampling, which can lead to overfitting. Some recent studies synthesize additional minority-class nodes from labeled nodes, yet there is no clear guarantee that the generated nodes truly represent the corresponding minority classes; in fact, incorrect synthetic nodes can undermine the algorithm's generalization ability. To address this problem, this paper introduces GraphA2, a simple self-supervised data augmentation method based on adversarial training. It augments the data by applying perturbations in the direction of steepest gradient ascent within the smooth space around minority classes, while using contrastive learning to enforce consistency after augmentation. This approach not only increases data diversity but also ensures smoothness and coherence across the entire space, thereby strengthening generalization. Experiments show that the proposed method outperforms current state-of-the-art baseline models on a variety of class-imbalanced datasets.
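The abstract describes gradient-directed adversarial perturbation of minority-class samples followed by a contrastive consistency check. The paper's actual GraphA2 formulation is not given here, so the following is only a minimal NumPy sketch of the general idea under simplifying assumptions: a toy linear scorer `w` stands in for the GNN, the loss gradient is computed analytically, and an FGSM-style sign-of-gradient step produces the "hard" augmented view; the function names `fgsm_perturb` and `cosine_sim` are illustrative, not from the paper.

```python
import numpy as np

def fgsm_perturb(x, w, y, eps=0.05):
    """FGSM-style adversarial augmentation of one feature vector.

    For the toy squared-error loss L = 0.5 * (w @ x - y)^2, the gradient
    w.r.t. x is (w @ x - y) * w. Stepping along the sign of that gradient
    moves x in the direction that most increases the loss, yielding a
    hard augmented sample in the neighborhood of the original.
    """
    grad = (w @ x - y) * w
    return x + eps * np.sign(grad)

def cosine_sim(a, b):
    """Cosine similarity, used here as a contrastive-style consistency
    score: the augmented view should remain close to the original."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
x = rng.normal(size=8)   # toy minority-class node feature
w = rng.normal(size=8)   # toy linear classifier weights
x_adv = fgsm_perturb(x, w, y=1.0)

print(cosine_sim(x, x_adv))
```

With a small `eps`, the perturbed sample provably incurs a higher loss than the original while staying near it, which is the property an adversarial augmentation step relies on; a contrastive objective would then pull the two views' representations together during training.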
CHENG Fengwei; WANG Wenjian; SHI Ying; ZHANG Zhenzhen
Department of Computer Science and Technology, Taiyuan University, Taiyuan 030032; Institute of Intelligent Information Processing, Shanxi University, Taiyuan 030006; Department of Network Security, Shanxi Police College, Taiyuan 030401; School of Computer Science and Technology, Taiyuan Normal University, Taiyuan 030619
Subject category: Computer Science and Automation
Keywords: graph neural networks; node classification; imbalanced data; oversampling; adversarial data augmentation
Journal of Nanjing University (Natural Science), 2024, No. 5
Pages 785-792 (8 pages)
Funding: National Natural Science Foundation of China (U21A20513, 62076154); Key R&D Program of Shanxi Province (202202020101003); Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi (2024L382)