Journal of Shenzhen University Science and Engineering (深圳大学学报(理工版)), 2026, Vol. 43, Issue 3: 338-346, 9. DOI: 10.3724/SP.J.1249.2026.03338
NBEB-PUL: a naïve Bayesian ensemble-based algorithm for positive and unlabeled learning
Abstract
Existing ensemble-based positive and unlabeled learning (PUL) methods often fail to adequately account for the impact of noise in both positive and unlabeled samples during classifier construction. To address this issue, this paper proposes a naïve Bayesian ensemble-based PUL algorithm (NBEB-PUL), which consists of two stages: label assignment and noise filtering. In the label assignment stage, NBEB-PUL employs naïve Bayesian classifiers as base learners and integrates them using the AdaBoost ensemble strategy to construct a strong classifier, named Ada-NBC. This strong classifier is then used to compute the mean posterior probabilities of a validation set for unlabeled samples, enabling explicit modeling of labeling uncertainty. Based on low-uncertainty predictions, a set of reliable positive samples is iteratively identified, along with a residual unlabeled sample set. In the noise filtering stage, NBEB-PUL uses the ensemble classifier generated in the first stage to reclassify the reliable positive set and the residual unlabeled set, yielding Ada-NBC-predicted positive and negative sample sets. These sets are then intersected with the first-stage positive and unlabeled sets to extract overlapping high-confidence samples, which form the final positive and negative sample sets. The samples pruned from the dataset during this process are identified as noise. The feasibility, rationality, and effectiveness of NBEB-PUL were validated on 23 benchmark datasets from UCI and KEEL. Experimental results show that the algorithm converges stably as the number of training iterations increases. Moreover, NBEB-PUL outperforms six state-of-the-art PUL algorithms (S-EM, Biased-SVM, Modified-PUL, PU-LP, LP-PUL, and AdaPU) in classification accuracy under positive sample proportions of 0.45, 0.40, 0.35, and 0.30. These results confirm that NBEB-PUL provides an effective and robust solution for positive and unlabeled learning in the presence of noise.
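The noise filtering stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `noise_filter`, the `posterior` callable (a stand-in for the trained Ada-NBC classifier), the 0.5 decision threshold, and the toy scores are all assumptions made for illustration. The sketch also assumes that the predicted positive set is intersected with the first-stage reliable positive set and the predicted negative set with the residual unlabeled set, which is one reading of the abstract.

```python
from typing import Callable, Hashable, Set, Tuple


def noise_filter(
    reliable_pos: Set[Hashable],
    residual_unl: Set[Hashable],
    posterior: Callable[[Hashable], float],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> Tuple[Set[Hashable], Set[Hashable], Set[Hashable]]:
    """Split first-stage sets into final positives, final negatives, and noise.

    `posterior(sample)` is a hypothetical stand-in for the positive-class
    posterior probability produced by the first-stage ensemble classifier.
    """
    candidates = reliable_pos | residual_unl

    # Reclassify every first-stage sample with the ensemble classifier.
    pred_pos = {s for s in candidates if posterior(s) >= threshold}
    pred_neg = candidates - pred_pos

    # Keep only samples on which both stages agree.
    final_pos = reliable_pos & pred_pos
    final_neg = residual_unl & pred_neg

    # Everything pruned by the intersections is treated as noise.
    noise = candidates - final_pos - final_neg
    return final_pos, final_neg, noise


# Toy example: sample ids mapped to made-up posterior scores.
scores = {"a": 0.95, "b": 0.80, "c": 0.30, "d": 0.10, "e": 0.70, "f": 0.05}
reliable_pos = {"a", "b", "c"}   # first-stage reliable positives
residual_unl = {"d", "e", "f"}   # first-stage residual unlabeled samples

final_pos, final_neg, noise = noise_filter(reliable_pos, residual_unl, scores.get)
print(sorted(final_pos), sorted(final_neg), sorted(noise))
# prints: ['a', 'b'] ['d', 'f'] ['c', 'e']
```

In this toy run, sample `c` was labeled a reliable positive in the first stage but scored below the threshold, and sample `e` was left unlabeled but scored above it, so both are pruned as noise.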
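Key words
knowledge engineering / positive and unlabeled learning / labeling uncertainty / posterior probability / Bayesian classifier / ensemble learning
Classification
Information technology and security science
Cite this article
Chang Xiuying (常秀颖), Wang Xiaolan (王晓兰), Zhu Tao (朱涛), He Peng (何芃), Ou Guiliang (欧桂良), He Yulin (何玉林). NBEB-PUL: a naïve Bayesian ensemble-based algorithm for positive and unlabeled learning [J]. Journal of Shenzhen University Science and Engineering, 2026, 43(3): 338-346, 9.
Funding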
Natural Science Foundation of Guangdong Province (2023A1515011667)
Guangdong Basic and Applied Basic Research Foundation (2023B1515120020)
Science and Technology Major Project of Shenzhen (KJZD20230923114809020)