
A positive and unlabeled learning algorithm based on a naïve Bayesian ensemble


Journal of Shenzhen University Science and Engineering (深圳大学学报(理工版)), 2026, Vol. 43, Issue (3): 338-346, 9. DOI: 10.3724/SP.J.1249.2026.03338


NBEB-PUL: a naïve Bayesian ensemble-based algorithm for positive and unlabeled learning

Chang Xiuying (常秀颖)¹, Wang Xiaolan (王晓兰)², Zhu Tao (朱涛)³, He Peng (何芃)⁴, Ou Guiliang (欧桂良)⁵, He Yulin (何玉林)⁵

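Author information

  • 1. Huawei ICT College, Cangzhou Vocational and Technical College, Cangzhou 061001, Hebei
  • 2. Department of Information Engineering, Cangzhou Vocational and Technical College, Cangzhou 061001, Hebei
  • 3. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, Guangdong; Guangdong Laboratory of Artificial Intelligence and Digital Economy (Shenzhen), Shenzhen 518107, Guangdong
  • 4. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, Guangdong
  • 5. Guangdong Laboratory of Artificial Intelligence and Digital Economy (Shenzhen), Shenzhen 518107, Guangdong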

Abstract

Existing ensemble-based positive and unlabeled learning (PUL) methods often fail to adequately account for the impact of noise in both positive and unlabeled samples during classifier construction. To address this issue, this paper proposes a naïve Bayesian ensemble-based PUL algorithm (NBEB-PUL), which consists of two stages: label assignment and noise filtering. In the label assignment stage, NBEB-PUL employs naïve Bayesian classifiers as base learners and integrates them using the AdaBoost ensemble strategy to construct a strong classifier, named Ada-NBC. This strong classifier is then used to compute the mean posterior probabilities of a validation set for unlabeled samples, enabling explicit modeling of labeling uncertainty. Based on low-uncertainty predictions, a set of reliable positive samples is iteratively identified, along with a residual unlabeled sample set. In the noise filtering stage, NBEB-PUL uses the ensemble classifier generated in the first stage to reclassify the reliable positive set and the residual unlabeled set, yielding Ada-NBC-predicted positive and negative sample sets. These sets are then intersected with the first-stage positive and unlabeled sets to extract overlapping high-confidence samples, which form the final positive and negative sample sets. The samples pruned from the dataset during this process are identified as noise. The feasibility, rationality, and effectiveness of NBEB-PUL were validated on 23 benchmark datasets from UCI and KEEL. Experimental results demonstrate that the algorithm converges stably as the number of training iterations increases. Moreover, NBEB-PUL outperforms six state-of-the-art PUL algorithms (S-EM, Biased-SVM, Modified-PUL, PU-LP, LP-PUL, and AdaPU) in classification accuracy under positive sample proportions of 0.45, 0.40, 0.35, and 0.30. These results confirm that NBEB-PUL provides an effective and robust solution for positive and unlabeled learning in the presence of noise.
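The noise filtering stage described in the abstract can be illustrated with plain set operations. The following is only a minimal sketch of one possible reading, not the authors' implementation: the function names `mean_posterior` and `filter_noise`, the unweighted averaging over ensemble members, the 0.5 decision threshold, and the toy posteriors are all assumptions introduced here for illustration. The abstract itself does not specify these details.

```python
from statistics import mean


def mean_posterior(member_posteriors):
    """Average each sample's P(positive | x) over the ensemble members.

    Assumption: an unweighted mean; a boosted ensemble may weight members.
    """
    return {sid: mean(probs) for sid, probs in member_posteriors.items()}


def filter_noise(reliable_pos, residual_unl, posterior, threshold=0.5):
    """Reclassify both first-stage sets, then keep only the overlaps."""
    candidates = reliable_pos | residual_unl
    # Ensemble-predicted positive and negative sets (threshold is an assumption).
    pred_pos = {sid for sid in candidates if posterior[sid] >= threshold}
    pred_neg = candidates - pred_pos
    # Samples on which the first-stage set and the reclassification agree.
    final_pos = reliable_pos & pred_pos
    final_neg = residual_unl & pred_neg
    # Everything pruned by the two intersections is treated as noise.
    noise = candidates - final_pos - final_neg
    return final_pos, final_neg, noise


# Hypothetical toy data: per-member posteriors for four sample ids.
member_posteriors = {
    "a": [0.90, 0.80, 0.95],  # first-stage reliable positive, predicted positive
    "b": [0.30, 0.20, 0.40],  # first-stage reliable positive, predicted negative
    "c": [0.10, 0.20, 0.15],  # residual unlabeled, predicted negative
    "d": [0.70, 0.90, 0.80],  # residual unlabeled, predicted positive
}
reliable_pos = {"a", "b"}
residual_unl = {"c", "d"}

posterior = mean_posterior(member_posteriors)
final_pos, final_neg, noise = filter_noise(reliable_pos, residual_unl, posterior)
print(sorted(final_pos), sorted(final_neg), sorted(noise))
# ['a'] ['c'] ['b', 'd']
```

In this toy run, samples `b` and `d` are pruned because the reclassification disagrees with the set they were placed in during the first stage, which mirrors the abstract's statement that pruned samples are identified as noise.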

Keywords

knowledge engineering / positive and unlabeled learning / labeling uncertainty / posterior probability / Bayesian classifier / ensemble learning

Classification

Information technology and security science

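Cite this article

Chang Xiuying, Wang Xiaolan, Zhu Tao, He Peng, Ou Guiliang, He Yulin. NBEB-PUL: a naïve Bayesian ensemble-based algorithm for positive and unlabeled learning [J]. Journal of Shenzhen University Science and Engineering, 2026, 43(3): 338-346, 9.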

Funding

Natural Science Foundation of Guangdong Province (2023A1515011667)

Guangdong Basic and Applied Basic Research Foundation, Guangdong-Shenzhen Joint Fund Key Project (2023B1515120020)

Science and Technology Major Project of Shenzhen (KJZD20230923114809020)

Journal of Shenzhen University Science and Engineering (深圳大学学报(理工版))

ISSN 1000-2618
