南京大学学报(自然科学版), 2024, Vol. 60, Issue 1: 106-117, 12. DOI: 10.13232/j.cnki.jnju.2024.01.011
不完备数据集的邻域容差互信息选择集成分类算法
Neighborhood-tolerance mutual information selection ensemble classification algorithm for incomplete data sets
Abstract
To address the classification problem of incomplete mixed information systems, the concept of neighborhood-tolerance mutual information is defined by combining neighborhood-tolerance with mutual information theory in granular computing, and a selective ensemble classification algorithm based on neighborhood-tolerance mutual information is proposed using ensemble learning. In this algorithm, information granules are obtained according to the missing attributes, and the space is constructed by dividing the granules into different layers. New base classifiers are constructed by integrating BP neural networks, used as base classifiers, on the different layers. Then, the neighborhood-tolerance mutual information with respect to the class attribute is calculated according to the missing attributes of each information granule to measure the importance of that granule, and the weight of each base classifier is redefined according to its prediction accuracy and the neighborhood-tolerance mutual information. Finally, for the samples to be predicted, the weighted ensemble prediction results of the base classifiers are analyzed and compared with traditional ensemble classification algorithms. For some incomplete mixed data sets, the proposed ensemble classification algorithm can effectively improve classification accuracy.
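The pipeline summarized in the abstract (granules from missing-attribute patterns, layers, one base classifier per granule, weighted voting) can be illustrated with a minimal sketch. Several parts are assumptions, not the authors' method: the abstract does not give the formula for neighborhood-tolerance mutual information, so the sketch takes precomputed importance scores as a hypothetical `importance` argument; the weight is assumed to be accuracy times importance; accuracy is measured on the training samples; attributes are numeric only; and a nearest-centroid learner stands in for the BP neural network to keep the example dependency-free. All function and class names are hypothetical.

```python
from collections import defaultdict

MISSING = None  # assumed marker for a missing attribute value


def missing_pattern(sample):
    """Indices of the attributes that are missing in a sample."""
    return frozenset(i for i, v in enumerate(sample) if v is MISSING)


def build_granules(X):
    """Group sample indices by missing-attribute pattern (one granule each)."""
    granules = defaultdict(list)
    for idx, sample in enumerate(X):
        granules[missing_pattern(sample)].append(idx)
    return dict(granules)


def build_layers(granules):
    """Arrange granule patterns into layers by number of missing attributes."""
    layers = defaultdict(list)
    for pattern in granules:
        layers[len(pattern)].append(pattern)
    return dict(layers)


class CentroidClassifier:
    """Stand-in base learner (NOT a BP network): nearest class centroid,
    computed only on the attributes that the granule observes."""

    def __init__(self, pattern, n_attrs):
        self.attrs = [i for i in range(n_attrs) if i not in pattern]
        self.centroids = {}

    def usable(self, sample):
        return all(sample[i] is not MISSING for i in self.attrs)

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for sample, label in zip(X, y):
            if not self.usable(sample):
                continue
            acc = sums.setdefault(label, [0.0] * len(self.attrs))
            for k, i in enumerate(self.attrs):
                acc[k] += sample[i]
            counts[label] += 1
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict(self, sample):
        vec = [sample[i] for i in self.attrs]
        return min(
            self.centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, self.centroids[c])),
        )


def train_ensemble(X, y, importance):
    """One base classifier per granule; weight = accuracy * importance (assumed)."""
    n_attrs = len(X[0])
    ensemble = []
    for pattern in build_granules(X):
        if len(pattern) == n_attrs:
            continue  # no observed attributes to learn from
        clf = CentroidClassifier(pattern, n_attrs).fit(X, y)
        usable = [(s, t) for s, t in zip(X, y) if clf.usable(s)]
        accuracy = sum(clf.predict(s) == t for s, t in usable) / len(usable)
        ensemble.append((clf, accuracy * importance.get(pattern, 1.0)))
    return ensemble


def predict(ensemble, sample):
    """Weighted vote over the base classifiers that can handle the sample."""
    votes = defaultdict(float)
    for clf, weight in ensemble:
        if clf.usable(sample):
            votes[clf.predict(sample)] += weight
    return max(votes, key=votes.get) if votes else None
```

A caller would build `X` as a list of numeric rows with `None` for missing values, supply an importance score per missing-attribute pattern, and call `predict(train_ensemble(X, y, importance), sample)`. Only classifiers whose required attributes are all present in the sample take part in the vote.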
Key words
incomplete hybrid information system / neighborhood-tolerance mutual information / ensemble learning / classification
Category
Information technology and security science
Cite this article
李丽红, 董红瑶, 刘文杰, 李宝霖, 代琪. 不完备数据集的邻域容差互信息选择集成分类算法[J]. 南京大学学报(自然科学版), 2024, 60(1): 106-117, 12.
Funding
Hebei Key Laboratory of Data Science and Application project (10120201); Tangshan Key Laboratory of Data Science project (10120301)