
A data stream classification method based on distance and sampling mechanisms (一种基于距离和采样机制的数据流分类方法)

Hu Xuegang (胡学钢), He Junhong (何俊宏), Li Peipei (李培培)

计算机应用研究 (Application Research of Computers), 2018, Vol. 35, Issue 4: 992-995, 1000, 5. DOI: 10.3969/j.issn.1001-3695.2018.04.007

Data streams classification approach based on distance and sampling

Author information

  • 1. School of Computer and Information, Hefei University of Technology, Hefei 230009, China (合肥工业大学计算机与信息学院)

Abstract

Data stream classification is widely used in sensor networks, network monitoring, and other real-world applications. However, class imbalance and missing labels in data streams make classification considerably harder. This paper therefore proposes an ensemble classification method based on distance evaluation and sampling for incompletely labeled data streams with imbalanced class distributions. The method first calculates the distance between the unlabeled data and the center point of the labeled data chunks in order to partition the unlabeled data into positive and negative instances. Second, to balance the class distribution of the current data chunk, it reconstructs the chunk by over-sampling positive instances and under-sampling negative instances, and then uses the reconstructed chunk to build an ensemble classification model. Experiments on simulated incompletely labeled data streams with class imbalance show that, compared with a classical algorithm of the same type, the proposed method improves classification accuracy while reducing the influence of the imbalanced class distribution.
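
The two steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the exact distance measure, sampling ratios, or base learner, so the code below assumes Euclidean distance, one center per class, a common target size halfway between the two class counts, and a nearest-center base classifier combined by majority vote. All function and class names (`centroid`, `assign_labels`, `rebalance`, `CentroidClassifier`, `process_chunk`, `ensemble_predict`) are hypothetical.

```python
import math
import random


def centroid(points):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def assign_labels(unlabeled, pos_center, neg_center):
    """Partition unlabeled instances by the nearer class center (assumed rule)."""
    pos, neg = [], []
    for x in unlabeled:
        if math.dist(x, pos_center) <= math.dist(x, neg_center):
            pos.append(x)
        else:
            neg.append(x)
    return pos, neg


def rebalance(pos, neg, rng):
    """Over-sample positives with replacement and under-sample negatives
    without replacement. The halfway target size is an assumption."""
    if not pos or not neg:
        return pos, neg  # nothing to balance against
    target = (len(pos) + len(neg)) // 2
    extra = max(0, target - len(pos))
    new_pos = pos + [rng.choice(pos) for _ in range(extra)]
    new_neg = rng.sample(neg, min(len(neg), target))
    return new_pos, new_neg


class CentroidClassifier:
    """Stand-in base learner: predicts the class whose center is nearer."""

    def __init__(self, pos, neg):
        self.pos_center = centroid(pos)
        self.neg_center = centroid(neg)

    def predict(self, x):
        d_pos = math.dist(x, self.pos_center)
        d_neg = math.dist(x, self.neg_center)
        return 1 if d_pos <= d_neg else 0


def process_chunk(labeled_pos, labeled_neg, unlabeled, rng):
    """Label the unlabeled part of a chunk, rebalance it, and train one
    ensemble member. Assumes both labeled classes are non-empty."""
    pos_c, neg_c = centroid(labeled_pos), centroid(labeled_neg)
    u_pos, u_neg = assign_labels(unlabeled, pos_c, neg_c)
    pos, neg = rebalance(labeled_pos + u_pos, labeled_neg + u_neg, rng)
    return CentroidClassifier(pos, neg), len(pos), len(neg)


def ensemble_predict(members, x):
    """Majority vote over ensemble members; ties go to the positive class."""
    votes = sum(m.predict(x) for m in members)
    return 1 if 2 * votes >= len(members) else 0
```

In a streaming setting, `process_chunk` would be called once per arriving chunk and the returned classifier appended to a bounded list of ensemble members; how members are weighted or retired is not stated in the abstract and is left out here.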

Key words

classification / ensemble learning / class imbalance / label missing

Subject category

Information Technology and Security Science

Cite this article

胡学钢, 何俊宏, 李培培. 一种基于距离和采样机制的数据流分类方法 [J]. 计算机应用研究 (Application Research of Computers), 2018, 35(4): 992-995, 1000, 5.

Funding

National Key Research and Development Program of China (2016YFC0801406)

National Natural Science Foundation of China, Young Scientists Fund (61503112)

National Natural Science Foundation of China (61673152)

计算机应用研究 (Application Research of Computers)

OA / 北大核心 (Peking University Core Journals) / CSCD / CSTPCD

ISSN 1001-3695
