Computer Engineering and Applications, 2019, Vol. 55, Issue 4: 1-16. DOI: 10.3778/j.issn.1002-8331.1810-0420
Survey on Imbalanced Data Mining Methods
Abstract
Classification algorithms have made great progress in recent years. However, as data sources continue to expand, most of the data collected are imbalanced. Standard classification algorithms are usually sensitive to class imbalance, which makes classifying such data difficult. Current imbalanced data mining methods fall into two main groups: data preprocessing methods and mining algorithms designed for imbalanced data. This paper summarizes both groups and organizes recent work along three dimensions: data preprocessing, algorithms, and performance evaluation. It then describes the imbalanced data problems that arise in different application fields, together with the research and solutions proposed by scholars in those fields. Finally, it analyzes the open problems in imbalanced data mining and outlines directions for future research.
Keywords
imbalanced data / sampling / clustering methods / ensemble methods / cost-sensitive / performance evaluation
Classification
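Information Technology and Security Science
Cite this article
XIANG Hongxin, YANG Yun. Survey on imbalanced data mining methods[J]. Computer Engineering and Applications, 2019, 55(4): 1-16.
Funding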
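National Natural Science Foundation of China (No. 61663046, No. 61876166)
Yunnan Provincial Applied Basic Research Program (No. 2016FB104)
Yunnan Provincial Reserve Talent Program for Young and Middle-aged Academic and Technical Leaders (No. 2017HB005)
Yunnan Provincial Innovation Team Project (No. 2017HC012)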