Computer & Digital Engineering (计算机与数字工程), Issue (9): 2155-2160, 6. DOI: 10.3969/j.issn.1672-9722.2019.09.010
Under-sampling Method Research in Class-Imbalanced Data
Abstract
Imbalanced datasets seriously degrade the performance of classifiers in machine learning. This paper proposes an under-sampling method based on the Gaussian mixture model. The method fits a Gaussian mixture model to the negative (majority-class) data, then under-samples proportionally within probability intervals that describe how the data are distributed over each Gaussian component. It balances the classes by reducing the number of samples that belong to the majority classes while maintaining the data distribution of those classes. Experimental results on six UCI imbalanced datasets show that the under-sampling method effectively improves classification performance on imbalanced data.
Keywords
imbalanced learning / under-sampling / Gaussian mixture model / machine learning
Classification
Information Technology and Security Science
Cite this article
Zhou Jianwei (周建伟). Under-sampling Method Research in Class-Imbalanced Data [J]. Computer & Digital Engineering, 2019, (9): 2155-2160, 6.
Funding
Supported by the National Natural Science Foundation of China (Grant No. 61772273).
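Illustrative sketch

The under-sampling idea summarized in the abstract can be illustrated with a short example. This is a minimal sketch under stated assumptions, not the paper's algorithm: the abstract does not specify how the probability intervals are built, so the sketch substitutes a simpler rule (assign each majority sample to its most likely Gaussian component, then draw from each component in proportion to its share of the data). The data are one-dimensional, the EM fit is hand-rolled with the standard library only, and the names `fit_gmm_1d`, `assign_component`, and `gmm_undersample` are hypothetical.

```python
import math
import random


def fit_gmm_1d(xs, k, iters=100, seed=0):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    rng = random.Random(seed)
    n = len(xs)
    mean_all = sum(xs) / n
    var_all = max(sum((x - mean_all) ** 2 for x in xs) / n, 1e-6)
    weights = [1.0 / k] * k
    means = rng.sample(xs, k)  # initial means: k data points picked at random
    variances = [var_all] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in xs:
            dens = [w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens] if total > 0 else [1.0 / k] * k)
        # M-step: re-estimate the weight, mean and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj, 1e-6)
    return weights, means, variances


def assign_component(x, weights, means, variances):
    """Index of the component with the highest weighted log-density at x."""
    scores = [math.log(w) - 0.5 * math.log(v) - (x - m) ** 2 / (2 * v)
              for w, m, v in zip(weights, means, variances)]
    return scores.index(max(scores))


def gmm_undersample(majority, n_keep, k=2, seed=0):
    """Keep n_keep majority samples, drawn per component in proportion to its size.

    Assumption: hard assignment plus proportional quotas stands in for the
    paper's probability-interval scheme, which the abstract does not detail.
    """
    params = fit_gmm_1d(majority, k, seed=seed)
    groups = [[] for _ in range(k)]
    for x in majority:
        groups[assign_component(x, *params)].append(x)
    # Largest-remainder allocation so the per-component quotas sum to n_keep.
    exact = [n_keep * len(g) / len(majority) for g in groups]
    quotas = [int(e) for e in exact]
    by_remainder = sorted(range(k), key=lambda j: exact[j] - quotas[j], reverse=True)
    for j in by_remainder[: n_keep - sum(quotas)]:
        quotas[j] += 1
    rng = random.Random(seed)
    kept = []
    for g, q in zip(groups, quotas):
        kept.extend(rng.sample(g, q))  # sample without replacement inside a component
    return kept


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Synthetic majority class made of two clusters of different sizes.
    majority = ([data_rng.gauss(0, 1) for _ in range(300)]
                + [data_rng.gauss(8, 1) for _ in range(100)])
    kept = gmm_undersample(majority, n_keep=40)
    print(len(kept))
```

Because each component contributes in proportion to its size, the retained subset roughly preserves the shape of the majority-class distribution, which is the property the abstract emphasizes. For multivariate data, a library mixture model such as scikit-learn's `GaussianMixture` would normally replace the hand-rolled EM used here.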