
Classification of Imbalanced Datasets with an Improved Weighted KNN Algorithm

王超学 潘正茂 马春森 董丽丽 张涛

Computer Engineering (计算机工程), 2012, Vol. 38, Issue 20: 160-163, 168, 5. DOI: 10.3969/j.issn.1000-3428.2012.20.041

Classification for Imbalanced Dataset of Improved Weighted KNN Algorithm

王超学 1, 潘正茂 1, 马春森 2, 董丽丽 1, 张涛 1

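Author information

  • 1. School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055
  • 2. Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193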

Abstract

Based on an analysis of the shortcomings of the K-Nearest Neighbor (KNN) algorithm on imbalanced classification problems, a novel weighted KNN approach, GAK-KNN, is presented. The key to GAK-KNN is a new weight assignment model that accounts for the adverse effects of uneven training-sample distribution both between classes and within classes. The procedure has three steps: (1) cluster the training set with a K-means algorithm based on a Genetic Algorithm (GA); (2) compute a weight for each training sample from the clustering results and the weight assignment model; (3) classify the test samples with the improved KNN algorithm. GAK-KNN can significantly improve the identification rate of minority-class samples as well as overall classification performance. Theoretical analysis and comprehensive experiments on UCI datasets support these claims.
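The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions. The abstract does not give the weight assignment model, so the formula in the hypothetical `sample_weights` (weight inversely proportional to the number of classes, the number of clusters found in the sample's class, and the size of the sample's cluster) is only a stand-in. Plain Lloyd's K-means replaces the GA-based K-means, and clustering each class separately is also an assumption. All function names and parameters are illustrative.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; a stand-in for the paper's GA-based K-means."""
    rng = np.random.default_rng(seed)
    # Start from k distinct training points.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels

def sample_weights(X, y, n_clusters=2):
    """Hypothetical weight model (assumption, not the paper's formula).

    Each class is clustered separately; a sample in a cluster of size n gets
    weight N / (C * F * n), where N is the training-set size, C the number of
    classes, and F the number of non-empty clusters found in its class.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    w = np.empty(len(X))
    for c in classes:
        idx = np.flatnonzero(y == c)
        labels = kmeans_labels(X[idx], min(n_clusters, len(idx)))
        found, counts = np.unique(labels, return_counts=True)
        size_of = dict(zip(found, counts))
        for i, g in zip(idx, labels):
            w[i] = len(X) / (len(classes) * len(found) * size_of[g])
    return w

def weighted_knn_predict(X_train, y_train, w, x, k=5):
    """Classify x by summing the weights of its k nearest neighbors per class."""
    dist = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dist)[:k]
    classes = np.unique(y_train)
    votes = [w[nearest][y_train[nearest] == c].sum() for c in classes]
    return classes[int(np.argmax(votes))]

# Synthetic imbalanced data: 40 majority samples (class 0), 5 minority (class 1).
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 0.5, (40, 2)), rng.normal(3.0, 0.5, (5, 2))])
y_train = np.array([0] * 40 + [1] * 5)
weights = sample_weights(X_train, y_train)
prediction = weighted_knn_predict(X_train, y_train, weights, np.array([3.0, 3.0]))
```

Under this stand-in model every class carries the same total weight, so a minority neighbor counts for more than a majority neighbor in the vote, and samples in small clusters are weighted up within their class. The paper's actual model may balance these effects differently.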

Key words

imbalanced dataset / classification / K-Nearest Neighbor (KNN) algorithm / weight assignment model / Genetic Algorithm (GA) / K-means algorithm

Subject category

Information Technology and Security Science

Cite this article

王超学, 潘正茂, 马春森, 董丽丽, 张涛. Classification for Imbalanced Dataset of Improved Weighted KNN Algorithm [J]. Computer Engineering, 2012, 38(20): 160-163, 168, 5.

Funding

National Natural Science Foundation of China (31170393)

Natural Science Foundation of Shaanxi Province (2012JM8023)

Special Natural Science Fund of the Shaanxi Provincial Department of Education (12JK0726)

Journal: Computer Engineering (计算机工程)

Indexing: OA, CSCD, CSTPCD

ISSN: 1000-3428
