Computer Engineering and Applications, 2016, Vol. 52, Issue 20: 98-102, 176, 6. DOI: 10.3778/j.issn.1002-8331.1502-0078
Two-part outlier detection algorithm based on clustering
Abstract
Most existing outlier detection algorithms require the number of outliers to be preset, and they have limited ability to detect outliers in non-uniformly distributed data sets. To address these problems, this paper proposes a two-part outlier detection algorithm based on clustering. The algorithm first uses the DBSCAN clustering algorithm to produce a set of suspected outliers. A pruning strategy is then applied to the data set, and an outlier detection algorithm based on an improved distance is used to produce a ranked set of the points most likely to be outliers. Finally, the outlier set is determined by the intersection of the two sets. The algorithm does not need the number of outliers to be preset, achieves higher accuracy and detection efficiency, and is not sensitive to the distribution of the data set. Experimental results on data sets show that the algorithm can effectively and accurately identify outliers.
Keywords
outlier detection / distance / DBSCAN algorithm / pruning
Classification
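Information Technology and Security Science
Cite this article
Ren Jianhua, Gao Liming. Two-part outlier detection algorithm based on clustering [J]. Computer Engineering and Applications, 2016, 52(20): 98-102, 176, 6.
Funding
National Youth Science Foundation project (No. 61003162); Liaoning Provincial Department of Education project (No. L2013131).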
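The two-part procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, because the abstract specifies neither the pruning strategy nor the improved distance. The sketch makes three assumptions: pruning removes DBSCAN core points, the distance score is the plain mean distance to the k nearest neighbours (a stand-in for the improved distance), and the ranked set keeps as many points as there are suspects. All function names and parameter values (`eps`, `min_pts`, `k`) are hypothetical.

```python
import math

def neighbors(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan_roles(points, eps, min_pts):
    """Return (core, noise) index sets following the DBSCAN definitions."""
    hoods = [neighbors(points, i, eps) for i in range(len(points))]
    core = {i for i, h in enumerate(hoods) if len(h) >= min_pts}
    # Noise: not a core point and not within eps of any core point.
    noise = {i for i, h in enumerate(hoods)
             if i not in core and not core.intersection(h)}
    return core, noise

def knn_score(points, i, k):
    """Mean distance to the k nearest other points (stand-in distance score)."""
    dists = sorted(math.dist(points[i], q)
                   for j, q in enumerate(points) if j != i)
    return sum(dists[:k]) / k

def two_part_outliers(points, eps, min_pts, k):
    # Part 1: DBSCAN noise points form the suspected outlier set.
    core, suspects = dbscan_roles(points, eps, min_pts)
    # Pruning (assumed): points in dense regions are not scored at all.
    candidates = [i for i in range(len(points)) if i not in core]
    # Part 2: rank the remaining points by distance score, largest first.
    ranked = sorted(candidates, key=lambda i: knn_score(points, i, k),
                    reverse=True)
    # Assumed cut-off: keep as many ranked points as there are suspects,
    # so no outlier count has to be supplied by the user.
    top = set(ranked[:len(suspects)])
    # Final answer: intersection of the two sets.
    return sorted(suspects & top)

# Two dense 3x3 grids, one border point, and two isolated points.
grid_a = [(float(x), float(y)) for x in range(3) for y in range(3)]
grid_b = [(float(x), float(y)) for x in range(10, 13) for y in range(10, 13)]
points = grid_a + grid_b + [(3.2, 1.0), (5.0, 5.0), (20.0, 0.0)]

result = two_part_outliers(points, eps=1.5, min_pts=4, k=3)
print(result)  # [19, 20]
```

In this toy data set the border point at index 18 survives pruning but is not flagged, since DBSCAN does not label it as noise and its distance score is the lowest of the three candidates. The brute-force neighbour search is quadratic in the number of points and is used only to keep the sketch self-contained.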