Journal of North University of China (Natural Science Edition), 2018, Vol. 39, Issue 3: 310-315, 6. DOI: 10.3969/j.issn.1673-3193.2018.03.011
An Outlier Data Mining Algorithm Using Attribute Clustering
Abstract
High-dimensional, massive categorical data sets are increasingly common in real-world applications. This paper proposes a new outlier mining algorithm based on attribute clustering. The algorithm first computes the correlations between attributes and uses them to divide the high-dimensional categorical attributes into multiple attribute subsets. It then mines outliers within each attribute subset according to an outlier score. Finally, the k data objects with the largest scores across these subsets are selected as the final outliers. The feasibility and effectiveness of the algorithm were verified on artificial data sets and UCI data sets. Experimental results show that the algorithm improves accuracy and efficiency and can be used for outlier mining in high-dimensional massive categorical data.
Keywords
high-dimensional data / categorical data / outlier mining / attribute clustering
Classification
Information Technology and Security Science
Cite this article
LI Junli (李俊丽), ZHANG Jifu (张继福). An Outlier Data Mining Algorithm Using Attribute Clustering [J]. Journal of North University of China (Natural Science Edition), 2018, 39(3): 310-315, 6.
Funding
National Science Foundation for Young Scientists of China (61602335)
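Illustrative sketch
The abstract outlines a three-step pipeline: compute attribute correlations, group the attributes into subsets, then score objects within each subset and keep the top k. The Python sketch below is one possible reading of that pipeline, not the authors' implementation. The abstract does not specify the correlation measure, the clustering procedure, or the outlier score, so everything concrete here is an assumption: normalized mutual information as the correlation, threshold-based grouping of attributes via union-find, and a rarity score (one minus the relative frequency of an object's value combination) summed over subsets. All function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(values):
    """Shannon entropy (natural log) of a sequence of categorical values."""
    n = len(values)
    return -sum((c / n) * log(c / n) for c in Counter(values).values())


def correlation(col_a, col_b):
    """Normalized mutual information; an ASSUMED stand-in for the paper's measure."""
    h_a, h_b = entropy(col_a), entropy(col_b)
    if h_a == 0 or h_b == 0:
        return 0.0  # a constant attribute shares no information with others
    h_ab = entropy(list(zip(col_a, col_b)))
    return (h_a + h_b - h_ab) / min(h_a, h_b)


def cluster_attributes(rows, threshold=0.5):
    """Group attribute indices linked by correlation >= threshold (assumed rule)."""
    m = len(rows[0])
    cols = [[r[j] for r in rows] for j in range(m)]
    parent = list(range(m))

    def find(j):
        # Union-find root lookup with path halving.
        while parent[j] != j:
            parent[j] = parent[parent[j]]
            j = parent[j]
        return j

    for a, b in combinations(range(m), 2):
        if correlation(cols[a], cols[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for j in range(m):
        groups.setdefault(find(j), []).append(j)
    return list(groups.values())


def top_k_outliers(rows, k, threshold=0.5):
    """Return indices of the k rows with the largest summed rarity score."""
    n = len(rows)
    scores = [0.0] * n
    for subset in cluster_attributes(rows, threshold):
        # Project every row onto the attribute subset and count combinations.
        keys = [tuple(r[j] for j in subset) for r in rows]
        freq = Counter(keys)
        for i, key in enumerate(keys):
            scores[i] += 1.0 - freq[key] / n  # rarer combination, higher score
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
```

Here `rows` is a list of equal-length tuples of categorical values, and `top_k_outliers(rows, k)` returns row indices ordered from highest to lowest score. Scoring value combinations inside correlated subsets, rather than over all attributes at once, is what lets a row stand out when it breaks a pattern that only a few attributes share.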