Outlier Detection Based on Clustering and KDE Hypothesis Testing

数据采集与处理 (Journal of Data Acquisition and Processing), 2017, Vol. 32, Issue 5: 997-1004 (8 pages). DOI: 10.16337/j.1004-9037.2017.05.017

周春蕾¹, 田品卓², 杨晨琛¹, 王皓¹

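Author information

1. State Key Laboratory for Novel Software Technology, Nanjing University (南京大学计算机软件新技术国家重点实验室), Nanjing 210023
2. Jiangsu Frontier Electric Technology Co., Ltd. (江苏方天电力技术有限公司), Nanjing 211102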

Abstract

Outlier detection is a core problem in data mining and is widely used in industrial production. An accurate and efficient outlier detection method can reflect the condition of an industrial system in time, providing a reference for the relevant personnel. Traditional outlier detection algorithms cannot efficiently detect outliers in data that have complicated change patterns, a small range of variation, and the characteristics of streaming data. This paper proposes a new method for detecting outliers. First, the data are partitioned into several categories by clustering. Data in the same category share common characteristics, so they are assumed to follow the same distribution, which is simpler to fit than the distribution of the whole data set. The original complex data distribution can thus be decomposed into several simple distributions. Second, kernel density estimation (KDE) hypothesis testing is used to detect abnormal values. Experiments on UCI data sets and real industrial data show that the proposed method is more efficient than traditional methods.
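The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the clustering algorithm, the kernel, the bandwidth, or the test statistic, so everything below is an assumption made for illustration. The sketch uses one-dimensional data, a plain k-means with quantile initialisation, a Gaussian kernel with the normal-reference bandwidth 1.06·σ·n^(-1/5), and a leave-one-out two-sided tail probability under the cluster's KDE as the p-value. The function names `kmeans_1d`, `kde_p_value`, and `detect_outliers`, and the significance level `alpha=0.05`, are hypothetical.

```python
import math
import statistics


def kmeans_1d(values, k, iters=50):
    """Lloyd's k-means on scalars with deterministic quantile initialisation."""
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_p_value(x, sample):
    """Two-sided tail probability of x under a Gaussian KDE fitted to sample."""
    sigma = statistics.stdev(sample)
    if sigma == 0.0:
        # Degenerate cluster (all values equal): assumed handling for this sketch.
        return 1.0 if x == sample[0] else 0.0
    h = 1.06 * sigma * len(sample) ** -0.2  # normal-reference bandwidth (assumed)
    # The CDF of a Gaussian KDE is the average of the kernels' normal CDFs.
    cdf = sum(normal_cdf((x - s) / h) for s in sample) / len(sample)
    return 2.0 * min(cdf, 1.0 - cdf)


def detect_outliers(values, k, alpha=0.05):
    """Return indices whose within-cluster KDE p-value falls below alpha."""
    labels = kmeans_1d(values, k)
    flagged = []
    for i, (x, lab) in enumerate(zip(values, labels)):
        # Leave-one-out: the tested point does not support its own density.
        rest = [v for j, v in enumerate(values) if labels[j] == lab and j != i]
        if len(rest) < 2:
            continue  # too few points to estimate a density
        if kde_p_value(x, rest) < alpha:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Two tight groups plus one value (14.0) that sits far from its nearest group.
    data = [9.0 + 0.1 * i for i in range(21)]
    data += [49.0 + 0.1 * i for i in range(21)]
    data += [14.0]
    print(detect_outliers(data, k=2))
```

Testing each point only against its own cluster is what lets a value such as 14.0 stand out: relative to the whole data set it lies between the two groups, but relative to the cluster around 10 it falls deep in the tail of the estimated density. As with any test at level `alpha`, some ordinary points in a cluster's tail can also be flagged on noisier data.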

Key words

outlier detection; clustering; hypothesis testing; kernel density estimation

Classification

Information Technology and Security Science

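Cite this article

周春蕾, 田品卓, 杨晨琛, 王皓. Outlier Detection Based on Clustering and KDE Hypothesis Testing [J]. 数据采集与处理 (Journal of Data Acquisition and Processing), 2017, 32(5): 997-1004.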

Funding

National Natural Science Foundation of China (61503178)

Natural Science Foundation of Jiangsu Province (BK20150587)

Journal: 数据采集与处理 (Journal of Data Acquisition and Processing)

Indexing: OA, 北大核心 (PKU Core), CSCD, CSTPCD

ISSN: 1004-9037
