An Outlier Detection Algorithm Based on Fusion Data Self-representation

Gao Yaxing (高亚星), Zhao Xujun (赵旭俊), Cao Xuyang (曹栩阳)

Computer Technology and Development (计算机技术与发展), 2023, Vol. 33, Issue 12: 41-48, 8. DOI: 10.3969/j.issn.1673-629X.2023.12.006

An Outlier Detection Algorithm Based on Fusion Data Self-representation

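Gao Yaxing¹, Zhao Xujun¹, Cao Xuyang¹

Author information

  • 1. School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, China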

Abstract

Data self-representation can be used for outlier detection because it magnifies the differences and correlations among data points. However, existing techniques do not reflect how correlation among features affects outlier detection, so they cannot be applied to high-dimensional data. To solve this problem, an outlier detection algorithm based on fusion data self-representation is proposed, which can effectively detect outliers in high-dimensional data. First, a data self-representation method based on feature correlation is proposed; it combines mutual information and information entropy theory to measure the correlation among features of high-dimensional data and integrates that correlation into the sparse representation process among data, reflecting the complex relationships among features and data. Second, a method for computing data self-representation across fused feature groups is proposed: the self-representation matrices corresponding to different feature groups are integrated by point multiplication to form a global data self-representation matrix. Finally, an outlier detection algorithm based on fusion data self-representation is proposed: outliers are detected by a random walk on the directed weighted graph formed by the global data self-representation matrix. Experimental results show that the proposed algorithm achieves higher detection performance than the comparison algorithms on both real and synthetic datasets, which indicates that it has good generalization and stability.
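The three steps named in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the objective function, the grouping rule, or the walk parameters. The sketch therefore assumes that (a) a ridge-regularized least-squares fit stands in for the paper's sparse representation, (b) "point multiplication" means an element-wise product of the per-group matrices, (c) the feature groups are supplied by the caller rather than derived from the correlation measure, and (d) a low stationary visiting probability marks a point as outlying. All function names and the parameters `lam`, `bins`, and `damping` are hypothetical.

```python
import numpy as np


def normalized_mutual_information(x, y, bins=8):
    """Histogram estimate of I(x; y) / sqrt(H(x) * H(y)) for two features."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1)
    p_y = p_xy.sum(axis=0)

    def entropy(p):
        p = p[p > 0]  # skip empty cells, since 0 * log(0) is taken as 0
        return float(-(p * np.log(p)).sum())

    h_x, h_y = entropy(p_x), entropy(p_y)
    mi = h_x + h_y - entropy(p_xy.ravel())
    denom = np.sqrt(h_x * h_y)
    return mi / denom if denom > 0 else 0.0


def self_representation(X, lam=0.1):
    """Express each row of X as a combination of the other rows.

    Assumption: ridge regression replaces the paper's sparse coding step.
    Returns non-negative weights with a zero diagonal.
    """
    n = X.shape[0]
    G = X @ X.T  # Gram matrix between samples
    C = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        A = G[np.ix_(others, others)] + lam * np.eye(n - 1)
        C[i, others] = np.linalg.solve(A, G[others, i])
    return np.abs(C)


def fused_self_representation(X, groups, lam=0.1):
    """Element-wise product of the self-representation matrices of each group."""
    W = np.ones((X.shape[0], X.shape[0]))
    for cols in groups:
        W *= self_representation(X[:, cols], lam)
    return W


def random_walk_scores(W, damping=0.85, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a damped random walk on W."""
    n = W.shape[0]
    row_sums = W.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)
    # Rows without outgoing weight jump uniformly to any node.
    P = np.where(row_sums > 0, W / safe, 1.0 / n)
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_pi = damping * (pi @ P) + (1.0 - damping) / n
        converged = np.abs(new_pi - pi).sum() < tol
        pi = new_pi
        if converged:
            break
    return pi
```

Under these assumptions, `fused_self_representation(X, groups)` takes a samples-by-features array and a list of column-index lists, and `np.argsort(random_walk_scores(W))` lists the points from most to least outlying. `normalized_mutual_information` is included only to show one way of measuring feature correlation with mutual information and entropy; how the paper turns that measure into feature groups or weights is not stated in the abstract.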

Key words

outlier detection/data self-representation/feature grouping/information entropy/random walk

Classification

Information Technology and Security Science

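Cite this article

Gao Yaxing, Zhao Xujun, Cao Xuyang. An Outlier Detection Algorithm Based on Fusion Data Self-representation[J]. Computer Technology and Development, 2023, 33(12): 41-48, 8.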

Funding

National Natural Science Foundation of China (61572343)

National Defense Science and Technology Key Laboratory Fund (JSY6142219202114)

Applied Basic Research Program of Shanxi Province (20210302123223, 202103021224275)

Computer Technology and Development

OA, CSTPCD

ISSN 1673-629X
