|国家科技期刊平台
首页|期刊导航|计算机科学与探索|平滑非负低秩图表示聚类算法

平滑非负低秩图表示聚类算法OA北大核心CSTPCD

Smooth Non-negative Low-Rank Graph Representation for Clustering

中文摘要英文摘要

针对现有低秩图表示算法在构建表示图时未能精确捕获数据的全局表示结构、未能充分利用数据有效信息指导表示图的构建以及构建的表示图不具有适于聚类的连通结构等问题,提出了平滑非负低秩图表示聚类算法(SNLRR).SNLRR采用一种更符合矩阵秩特性的对数行列式函数代替核范数平滑地估计秩,有效降低矩阵较大奇异值对秩估计的影响,平衡了所有奇异值对秩估计的贡献比重,增强秩估计的准确性,从而更精准地捕获数据的全局表示结构.为了更加准确地捕获数据局部表示结构,SNLRR引入距离正则项为每个数据点自适应地分配最优近邻学习表示矩阵.此外,SNLRR对表示矩阵的拉普拉斯矩阵施加秩约束,使最终学习到的表示图具有与簇个数相同数量的连通分量,即表示图具有适于聚类的连通结构.与八个对比算法在七个高维且分布复杂的数据集上的实验结果显示,SNLRR算法的聚类性能均优于八种对比算法,Accuracy平均提高了0.207 3,NMI平均提高了0.175 8.因此,SNLRR是一个能够有效处理维度高且分布复杂数据的图表示聚类算法.

The existing low-rank graph representation algorithms fail to capture the global representation structure of data accurately,and cannot make full use of the valid information of data to guide the construction of the repre-sentation graph,then the constructed representation graph does not have a connected structure suitable for cluster-ing.A smooth non-negative low-rank graph representation method for clustering(SNLRR)is proposed to solve these problems.To more accurately capture the global representation structure of data,SNLRR uses a logarithmic determinant function that is more consistent with the rank characteristics of the matrix to replace the kernel norm to estimate the rank function smoothly,which can effectively reduce the impact of larger singular values of the matrix on the rank estimation,balance the contribution of all singular values to the rank estimation,enhance the accuracy of the rank estimation,so as to more accurately capture the global representation structure of the data.The distance regularization term is also introduced to adaptively assign the optimal nearest neighbor learning representation ma-trix for each data point to capture the local representation structure of data.Besides,SNLRR applies rank constraint on the Laplace matrix of representation matrix so that the learned representation graph has the same number of con-nected components as the real number of clusters,that is,the resulting representation graph has a interconnected structure suitable for clustering.Experimental results on seven datasets with high dimensions and complex distribu-tion,using eight comparison algorithms,show that the clustering performance of SNLRR algorithm is better than that of the eight comparison algorithms,with an average increase of 0.2073 in accuracy and 0.1758 in NMI.There-fore,SNLRR is a graph representation clustering algorithm that can effectively handle data with high dimensions and complex distribution.

钱罗雄;陈梅;张弛;张锦宏;马学艳

兰州交通大学 电子与信息工程学院,兰州 730070

计算机与自动化

聚类低秩表示秩约束对数行列式低秩

clusteringlow-rank representationrank constraintlogarithmic determinant low rank

《计算机科学与探索》 2024 (003)

659-673 / 15

国家自然科学基金(62266029);甘肃省重点研发计划(21YF5GA053);甘肃省高等学校产业支撑计划项目(2022CYZC-36).This work was supported by the National Natural Science Foundation of China(62266029),the Key Research and Development Pro-gram of Gansu Province(21YF5GA053),and the Industry Support Program of Higher Education of Gansu Province(2022CYZC-36).

10.3778/j.issn.1673-9418.2212041

评论