Journal of Frontiers of Computer Science and Technology, 2024, Vol. 18, Issue 5: 1243-1258, 16. DOI: 10.3778/j.issn.1673-9418.2212038
Clustering Multivariate Time Series Data Based on Shape Extraction with Compactness Constraint
Abstract
To address the naturalness and structural complexity of multivariate time series (MTS) data, and the inability of existing algorithms to accurately identify clusters in high-dimensional time series data, this paper proposes C-Shape, a shape-extraction multivariate time series clustering algorithm under compactness constraints. First, C-Shape applies largest triangle three buckets down-sampling to the complex MTS so that fewer data points are used while the original shape is preserved. The compactness between the raw data and the processed data is then calculated to ensure that the reduced dimensionality is reasonable. Next, new cluster centers are obtained by shape extraction, which preserves the shape integrity of the data, and the final clusters are formed by iteration. By fully taking into account the similarity between the shapes of the processed data and the raw data, C-Shape avoids the difficulty, common to traditional down-sampling algorithms, of choosing the reduced dimensionality. To validate its performance, C-Shape is compared with two classical and seven recent time series clustering algorithms on eight normal and four imbalanced MTS datasets with dimensions ranging from tens to thousands. Experimental results show that C-Shape outperforms all nine baseline algorithms in clustering, with an average improvement of 16.33% in Rand index and an average improvement of 69.71% in time performance. Thus C-Shape is an accurate and efficient multivariate time series clustering algorithm.
Keywords
multivariate time series clustering/down-sampling/similarity measurement/shape extraction/time series compactness
Classification
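Information Technology and Security Science
Cite this article
ZHANG Chi, CHEN Mei, ZHANG Jinhong. Clustering multivariate time series data based on shape extraction with compactness constraint[J]. Journal of Frontiers of Computer Science and Technology, 2024, 18(5): 1243-1258, 16.
Funding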
This work was supported by the National Natural Science Foundation of China (62266029), the Key Research and Development Program of Gansu Province (21YF5GA053), and the Higher Education Industry Support Program of Gansu Province (2022CYZC-36).
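Illustrative sketch
The abstract names largest triangle three buckets as the down-sampling step. The block below is a minimal sketch of the generic largest triangle three buckets procedure for a single univariate series, not the C-Shape implementation. The function name `lttb`, the `(x, y)` tuple representation, and the parameter `n_out` are assumptions made for illustration. How C-Shape applies the step across the variables of an MTS, and how it computes compactness, is not specified in the abstract and is not modeled here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def lttb(points: List[Point], n_out: int) -> List[Point]:
    """Down-sample (x, y) points, sorted by x, to n_out points.

    Generic largest triangle three buckets sketch (hypothetical helper,
    univariate only). The first and last points are always kept.
    """
    n = len(points)
    if n_out < 3:
        raise ValueError("n_out must be at least 3")
    if n_out >= n:
        return list(points)  # nothing to reduce

    # Width of each interior bucket (the two end points get their own buckets).
    every = (n - 2) / (n_out - 2)
    sampled = [points[0]]
    a = 0  # index of the most recently selected point

    for i in range(n_out - 2):
        # Average of the NEXT bucket serves as the third triangle vertex.
        avg_start = int((i + 1) * every) + 1
        avg_end = min(int((i + 2) * every) + 1, n)
        span = avg_end - avg_start
        avg_x = sum(p[0] for p in points[avg_start:avg_end]) / span
        avg_y = sum(p[1] for p in points[avg_start:avg_end]) / span

        # Candidates come from the CURRENT bucket.
        start = int(i * every) + 1
        end = int((i + 1) * every) + 1

        ax, ay = points[a]
        best_area, best = -1.0, start
        for j in range(start, end):
            px, py = points[j]
            # Area of the triangle (previous pick, candidate, next-bucket average).
            area = abs((ax - avg_x) * (py - ay) - (ax - px) * (avg_y - ay)) * 0.5
            if area > best_area:
                best_area, best = area, j

        sampled.append(points[best])
        a = best

    sampled.append(points[-1])
    return sampled


if __name__ == "__main__":
    # A flat series with one spike; the spike dominates its bucket's triangle area.
    series = [(float(i), 10.0 if i == 50 else 0.0) for i in range(100)]
    for x, y in lttb(series, 10):
        print(x, y)
```

Because each bucket contributes the point that forms the largest triangle with its neighbors, visually significant peaks and troughs tend to survive the reduction, which is consistent with the abstract's stated goal of using fewer data points while keeping the original shape.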