Incremental Parallelized Fast Clustering Based on the DBSCAN Algorithm for Large-Scale Datasets

王兴 吴艺 蒋新华 廖律超

Computer Applications and Software (计算机应用与软件), 2018, Vol. 35, Issue 4: 269-275, 280 (8 pages). DOI: 10.3969/j.issn.1000-386x.2018.04.050

INCREMENTAL PARALLELIZATION OF FAST CLUSTERING BASED ON DBSCAN ALGORITHM UNDER LARGE-SCALE DATA SET

王兴 (1), 吴艺 (2), 蒋新华 (2), 廖律超 (1)

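Author information

  • 1. School of Information Science and Engineering, Central South University, Changsha, Hunan 410075
  • 2. College of Mathematics and Informatics, Fujian Normal University, Fuzhou, Fujian 350108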

Abstract

Spatio-temporal trajectory data mining is an important way to discover the behavior patterns of moving objects. To meet the demand for processing massive trajectory data, an incremental, parallelized fast clustering algorithm is proposed. Based on the number of data points, the algorithm partitions the spatial grid by bisection and uses a greedy algorithm to restructure the partitions so that the data are divided in a balanced way. Local clustering is then performed on each partition to obtain the candidate cluster sets for merging. The candidate clusters are indexed with an R*-tree, which is used to decide which of them should be merged. An undirected acyclic graph model of the merged clusters is built, and the data are globally relabeled. Experimental results show that the elastic partitioning effectively reduces noise data and improves the quality of local clustering. The merging strategy based on the R*-tree index structure effectively improves the time efficiency of clustering, achieves good clustering results, and enables online processing of large-scale data.
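
To make two of the steps named in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes 2-D points, a median split along the wider axis as the bisection rule, and a union-find structure for the global relabeling. The names `bisect_partition`, `relabel_globally`, `max_points`, and `merge_pairs` are hypothetical. The greedy restructuring, the local DBSCAN pass, and the R*-tree merge test are omitted, so the pairs of clusters to merge are taken as given.

```python
def bisect_partition(points, max_points):
    """Recursively halve a 2-D point set until every part holds at most
    max_points points. The split rule (median along the wider axis) is
    an assumption made for this sketch."""
    if max_points < 1:
        raise ValueError("max_points must be at least 1")
    if len(points) <= max_points:
        return [points]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Split along the axis with the larger spread.
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (bisect_partition(ordered[:mid], max_points)
            + bisect_partition(ordered[mid:], max_points))


def relabel_globally(local_labels, merge_pairs):
    """Map each local cluster id to a global id, given pairs of local
    clusters that were judged to belong to the same cluster."""
    parent = {c: c for c in local_labels}

    def find(c):
        # Follow parent links to the representative, shortening the path.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for a, b in merge_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller id as the representative of the merged set.
            parent[max(ra, rb)] = min(ra, rb)
    return {c: find(c) for c in local_labels}


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0), (12, 0), (13, 0)]
    for part in bisect_partition(pts, 3):
        print(part)
    labels = [("p0", 0), ("p0", 1), ("p1", 0), ("p2", 0)]
    pairs = [(("p0", 1), ("p1", 0)), (("p1", 0), ("p2", 0))]
    print(relabel_globally(labels, pairs))
```

Local cluster ids are written here as `(partition, cluster)` tuples so that ids from different partitions never collide. Every connected group of merged clusters ends up sharing the smallest id in that group.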

Keywords

Big data / DBSCAN / Balanced partitioning / Incremental / Parallelization

Classification

Information technology and security science

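Cite this article

王兴, 吴艺, 蒋新华, 廖律超. Incremental parallelization of fast clustering based on DBSCAN algorithm under large-scale data set [J]. Computer Applications and Software, 2018, 35(4): 269-275, 280.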

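Funding

National Natural Science Foundation of China (61304199, 41471333)

Fujian Province Program for Outstanding Young Scientific Research Talents in Universities (JA14209)

Fujian Provincial Department of Education project (JA15325)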

Computer Applications and Software

OA | Peking University Core Journal | CSTPCD

ISSN 1000-386X
