Acta Electronica Sinica (电子学报), 2016, Vol. 44, Issue 8: 1873-1880, 8. DOI: 10.3969/j.issn.0372-2112.2016.08.015
Optimized High-Dimensional Index and KNN Query in MapReduce
(Original title: MapReduce框架下的优化高维索引与KNN查询)
Abstract
To address the low efficiency of approximate queries over large-scale high-dimensional data, we propose iPBM, a novel high-dimensional index and KNN query method. iPBM tackles two main problems: how to optimally divide the data among MapReduce data blocks, and how each block contributes to the computation. Specifically, guided by the principles of correlation and parallelism, iPBM employs a two-phase partitioning scheme, clustering followed by zoning, to split the data evenly across the available blocks. On top of this partitioning, we design a distributed two-layer index structure and a parallel KNN query algorithm. By jointly exploiting the global index, the local index, and the two-dimensional bitcode property, iPBM achieves three-layer filtering, which minimizes both the number of regions queried and the cost of computation on the high-dimensional data. The accuracy, efficiency, and scalability of iPBM are thoroughly evaluated through detailed simulations.
Key words
cloud computing / MapReduce / KNN query / high-dimensional index
Classification
Information Technology and Security Science
Cite this article
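Liang Junjie (梁俊杰), Li Fenghua (李凤华), Liu Qiongni (刘琼妮), Yin Li (尹利). Optimized High-Dimensional Index and KNN Query in MapReduce [J]. Acta Electronica Sinica, 2016, 44(8): 1873-1880, 8.
Funding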
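National Development and Reform Commission 2012 Information Security Special Project (No. 发改办高技[2013]1309); National Natural Science Foundation of China (No. 61170251); Key Program of the Natural Science Foundation of Hubei Province (No. 2013CFA115); Wuhan Science and Technology Research Program ()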
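
Illustrative sketch. The abstract describes partitioning data by clustering and then using a global index to filter out regions before a KNN query. The Python below is a minimal single-machine sketch of that general idea only, not the iPBM algorithm itself: it omits the zoning phase, the local index, the bitcode filter, and MapReduce entirely. The names `build_index` and `knn`, and the choice of a centroid-plus-radius summary per cluster, are assumptions introduced here for illustration. Clusters whose lower-bound distance (by the triangle inequality) exceeds the current k-th best distance are skipped without scanning.

```python
import heapq
import math


def dist(a, b):
    """Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, centers):
    """Assign each point to its nearest center and record each cluster's radius.

    The list of (center, radius) summaries plays the role of a hypothetical
    'global index'; the per-cluster point lists stand in for data blocks.
    """
    clusters = [{"center": c, "radius": 0.0, "points": []} for c in centers]
    for p in points:
        best = min(clusters, key=lambda cl: dist(p, cl["center"]))
        best["points"].append(p)
        best["radius"] = max(best["radius"], dist(p, best["center"]))
    return clusters


def knn(clusters, q, k):
    """Return (sorted [(distance, point)], number of clusters scanned)."""
    # Lower bound on the distance from q to any point in a cluster:
    # d(q, center) - radius, clamped at zero (triangle inequality).
    order = sorted(
        (max(0.0, dist(q, cl["center"]) - cl["radius"]), i)
        for i, cl in enumerate(clusters)
    )
    heap = []  # max-heap of the k best so far, stored as (-distance, point)
    scanned = 0
    for lower_bound, i in order:
        if len(heap) == k and lower_bound > -heap[0][0]:
            break  # every remaining cluster is farther than the k-th neighbour
        scanned += 1
        for p in clusters[i]["points"]:
            d = dist(q, p)
            if len(heap) < k:
                heapq.heappush(heap, (-d, p))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, p))
    return sorted((-nd, p) for nd, p in heap), scanned


if __name__ == "__main__":
    pts = [(0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0)]
    index = build_index(pts, centers=[(0.0, 0.0), (10.0, 10.0)])
    neighbours, scanned = knn(index, q=(0.0, 0.0), k=2)
    print(neighbours, scanned)
```

In this toy run the query sits inside the first cluster, so the second cluster is pruned by its lower bound and only one cluster is scanned. In a distributed setting one might map each cluster to a data block and run the inner scan in parallel, but that mapping is an assumption of this sketch rather than something the abstract specifies.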