Journal of Harbin Institute of Technology (English Edition), 2008, Vol. 15, Issue 6: 856-861 (6 pages).
BC-iDistance: an optimized high-dimensional index for KNN processing
Abstract
To facilitate high-dimensional KNN queries, an optimized index named Bit-Code based iDistance (BC-iDistance) is proposed, built on the techniques of approximate vector representation and one-dimensional transformation. To reduce the substantial information loss that iDistance incurs in its one-dimensional transformation, BC-iDistance adopts a novel representation that compresses a d-dimensional vector into a two-dimensional vector: a bit code reflects the location of the data point relative to its reference point, and a one-dimensional distance reflects its similarity to that reference point. Built on the classical B+-tree, this representation enables a two-level pruning process and allows a single index structure to be used, which further speeds up processing. Experimental evaluations on synthetic and real data demonstrate that BC-iDistance outperforms both iDistance and sequential scan for KNN search in high-dimensional spaces.
Keywords
high-dimensional index / KNN search / bit code / approximate vector
Classification
Information technology and security science
Cite this article
LIANG Jun-jie, FENG Yu-cai. BC-iDistance: an optimized high-dimensional index for KNN processing[J]. Journal of Harbin Institute of Technology (English Edition), 2008, 15(6): 856-861.
Funding
Sponsored by the National High Technology Research and Development Program of China (863 Program) (Grant No. [2005]555).
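
Illustrative sketch
As an illustration of the representation the abstract describes, the following Python sketch compresses a d-dimensional point into a (bit code, one-dimensional key) pair. The key follows the usual iDistance form: the partition index times a constant c, plus the distance to the nearest reference point. The bit-code rule used here (one bit per dimension, set when the coordinate is at least that of the reference point), the function names, and the value of c are assumptions made for illustration only; the abstract does not specify them.

```python
import math


def nearest_reference(point, refs):
    """Return (index, distance) of the reference point closest to `point`."""
    dists = [math.dist(point, r) for r in refs]
    i = min(range(len(refs)), key=dists.__getitem__)
    return i, dists[i]


def bit_code(point, ref):
    """Assumed encoding: bit j is 1 when point[j] >= ref[j], else 0."""
    return tuple(1 if p >= r else 0 for p, r in zip(point, ref))


def encode(point, refs, c=10.0):
    """Compress a d-dimensional point into (bit code, one-dimensional key).

    The key is i * c + dist(point, refs[i]), so each partition occupies its
    own key range as long as c exceeds every partition's radius.
    """
    i, dist = nearest_reference(point, refs)
    key = i * c + dist
    return bit_code(point, refs[i]), key


# Hypothetical reference points and query point in two dimensions.
refs = [(0.0, 0.0), (1.0, 1.0)]
code, key = encode((0.5, 1.0), refs)
print(code, key)
```

In a design along these lines, the one-dimensional key would be what the B+-tree indexes, while the bit code could serve as a second, cheap filter on candidates before the full distance is computed.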