深圳大学学报(理工版)2023,Vol.40Issue(6):640-648,9.DOI:10.3724/SP.J.1249.2023.06640
RPA:一种内存高效的度量空间recall@R近似最近邻搜索索引
RPA:a memory-efficient metric-space recall@R ANNS index
摘要
Abstract
Approximate nearest neighbor search(ANNS)for high dimensional data has received extensive research efforts.Many existing ANNS methods in metric spaces are essentially based on the permutation of pivots,or pre-selected reference points,which are ordered by their distances to the data.Unfortunately,most of these permutation-based methods are quite memory demanding,because of either complexity of index structure,excessive pivots,or insufficient exploitation of distances.This paper proposes the reduced permutation array(RPA),a delicate integration of existing technical elements based on in-depth understanding,for recall@R ANNS in metric spaces.As an array,for each data element,RPA stores only l identities(IDs)of its nearest pivots out of k pre-selected ones(l≪k).During search,the order of distances from each data element to the query is approximated by a score function based on the distances from its nearest pivots to the query,and a bounded heap is maintained to store R candidate results.RPA is simple,memory-efficient,and scalable.Experiments show that RPA is approximately 3 times more memory-efficient than existing methods on average.The research results can provide an effective ANNS method for massive data in a single-machine environment with limited memory resources.关键词
计算机科学与技术/近似最近邻搜索/度量空间/索引结构/支撑点选择/支撑点序列/内存高效Key words
computer science and technology/approximate nearest neighbor search/metric space/index structure/pivot selection/pivot permutation/memory-efficient分类
信息技术与安全科学引用本文复制引用
江润本,陈家颖,毛睿..RPA:一种内存高效的度量空间recall@R近似最近邻搜索索引[J].深圳大学学报(理工版),2023,40(6):640-648,9.基金项目
National Natural Science Foundation of China(62072311) 国家自然科学基金资助项目(62072311) (62072311)