Computer Applications and Software, 2018, Vol. 35, Issue 2: 96-101, 6. DOI: 10.3969/j.issn.1000-386x.2018.02.017
Research on Spatial Range Query Indexing Based on Spark
RESEARCH ON RANGE QUERIES IN SPATIAL INDEX BASED ON THE SPARK
Abstract
Traditional data processing systems have limited capacity to store and process data and cannot meet the demands of handling large data volumes. To extract the most value from data sets with high efficiency and high performance, a big data analysis and processing system based on Spark and SIMBA is proposed, built on the Spark SQL query method. The index management mechanism is embedded in Spark and encapsulated in the RDD, which improves query efficiency. Data are stored in a line tree, which improves the efficiency of data retrieval. During preprocessing, the RangePartitioner strategy is used to partition the data, and queries are answered through global filtering followed by a local index lookup.
Keywords
Spark system/Big data/Range queries/Spark SQL component
Classification
Information Technology and Security Science
Cite this article
Chen Yebin, Liu Na, Xu Hong, Liu Min. Research on spatial range query indexing based on Spark [J]. Computer Applications and Software, 2018, 35(2): 96-101, 6.
Funding
Key Project of Natural Science Research of Universities in Anhui Province (KJ2015A130).
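The query flow named in the abstract (range partitioning, then global filtering, then a local index lookup) can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the Spark or SIMBA APIs: it assumes one-dimensional keys, plain Python lists in place of RDD partitions, and a sorted list with binary search as a stand-in for the local index. The function names `range_partition`, `build_global_index`, and `range_query` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def range_partition(points, num_partitions):
    """Split 1-D keys into contiguous, roughly equal-sized sorted ranges.

    Assumption: this mimics the idea of range partitioning only;
    it is not Spark's RangePartitioner.
    """
    ordered = sorted(points)
    if not ordered:
        return []
    size = -(-len(ordered) // num_partitions)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def build_global_index(partitions):
    """Record the (min, max) key of every partition for global filtering."""
    return [(part[0], part[-1]) for part in partitions]


def range_query(partitions, global_index, low, high):
    """Return all keys k with low <= k <= high."""
    result = []
    for (pmin, pmax), part in zip(global_index, partitions):
        # Global filtering: skip partitions whose range cannot overlap the query.
        if pmax < low or pmin > high:
            continue
        # Local index: binary search inside the sorted partition.
        start = bisect_left(part, low)
        end = bisect_right(part, high)
        result.extend(part[start:end])
    return result


partitions = range_partition([15, 3, 42, 8, 23, 16, 4, 31], 3)
global_index = build_global_index(partitions)
print(range_query(partitions, global_index, 8, 20))
```

In this sketch the third partition is never scanned for the query [8, 20], because its minimum key already exceeds the upper bound; pruning of this kind is what global filtering is meant to provide before any local index is consulted.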