Study on Large-scale Unstructured Data Indexing Technology (大规模非结构化数据的索引技术研究)

Shi Yanan (时亚南), Zhang Taihong (张太红), Chen Yanhong (陈燕红), Guo Bin (郭斌)

Computer Technology and Development (计算机技术与发展), Issue (12): 109-113, 5. DOI: 10.3969/j.issn.1673-629X.2014.12

Study on Large-scale Unstructured Data Indexing Technology

Shi Yanan (1), Zhang Taihong (1), Chen Yanhong (2), Guo Bin (1)

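Author information

  • 1. College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi, Xinjiang 830052
  • 2. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083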

Abstract

On large-scale data, the ASPSeek search engine retrieves inefficiently, occupies a large amount of disk space, and is difficult to update. To address these problems, this paper proposes an inverted-index organization technique based on block storage and experimentally compares the performance of an external-memory B+tree index with that of a linear hash index. The test results show that, per million records, the linear hash index takes 57.40% of the B+tree index's time for queries, 2.44 times the B+tree index's time for insertions, and 83.52% of the B+tree index's time for deletions. The linear hash index file is 109.56% of the size of the B+tree index file. According to these results, the B+tree index builds and updates faster, while the linear hash index occupies more disk space but offers better query performance.
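The abstract names linear hashing without showing how its addressing works. As a rough illustration only, the sketch below is a small in-memory version of the textbook scheme: buckets are split one at a time in a fixed order, and a split pointer decides which of two hash functions applies to a key. It is not the authors' external-memory implementation. The class name `LinearHash`, the parameters `initial_buckets` and `max_load`, and the use of a Python dict to stand in for one disk page are all assumptions made for this example.

```python
class LinearHash:
    """In-memory sketch of linear hashing: term -> posting list (hypothetical API)."""

    def __init__(self, initial_buckets=4, max_load=2.0):
        self.n0 = initial_buckets   # number of buckets at level 0
        self.level = 0              # how many times the table has doubled
        self.split = 0              # index of the next bucket to split
        self.max_load = max_load    # average terms per bucket that triggers a split
        self.buckets = [{} for _ in range(initial_buckets)]  # each dict mimics a page
        self.size = 0               # number of distinct terms stored

    def _address(self, key):
        h = hash(key)
        base = self.n0 << self.level      # bucket count at the start of this round
        idx = h % base
        if idx < self.split:              # bucket already split in this round:
            idx = h % (base * 2)          # use the next-level hash function
        return idx

    def insert(self, key, doc_id):
        bucket = self.buckets[self._address(key)]
        if key not in bucket:
            bucket[key] = []
            self.size += 1
        bucket[key].append(doc_id)
        if self.size / len(self.buckets) > self.max_load:
            self._split_one()

    def _split_one(self):
        base = self.n0 << self.level
        old = self.buckets[self.split]
        self.buckets[self.split] = {}
        self.buckets.append({})           # new bucket lands at index split + base
        for key, postings in old.items():
            # Each entry stays at `split` or moves to `split + base`.
            self.buckets[hash(key) % (base * 2)][key] = postings
        self.split += 1
        if self.split == base:            # every bucket of this round has been split
            self.level += 1
            self.split = 0

    def lookup(self, key):
        return self.buckets[self._address(key)].get(key, [])

    def delete(self, key):
        removed = self.buckets[self._address(key)].pop(key, None)
        if removed is not None:
            self.size -= 1
        return removed is not None


index = LinearHash()
for i in range(1000):
    index.insert(f"term{i}", i)          # first posting for the term
    index.insert(f"term{i}", i + 1000)   # second posting for the same term
```

Because only one bucket is split at a time, the table grows gradually instead of being rehashed all at once. A lookup computes one bucket address directly, whereas a B+tree lookup descends from the root to a leaf. This sketch does not measure or reproduce the paper's timing results.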

Key words

large-scale data / inverted index / block storage / linear hash / B+tree

Classification

Information Technology and Security Science

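Cite this article

Shi Yanan, Zhang Taihong, Chen Yanhong, Guo Bin. Study on Large-scale Unstructured Data Indexing Technology [J]. Computer Technology and Development, 2014, (12): 109-113, 5.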

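Funding

Scientific Research Program of the Higher Education Institutions of Xinjiang Autonomous Region (XJEDU2013S13); Science and Technology Key Project of Xinjiang Uygur Autonomous Region (200931103); Xinjiang Agricultural University Pre-research Funding Project ()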

Computer Technology and Development (计算机技术与发展)

OA, CSTPCD

ISSN 1673-629X
