Research on performance optimization methods for Lucene in big data environments

马旸 蔡冰

Journal of Nanjing University of Science and Technology (Natural Science Edition), Issue (3): 260-265, 6. DOI: 10.14177/j.cnki.32-1397n.2015.39.03.002

Performance optimization method of Lucene in big data

马旸 1, 蔡冰 1

Author information

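  • 1. National Computer Network Emergency Response Technical Team/Coordination Center of China, Jiangsu Branch, Nanjing 210003, Jiangsu, China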

Abstract

To improve data query efficiency in big data environments, an optimized inverted index method, RAM FS directory (RFDirectory), is proposed based on memory computing and batch processing. A post-list management technique that combines random access memory (RAM) and disk is implemented on top of Lucene. New data are first written into a cache and then written into a disk index periodically, which improves the write performance of the inverted index. Query results are returned to users efficiently by integrating the multiple block inverted structures on disk and in RAM. Experimental results show that the index construction time of RFDirectory is 50% of that of FSDirectory or RAMDirectory, and that the time taken to return the index result for one keyword is reduced by 15% in big data.
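
The write path and query path described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' RFDirectory and not Lucene's API: the class name `BufferedInvertedIndex`, the `flush_every` parameter, and the use of plain dictionaries in place of on-disk index blocks are all hypothetical. The abstract describes periodic flushing; the sketch flushes after a fixed number of documents to keep the example deterministic.

```python
from collections import defaultdict


class BufferedInvertedIndex:
    """Toy inverted index: a RAM write buffer plus immutable flushed blocks.

    Hypothetical illustration only; not Lucene's API and not the paper's
    RFDirectory. The "disk" blocks are plain dicts standing in for
    on-disk index segments.
    """

    def __init__(self, flush_every=3):
        self.flush_every = flush_every   # documents buffered before a batch flush
        self.ram = defaultdict(list)     # term -> doc ids not yet flushed
        self.buffered_docs = 0
        self.disk_blocks = []            # flushed blocks, oldest first

    def add(self, doc_id, text):
        # New data goes to the RAM buffer first.
        for term in set(text.lower().split()):
            self.ram[term].append(doc_id)
        self.buffered_docs += 1
        if self.buffered_docs >= self.flush_every:
            self.flush()

    def flush(self):
        # Batch write: freeze the buffer into one block, then start a new buffer.
        if self.buffered_docs == 0:
            return
        self.disk_blocks.append({t: tuple(ids) for t, ids in self.ram.items()})
        self.ram = defaultdict(list)
        self.buffered_docs = 0

    def search(self, term):
        # A query merges postings from every flushed block and the RAM buffer.
        term = term.lower()
        hits = []
        for block in self.disk_blocks:
            hits.extend(block.get(term, ()))
        hits.extend(self.ram.get(term, ()))
        return sorted(hits)


index = BufferedInvertedIndex(flush_every=2)
index.add(1, "lucene inverted index")
index.add(2, "big data index")       # second document triggers a flush
index.add(3, "lucene in big data")   # stays in the RAM buffer
result = index.search("lucene")      # combines the flushed block and the buffer
```

In this sketch a document becomes searchable as soon as it is buffered, while the cost of writing a block is paid once per batch rather than once per document. That is the general trade-off the abstract points to, though the paper's actual data structures may differ.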

Key words

big data/Lucene/memory computing/batch processing/inverted index/post-list/cache/random access memory index/disk index/multiple block inverted structure

Classification

Information technology and security science

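Cite this article

马旸, 蔡冰. Performance optimization method of Lucene in big data [J]. Journal of Nanjing University of Science and Technology (Natural Science Edition), 2015, (3): 260-265, 6.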

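Journal of Nanjing University of Science and Technology (Natural Science Edition)

Open access; indexed in PKU Core, CSCD, CSTPCD

ISSN 1005-9830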