
Text Retrieval Model Based on Counting Bloom Filter

Feng Jiajun, Wang Xiaolin, Tian Qing


Computer Engineering, Issue (2): 58-61, 4. DOI: 10.3969/j.issn.1000-3428.2014.02.013


Feng Jiajun¹, Wang Xiaolin¹, Tian Qing¹

Author information

1. School of Computer Science and Technology, Shandong University, Jinan 250101


Abstract

Distributed text retrieval systems struggle to achieve both high retrieval efficiency and low index maintenance cost. To address this, this paper proposes a Text Retrieval Model based on Counting Bloom Filter (CBFTRM). The model divides physical nodes into data nodes and index nodes, both of which are organized as overlays on a structured P2P network. Each data node stores documents and maintains their inverted index. It also computes Counting Bloom Filter (CBF) values from the keywords of the inverted index and transmits them to the corresponding index node. Each index node builds a search tree and updates it whenever one of the tree's leaf nodes changes. The leaf nodes of the search tree hold the characteristic information of the data nodes (including their CBF values), and the internal nodes hold results computed from CBF values. Simulation results show that the model locates documents faster, generates less traffic for index maintenance, and achieves higher precision.
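To make the core data structure concrete, here is a minimal Python sketch of a counting Bloom filter of the kind a data node could build from its inverted-index keywords. It is an illustration, not the paper's implementation: the class name `CountingBloomFilter`, the filter size, the number of hash functions, and the salted SHA-256 hashing are all assumptions. The abstract also does not say how internal nodes of the search tree are computed, so the element-wise sum in `merge` is only one plausible choice.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter whose slots are counters, so keywords can also be removed."""

    def __init__(self, size=64, num_hashes=3):
        # size and num_hashes are illustrative defaults, not values from the paper.
        self.size = size
        self.num_hashes = num_hashes
        self.counters = [0] * size

    def _positions(self, keyword):
        # Derive num_hashes slot indices by salting SHA-256 with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{keyword}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, keyword):
        for pos in self._positions(keyword):
            self.counters[pos] += 1

    def remove(self, keyword):
        # Refuse to remove a keyword the filter does not report as present,
        # otherwise counters could go negative.
        if keyword not in self:
            raise KeyError(keyword)
        for pos in self._positions(keyword):
            self.counters[pos] -= 1

    def __contains__(self, keyword):
        # May return a false positive, never a false negative.
        return all(self.counters[pos] > 0 for pos in self._positions(keyword))

    def merge(self, other):
        """Assumed internal-node summary: element-wise sum of two child filters."""
        if (self.size, self.num_hashes) != (other.size, other.num_hashes):
            raise ValueError("filters must share size and hash count")
        merged = CountingBloomFilter(self.size, self.num_hashes)
        merged.counters = [a + b for a, b in zip(self.counters, other.counters)]
        return merged


# Two hypothetical data nodes summarise their keywords; a parent combines them.
node_a = CountingBloomFilter()
node_a.add("bloom")
node_a.add("filter")
node_b = CountingBloomFilter()
node_b.add("index")
parent = node_a.merge(node_b)
```

Under this assumption, a query for a keyword would descend only into subtrees whose merged filter reports the keyword as possibly present. Because the slots are counters rather than bits, a data node that deletes a document can decrement the affected slots instead of rebuilding the filter, which is the property that keeps index maintenance cheap.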


Key words

Counting Bloom Filter (CBF)/search tree/structured P2P/text retrieval/inverted index

Classification

Information Technology and Security Science

Cite this article

Feng Jiajun, Wang Xiaolin, Tian Qing. Text Retrieval Model Based on Counting Bloom Filter[J]. Computer Engineering, 2014, (2): 58-61, 4.

Funding

Natural Science Foundation of Shandong Province (ZR2009GM021)

Computer Engineering

OA / Peking University Core / CSCD / CSTPCD

ISSN: 1000-3428
