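Hybrid Inverted Index Construction Method for Multidimensional Semantic Organization of Scientific and Technical Literature (面向科技文献多维语义组织的混合倒排索引构建方法)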

张敏 李唯 范青

现代情报 (Journal of Modern Information), 2024, Vol. 44, Issue 2: 107-114, 129 (9 pages). DOI: 10.3969/j.issn.1008-0821.2024.02.009

张敏 1, 李唯 2, 范青 3

Author information

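  • 1. Wuhan Documentation and Information Center, Chinese Academy of Sciences, Wuhan 430071, Hubei; Hubei Key Laboratory of Big Data in Science and Technology, Wuhan 430071, Hubei
  • 2. Wuhan Vocational College of Software and Engineering (Wuhan Open University), Wuhan 430205, Hubei
  • 3. National Research Center of Cultural Industries, Central China Normal University, Wuhan 430079, Hubei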

Abstract

[Purpose/Significance] Researchers urgently need efficient queries over the fine-grained semantic information in scientific and technological literature. Previous studies have proposed a multidimensional semantic indexing system for such literature, but the common HashMap-based inverted indexes lead to inefficient querying. This paper aims to improve semantic query performance by establishing hybrid inverted indexes for the semantic features of different dimensions. [Method/Process] This paper explored inverted index construction methods suited to different semantic dimensions using Treap, B+ tree, and other data structures, and combined them into a variety of hybrid inverted index construction methods suitable for the multidimensional semantic organization of scientific and technological literature. Comparative experiments were used to analyze and verify the query performance of the different construction methods under Top-k and Boolean queries. [Result/Conclusion] The experimental results show that, among the eight hybrid inverted index construction methods formed by combination, C3 (HHHB) shown in Table 2 has the highest efficiency under Top-k queries, while C4 (TTTB) is the most efficient under Boolean queries. The method in this paper can effectively solve the query efficiency problem caused by a single index structure.
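
The idea of giving each semantic dimension its own dictionary structure can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class names, the dimension names (`method`, `year`), and the use of a sorted list with `bisect` as a simple stand-in for an ordered structure such as a B+ tree or Treap are all assumptions made for illustration. Top-k ranking is not covered.

```python
from bisect import bisect_left, insort
from collections import defaultdict


class HashDictionary:
    """Term dictionary backed by a hash map (exact-match lookups only)."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of document ids

    def add(self, term, doc_id):
        self._postings[term].add(doc_id)

    def lookup(self, term):
        # .get() avoids creating an empty entry for unknown terms
        return set(self._postings.get(term, ()))


class SortedDictionary:
    """Term dictionary kept in sorted order.

    Hypothetical stand-in for an ordered structure (B+ tree, Treap);
    it is NOT a real tree, only a sorted list searched with bisect.
    """

    def __init__(self):
        self._terms = []     # sorted list of distinct terms
        self._postings = {}  # term -> set of document ids

    def add(self, term, doc_id):
        if term not in self._postings:
            insort(self._terms, term)
            self._postings[term] = set()
        self._postings[term].add(doc_id)

    def lookup(self, term):
        return set(self._postings.get(term, ()))

    def range_lookup(self, low, high):
        """Union of postings for all terms with low <= term <= high."""
        result = set()
        i = bisect_left(self._terms, low)
        while i < len(self._terms) and self._terms[i] <= high:
            result |= self._postings[self._terms[i]]
            i += 1
        return result


class HybridInvertedIndex:
    """One dictionary per semantic dimension, each with its own structure."""

    def __init__(self, layout):
        # layout maps a dimension name to the dictionary class used for it
        self._dims = {name: cls() for name, cls in layout.items()}

    def add(self, doc_id, features):
        for dim, terms in features.items():
            for term in terms:
                self._dims[dim].add(term, doc_id)

    def boolean_and(self, conditions):
        """Intersect the postings of (dimension, term) pairs."""
        sets = [self._dims[dim].lookup(term) for dim, term in conditions]
        return set.intersection(*sets) if sets else set()


# Hypothetical two-dimension layout: hash map for one, ordered for the other.
index = HybridInvertedIndex({"method": HashDictionary, "year": SortedDictionary})
index.add(1, {"method": ["treap"], "year": [2021]})
index.add(2, {"method": ["treap", "hashmap"], "year": [2023]})
index.add(3, {"method": ["hashmap"], "year": [2024]})
print(index.boolean_and([("method", "treap"), ("year", 2023)]))  # {2}
```

In this sketch the hash-backed dictionary only answers exact-match lookups, while the ordered one also supports range scans. That trade-off is one plausible reason to choose different structures for different dimensions.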

Key words

scientific and technical literature / semantic organization / hybrid inverted index / HashMap / Treap / B+ tree

Classification

Social sciences

Cite this article

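张敏, 李唯, 范青. Hybrid Inverted Index Construction Method for Multidimensional Semantic Organization of Scientific and Technical Literature [J]. 现代情报 (Journal of Modern Information), 2024, 44(2): 107-114, 129.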

Funding

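National Social Science Fund of China, Art Studies Project "Research on the Internal Mechanism and Advancement Path of Intelligent Communication of Intangible Cultural Heritage" (Grant No. 22CH188)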

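Open Fund Project of the Hubei Key Laboratory of Big Data in Science and Technology, "Construction of an Open Platform for Big Data Resources in the Field of Science Culture Communication" (Grant No. E3KF291001)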

现代情报 (Journal of Modern Information)

Indexing: OA, 北大核心 (PKU Core), CHSSCD, CSSCI, CSTPCD

ISSN: 1008-0821
