
Short text retrieval combining Wikipedia taxonomy and explicit semantic features

Li Pu, Zhang Zhifeng, Yang Baibing, Xiao Bao, Jiang Yuncheng

Journal of Henan Agricultural University, 2019, Vol. 53, Issue 2: 257-265 (9 pages).


Li Pu (1), Zhang Zhifeng (1), Yang Baibing (1), Xiao Bao (2), Jiang Yuncheng (3)

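Author information

  • 1. School of Software, Zhengzhou University of Light Industry, Zhengzhou, Henan 450000
  • 2. School of Electronics and Information Engineering, Beibu Gulf University, Qinzhou, Guangxi 535000
  • 3. School of Computer Science, South China Normal University, Guangzhou, Guangdong 510631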

Abstract

Short texts, which appear in large numbers across the Web, are brief, carry little information, have sparse features, and often use irregular grammar, so traditional information retrieval techniques cannot handle them effectively. To address these problems, this research takes semantic relatedness as its starting point and studies short text retrieval based on Wikipedia, the current mainstream semantic knowledge source. Using the taxonomy information contained in Wikipedia pages, a method for explicit semantic feature selection and relatedness computation is proposed. On this basis, a short text retrieval method in a low-dimensional explicit semantic space is proposed. Finally, the feasibility and effectiveness of the method are verified experimentally. The results show that, compared with the graph-based and link-based methods, the proposed method improves MAP by 6% and 4.1%, P@30 by 10.4% and 5.8%, and R-Prec by 6.1% and 3%, respectively.
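For readers unfamiliar with the three measures reported above, the following is a minimal sketch of how MAP, P@k, and R-Prec are conventionally computed from a ranked result list and a set of relevance judgments. It uses the standard textbook definitions and is not taken from the paper; the function names, document identifiers, and toy data are all hypothetical.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant.

    Divides by k even if fewer than k documents were retrieved.
    """
    if k <= 0:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def r_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Precision at rank R, where R is the number of relevant documents."""
    return precision_at_k(ranked, relevant, len(relevant))


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at each rank where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """Average of per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    return sum(average_precision(ranked, qrels.get(q, set()))
               for q, ranked in runs.items()) / len(runs)


# Toy data (hypothetical): two queries with ranked results and judgments.
runs = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5", "d6"]}
qrels = {"q1": {"d1", "d3"}, "q2": {"d6"}}

map_score = mean_average_precision(runs, qrels)
p_at_2 = precision_at_k(runs["q1"], qrels["q1"], 2)
r_prec = r_precision(runs["q1"], qrels["q1"])
```

In the abstract, P@30 corresponds to `precision_at_k` with `k=30`. Whether the reported gains are absolute or relative differences is not stated in the abstract.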

Key words

Wikipedia taxonomy / explicit semantic feature / feature selection / short text / information retrieval

Classification

Information technology and security science

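Cite this article

Li Pu, Zhang Zhifeng, Yang Baibing, Xiao Bao, Jiang Yuncheng. Short text retrieval combining Wikipedia taxonomy and explicit semantic features [J]. Journal of Henan Agricultural University, 2019, 53(2): 257-265.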

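Funding

Young Scientists Fund of the National Natural Science Foundation of China (61802352)

General Program of the National Natural Science Foundation of China (61772210)

Doctoral Research Fund of Zhengzhou University of Light Industry (0215/13501050015)

Basic Research Ability Improvement Project for Young and Middle-aged Teachers in Guangxi Universities (2019KY046)

Qinzhou Scientific Research and Technology Development Program (20189903)

Guangzhou Science and Technology Program (2014J4100031)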

Journal of Henan Agricultural University

Open access; indexed in PKU Core, CSCD, CSTPCD

ISSN 1000-2340
