Design and Implementation of Correlation Weight Algorithm Based on Hadoop Platform

Gao Jun, Huang Xiance

Computer Engineering, 2019, Vol. 45, Issue 3: 26-31, 6. DOI: 10.19678/j.issn.1000-3428.0049976

Design and Implementation of Correlation Weight Algorithm Based on Hadoop Platform

Gao Jun 1, Huang Xiance 1

Author information

  • 1. College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China

Abstract

The traditional TF-IDF algorithm computes the correlation weight between a keyword and a document using only term frequency and inverse document frequency, which ignores the influence of user interest on the weight calculation. To better serve the purpose of a user's information retrieval, a correlation weight algorithm based on log association is proposed. From the perspective of user-oriented correlation, a user interest model is built by analyzing the user's search logs, and, drawing on the idea of distributed computing, the MapReduce programming framework is used to process the computing tasks in parallel. Experimental results show that the algorithm not only improves efficiency when handling massive data, but also dynamically adjusts the weights of retrieval terms according to the user's historical retrieval records, thereby enhancing the interaction between users and the system.
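For orientation, the baseline that the abstract contrasts against can be sketched in a map/reduce style. The block below is a minimal illustration in plain Python rather than Hadoop code, and it is not the paper's algorithm: `map_phase`, `reduce_phase`, and `tfidf` are illustrative names, and `interest_weighted` with its `alpha` parameter is a purely hypothetical way of letting a user's query log raise a term's weight, since the abstract does not give the actual formula.

```python
import math
from collections import Counter, defaultdict


def map_phase(doc_id, text):
    """Mapper-like step: emit ((term, doc_id), term frequency) for one document."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens)
    return [((term, doc_id), n / total) for term, n in counts.items()]


def reduce_phase(pairs, num_docs):
    """Reducer-like step: group by term, derive IDF, and emit TF-IDF weights."""
    by_term = defaultdict(list)
    for (term, doc_id), tf in pairs:
        by_term[term].append((doc_id, tf))
    weights = {}
    for term, postings in by_term.items():
        # Standard IDF: log(total documents / documents containing the term).
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings:
            weights[(term, doc_id)] = tf * idf
    return weights


def tfidf(docs):
    """Run the map and reduce steps sequentially over a dict of doc_id -> text."""
    pairs = [p for doc_id, text in docs.items() for p in map_phase(doc_id, text)]
    return reduce_phase(pairs, len(docs))


def interest_weighted(weights, query_log, alpha=0.5):
    """Hypothetical adjustment (an assumption, not the paper's formula):
    scale each weight by 1 + alpha * (share of the term in the user's past queries)."""
    log_counts = Counter(t for q in query_log for t in q.lower().split())
    total = sum(log_counts.values()) or 1
    return {
        (term, doc_id): w * (1 + alpha * log_counts[term] / total)
        for (term, doc_id), w in weights.items()
    }
```

Calling `interest_weighted(tfidf(docs), query_log)` leaves terms the user never searched for at their plain TF-IDF weight and raises the others in proportion to how often they appear in the log. In a real Hadoop job the two phases would run as distributed mapper and reducer tasks instead of local function calls.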

Key words

distributed computing/TF-IDF algorithm/log/interest model/information retrieval

Classification

Information Technology and Security Science

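Cite this article

Gao Jun, Huang Xiance. Design and Implementation of Correlation Weight Algorithm Based on Hadoop Platform[J]. Computer Engineering, 2019, 45(3): 26-31, 6.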

Funding

National Natural Science Foundation of China (41701523)

Graduate Innovation Fund of Shanghai Maritime University (YXR2017032)

Computer Engineering

OA / Peking University Core Journal / CSCD / CSTPCD

ISSN 1000-3428
