Research on an Improved Collaborative Filtering Algorithm Based on Spark

Xu Zhihong, Jiang Xinyu, Dong Yongfeng, Zhao Jiawei

Computer Applications and Software (计算机应用与软件), 2017, Vol. 34, Issue 5: 247-254, 278, 9. DOI: 10.3969/j.issn.1000-386x.2017.05.043

AN IMPROVED COLLABORATIVE FILTERING ALGORITHM BASED ON SPARK

Xu Zhihong¹, Jiang Xinyu², Dong Yongfeng¹, Zhao Jiawei¹

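Author information

1. School of Computer Science and Software, Hebei University of Technology, Tianjin 300401
2. Hebei Province Key Laboratory of Big Data Computing, Tianjin 300401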

Abstract

To improve the scalability of collaborative filtering in big-data environments and its recommendation accuracy on high-dimensional sparse data, a hierarchical co-clustering collaborative filtering algorithm is implemented on Spark. Co-clustering is used to reduce the sparsity of the data set and to build a clustering model. The weights of the latent categories of users and items in the co-clustering model are then analyzed using an analytic hierarchy model combined with rating-density analysis. Item similarities are computed and a nearest-neighbor set is constructed for each item to complete the online recommendation. Experiments on MovieLens data sets of different scales, provided by GroupLens, show that the improved algorithm significantly improves recommendation accuracy and achieves good recommendation efficiency and scalability in a distributed environment.
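
The final step named in the abstract, computing item similarities and building an item nearest-neighbor set, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: cosine similarity is used because the abstract does not name a similarity measure, the co-clustering and analytic hierarchy weighting stages are omitted entirely, and plain Python dictionaries stand in for Spark data structures. The function names and the toy ratings are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating vectors (dict: user -> rating).
    Assumption: the paper's actual similarity measure may differ."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def nearest_neighbors(item_ratings, k):
    """For each item, keep its k most similar other items as (item, similarity) pairs."""
    neighbors = {}
    for i, ri in item_ratings.items():
        sims = [(j, cosine_similarity(ri, rj))
                for j, rj in item_ratings.items() if j != i]
        sims.sort(key=lambda pair: pair[1], reverse=True)
        neighbors[i] = sims[:k]
    return neighbors

def predict(user, item, item_ratings, neighbors):
    """Similarity-weighted average of the user's ratings on the item's neighbors.
    Returns None when the user has rated none of the positively similar neighbors."""
    num = den = 0.0
    for j, sim in neighbors[item]:
        if sim > 0 and user in item_ratings[j]:
            num += sim * item_ratings[j][user]
            den += sim
    return num / den if den else None

# Hypothetical toy data: item -> {user -> rating}
item_ratings = {
    "A": {"u1": 5, "u2": 4},
    "B": {"u1": 5, "u2": 4, "u3": 2},
    "C": {"u3": 5},
}
neighbors = nearest_neighbors(item_ratings, k=2)
prediction = predict("u3", "A", item_ratings, neighbors)
```

In this toy data, item B is the closest neighbor of item A because the same users rated both, so the estimate of u3's rating for A is driven by u3's rating of B. In a distributed setting the pairwise similarity loop is the part that would need to be partitioned across workers.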

Key words

Collaborative filtering/Co-clustering/Analytic hierarchy model/Spark

Classification

Information Technology and Security Science

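Cite this article

Xu Zhihong, Jiang Xinyu, Dong Yongfeng, Zhao Jiawei. An Improved Collaborative Filtering Algorithm Based on Spark [J]. Computer Applications and Software, 2017, 34(5): 247-254, 278, 9.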

Funding

Tianjin Science and Technology Plan Project (14ZCDGSF00124)

Hebei Province Youth Science Foundation Project (F2015202311)

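Computer Applications and Software

Open access (OA); listed in the Peking University Core Journals index (北大核心) and CSTPCD

ISSN: 1000-386X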
