Computer Applications and Software, 2017, Vol. 34, Issue 5: 247-254, 278 (9 pages). DOI: 10.3969/j.issn.1000-386x.2017.05.043
一种基于Spark的改进协同过滤算法研究
AN IMPROVED COLLABORATIVE FILTERING ALGORITHM BASED ON SPARK
Abstract
To improve the scalability of collaborative filtering in big-data environments and its recommendation accuracy on high-dimensional sparse data, a hierarchical co-clustering collaborative filtering algorithm based on Spark is implemented. The sparsity of the data sets is handled by co-clustering, and the clustering model is constructed. The latent category weights of users and items in the co-clustering model are analyzed using the analytic hierarchy model combined with score-density analysis. Item similarity is then calculated and the nearest-neighbor set of each item is constructed to complete the online recommendation. Experiments on MovieLens datasets of different scales, provided by GroupLens, show that the improved algorithm significantly improves recommendation accuracy and achieves good recommendation efficiency and scalability in a distributed environment.
Key words
Collaborative filtering/Co-clustering/Analytic hierarchy model/Spark
Classification
Information Technology and Security Science
Cite this article
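Xu Zhihong (许智宏), Jiang Xinyu (蒋新宇), Dong Yongfeng (董永峰), Zhao Jiawei (赵嘉伟). An Improved Collaborative Filtering Algorithm Based on Spark [J]. Computer Applications and Software, 2017, 34(5): 247-254, 278 (9 pages).
Funding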
Tianjin Science and Technology Plan Project (14ZCDGSF00124)
Hebei Province Youth Science Foundation Project (F2015202311)