现代情报 (Journal of Modern Information), 2024, Vol. 44, Issue 1: 37-47, 11. DOI: 10.3969/j.issn.1008-0821.2024.01.004
Comparative Study on Focus of Attention and Word Embedding General Picture of Big Data Based on Neural Network Word Embedding
Abstract
[Purpose/Significance] Big data has a significant impact on social and economic development. This study compares the focus of attention and the word embedding general picture of big data between academic platforms and social Q&A platforms in China, with the aim of promoting big data research and practice in China. [Method/Process] Word2vec is an emerging neural network word embedding algorithm with low computational cost and high accuracy that can effectively measure word similarity at both the semantic and syntactic levels. Data were first collected separately from CNKI (China National Knowledge Infrastructure) and Zhihu to build an academic platform corpus and a social Q&A platform corpus. A Word2vec model was then trained on each corpus. Based on an analysis of the most similar words, the focus of attention on big data was compared between the academic and social Q&A platforms. Finally, dimension reduction and visualization algorithms were used to compare the word embedding general picture of the two platforms. [Result/Conclusion] The results illustrate how big data is treated differently on academic and social Q&A platforms. By applying the Word2vec neural network word embedding algorithm to a comparative analysis of big data across academic and social Q&A platforms in China, this study provides a new perspective for big data research.
Keywords
big data / comparative study / academic platforms / social Q&A platforms / Word2vec
Classification
Social sciences
Cite this article
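Zhou Aixia (周爱霞), Yan Yalan (严亚兰), Zha Xianjin (查先进). Comparative Study on Focus of Attention and Word Embedding General Picture of Big Data Based on Neural Network Word Embedding [J]. 现代情报 (Journal of Modern Information), 2024, 44(1): 37-47, 11.
Funding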
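National Natural Science Foundation of China project "Research on the Influence Patterns of Intelligent Recommendation Users' Information Behavior from the Perspectives of Social Learning and Psychological Empowerment" (Grant No. 72174148)
National Natural Science Foundation of China project "Research on Online Users' Herd Information Behavior from the Perspectives of Social Connection and Cognitive Load" (Grant No. 71874124)
Wuhan University Graduate "Curriculum-Based Ideological and Political Education" Demonstration Course Project "Information Analysis and Competitive Intelligence"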