Comparative Study on Focus of Attention and Word Embedding General Picture of Big Data Based on Neural Network Word Embedding
OA; CHSSCD; CSTPCD
[Purpose/Significance] Big data has had a significant impact on social and economic development. This study compares the focus of attention and the word embedding general picture of big data between academic platforms and social Q&A platforms in China, with the aim of promoting big data research and practice in China. [Method/Process] Word2vec is an emerging neural network word embedding algorithm with low computational cost and high accuracy, and it can effectively measure word similarity at both the semantic and syntactic levels. First, data were collected separately from CNKI (China National Knowledge Infrastructure) and Zhihu to build an academic-platform corpus and a social Q&A-platform corpus, and a Word2vec model was trained on each corpus. Second, the focus of attention on big data on the two platforms was compared through most-similar-word analysis. Finally, dimension reduction and data visualization methods were used to compare the word embedding general picture of the two platforms. [Result/Conclusion] The results reveal differences between academic platforms and social Q&A platforms in China with respect to big data. This study innovatively applies the Word2vec neural network word embedding algorithm to a comparative analysis of big data across academic platforms and social Q&A platforms in China, providing a new perspective for big data research.
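The most-similar-word comparison described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`cosine`, `most_similar`, `compare_platforms`) are hypothetical, and the three-dimensional vectors are invented purely to exercise the code; they do not reflect the study's data or findings. In practice the vectors would come from the two trained Word2vec models, for example via a library such as gensim (an assumption; the abstract does not name a toolkit).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(vectors, query, topn=3):
    """Rank every other word in `vectors` by cosine similarity to `query`."""
    q = vectors[query]
    scored = [(w, cosine(q, v)) for w, v in vectors.items() if w != query]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


def compare_platforms(vectors_a, vectors_b, query, topn=3):
    """Compare the top-n neighbours of `query` in two embedding spaces."""
    near_a = {w for w, _ in most_similar(vectors_a, query, topn)}
    near_b = {w for w, _ in most_similar(vectors_b, query, topn)}
    return {
        "shared": sorted(near_a & near_b),
        "only_a": sorted(near_a - near_b),
        "only_b": sorted(near_b - near_a),
    }


# Hypothetical toy embeddings (NOT the paper's data): one dictionary per
# platform, standing in for the vectors of two separately trained models.
academic = {
    "big_data": [1.0, 0.0, 0.0],
    "algorithm": [0.9, 0.1, 0.0],
    "governance": [0.8, 0.2, 0.1],
    "salary": [0.0, 1.0, 0.0],
    "career": [0.1, 0.9, 0.2],
}
social_qa = {
    "big_data": [0.0, 1.0, 0.0],
    "algorithm": [0.7, 0.7, 0.0],
    "governance": [1.0, 0.0, 0.1],
    "salary": [0.1, 0.9, 0.0],
    "career": [0.0, 0.95, 0.1],
}

result = compare_platforms(academic, social_qa, "big_data", topn=2)
print(result)
```

Because each platform has its own model, the two vector spaces are not directly comparable coordinate by coordinate; comparing neighbour sets of the same query word, as above, avoids that problem.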
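Zhou Aixia; Yan Yalan; Zha Xianjin
School of Management, Wuhan University of Science and Technology, Wuhan, Hubei 430065
School of Management, Wuhan University of Science and Technology, Wuhan, Hubei 430065 || Research Center for Service Science and Engineering, Wuhan University of Science and Technology, Wuhan, Hubei 430065
School of Information Management, Wuhan University, Wuhan, Hubei 430072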
big data; comparative study; academic platforms; social Q&A platforms; Word2vec
Journal of Modern Information (《现代情报》), 2024 (001)
Pages 37-47 (11 pages)
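National Natural Science Foundation of China project "Research on the Influence Patterns of Intelligent Recommendation Users' Information Behavior from the Perspectives of Social Learning and Psychological Empowerment" (Grant No. 72174148); National Natural Science Foundation of China project "Research on the Herd Information Behavior of Internet Users from the Perspectives of Social Connection and Cognitive Load" (Grant No. 71874124); Wuhan University graduate "Curriculum Ideology and Politics" demonstration course project "Information Analysis and Competitive Intelligence".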