Computer Engineering and Applications, 2016, Vol. 52, Issue (17): 73-78, 117, 7. DOI: 10.3778/j.issn.1002-8331.1412-0144
基于词向量的微博事件追踪方法
Method of micro-blog event tracking based on word vector
Abstract
Traditional methods for micro-blog event tracking perform poorly because micro-blog texts are short and new cyber-words emerge constantly. To address this problem, a micro-blog event tracking method based on word vectors is proposed. Word vectors make it possible to compute the semantic similarity between words, which in turn improves the accuracy of the semantic similarity computed between micro-blogs. First, a Skip-gram model is trained on a large dataset to obtain the word vectors. Then, models of the initial event and of each micro-blog are constructed by extracting keywords. Finally, the semantic similarity between each micro-blog and the initial event is computed from the word vectors, and event tracking is completed by comparing that similarity against a pre-defined threshold. The experimental results show that the proposed method makes full use of the semantic information contained in the word vectors and effectively improves tracking performance compared with traditional methods.
Key words
micro-blog/event tracking/short text/Skip-gram model/word vector/semantic information
Classification
Information Technology and Security Science
Cite this article
张佳明, 席耀一, 王波, 唐浩浩, 李天彩. Method of micro-blog event tracking based on word vector[J]. Computer Engineering and Applications, 2016, 52(17): 73-78, 117, 7.
Funding
National High Technology Research and Development Program of China (863 Program) (No.2011AA7032030D); All-Army Military Postgraduate Research Funding Project (No.2011JY002-158); National Social Science Foundation of China (No.14BXW028).
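The tracking step summarized in the abstract (keyword models, word-vector similarity, threshold decision) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the tiny hand-written vectors stand in for trained Skip-gram embeddings, the aggregation rule (average of each post keyword's best cosine match against the event keywords) is one plausible choice that the abstract does not specify, and the names `text_similarity`, `is_related` and the threshold of 0.8 are hypothetical.

```python
import math

# Toy 3-dimensional vectors standing in for trained Skip-gram embeddings.
# The values are invented for illustration only.
WORD_VECTORS = {
    "earthquake": [0.9, 0.1, 0.0],
    "quake": [0.85, 0.15, 0.05],
    "rescue": [0.7, 0.3, 0.1],
    "concert": [0.0, 0.2, 0.9],
    "singer": [0.1, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def text_similarity(post_keywords, event_keywords, vectors=WORD_VECTORS):
    """Average, over post keywords, of the best cosine match among event keywords.

    Assumed aggregation rule; words without a vector are skipped.
    """
    event_vecs = [vectors[w] for w in event_keywords if w in vectors]
    post_vecs = [vectors[w] for w in post_keywords if w in vectors]
    if not event_vecs or not post_vecs:
        return 0.0
    best = [max(cosine(p, e) for e in event_vecs) for p in post_vecs]
    return sum(best) / len(best)


def is_related(post_keywords, event_keywords, threshold=0.8):
    """Threshold decision: does the post belong to the tracked event?"""
    return text_similarity(post_keywords, event_keywords) >= threshold


if __name__ == "__main__":
    event = ["earthquake", "rescue"]
    print(is_related(["quake"], event))              # lexically different, semantically close
    print(is_related(["concert", "singer"], event))  # unrelated topic
```

In this sketch the post keyword "quake" shares no surface form with the event keywords, yet its vector lies close to that of "earthquake", so the post clears the threshold. That is the kind of match exact keyword overlap would miss on short texts.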