Computer Technology and Development, 2016, Vol. 26, Issue (9): 8-11, 4. DOI: 10.3969/j.issn.1673-629X.2016.09.002
Research and Implementation of Information Extraction Technology in Network Public Opinion
Abstract
Extracting public opinion information from the Internet is the most critical part of a public opinion analysis system, and it provides the data foundation for public opinion analysis and statistics. For this reason, a public opinion information extraction method based on topic clues is designed and implemented. In this method, each public opinion page is treated as one topic clue and divided into logical regions, and a breadth-first search over the DOM tree is used to design the extraction algorithm. By setting a minimum topic repetition threshold θ, customizing the extraction format, and removing duplicate and noisy information, public opinion information is extracted effectively. Experiments on public opinion data from multiple forums show that the scheme extracts well, with relatively high recall, precision, and F-measure, and that it can reliably extract forum posts, reviews, and other public opinion information.
Keywords
public opinion information/Web information extraction/topic clues/DOM tree
Classification
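Information Technology and Security Science
Cite this article
刘华春 (Liu Huachun), 王星捷 (Wang Xingjie). Research and Implementation of Information Extraction Technology in Network Public Opinion [J]. Computer Technology and Development, 2016, 26(9): 8-11, 4.
Funding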
Sichuan Province Natural Science Key Project (A22012003)
Key Project of the Science and Technology Bureau of Leshan, Sichuan Province (14GZD050)
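
Illustrative sketch (not the authors' code). The abstract names a breadth-first search over the DOM tree and a minimum repetition threshold θ, but this record does not give the algorithm itself. The Python below is only one plausible reading, under stated assumptions: a "logical region" is taken to be the first node, in breadth-first order, whose children repeat the same tag and class at least θ times; noise removal is reduced to dropping non-repeating siblings and empty records; duplicates are removed by exact text match. All names (`Node`, `TreeBuilder`, `extract_records`, `SAMPLE_PAGE`) are hypothetical.

```python
from collections import Counter, deque
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    """Minimal DOM-like node: tag, attributes, children, and direct text."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []
        self.text = []

    def signature(self):
        # Assumption: siblings with the same tag and class are "the same kind".
        return (self.tag, self.attrs.get("class") or "")

    def all_text(self):
        # Direct text first, then the text of each child subtree.
        parts = list(self.text) + [c.all_text() for c in self.children]
        return " ".join(p for p in parts if p)


class TreeBuilder(HTMLParser):
    """Builds a simple tree from HTML using only the standard library."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.text.append(data.strip())


def extract_records(html, theta=3):
    """Breadth-first search for the first node with >= theta repeated children."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    queue = deque([builder.root])
    while queue:
        node = queue.popleft()
        counts = Counter(child.signature() for child in node.children)
        if counts:
            sig, n = counts.most_common(1)[0]
            if n >= theta:
                seen, records = set(), []
                for child in node.children:
                    text = child.all_text()
                    # Skip non-repeating siblings, empty records, and duplicates.
                    if child.signature() == sig and text and text not in seen:
                        seen.add(text)
                        records.append(text)
                return records
        queue.extend(node.children)
    return []


SAMPLE_PAGE = """
<html><body>
<ul class="nav"><li>Home</li><li>Login</li></ul>
<div id="thread">
  <div class="post"><span>alice</span><p>First reply</p></div>
  <div class="post"><span>bob</span><p>Second reply</p></div>
  <div class="post"><span>bob</span><p>Second reply</p></div>
  <div class="ad">Buy now</div>
</div>
</body></html>
"""

if __name__ == "__main__":
    print(extract_records(SAMPLE_PAGE, theta=3))
```

In this toy page the navigation list repeats only twice, so with θ = 3 it is skipped and the three-post thread is selected; a lower θ would pick the navigation list first, which shows why the threshold matters in this reading.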