基于未登录词识别的微博评价短语抽取方法

Wang Longqing (汪龙庆), Zhang Chao (张超), Song Hui (宋晖), Liu Zhenyu (刘振宇)

Computer Applications and Software (计算机应用与软件), 2017, Vol. 34, Issue 6: 284-291, 8. DOI: 10.3969/j.issn.1000-386x.2017.06.051

A METHOD OF TARGET PHRASE EXTRACTION FROM MICROBLOG BASED ON UNKNOWN WORDS RECOGNITION

Wang Longqing¹, Zhang Chao¹, Song Hui¹, Liu Zhenyu²

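Author information

  • 1. School of Computer Science and Technology, Donghua University, Shanghai 201600
  • 2. Shanghai Key Laboratory of Computer Software Testing and Evaluating, Shanghai Development Center of Computer Software Technology, Shanghai 201112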

Abstract

Because the topics of microblog content are scattered, identifying the objects that microblog comments refer to is both an active and a difficult problem in microblog sentiment analysis. Research shows that failure to recognize unknown words is one of the main causes of the low recognition rate for target phrases. To address this, the paper proposes a method that learns an unknown-word recognition model from statistical features such as word frequency, cohesion, and left and right degrees of freedom. Experimental results show that automatically adding the unknown words found in microblog text to a CRFs-based target phrase recognition algorithm markedly improves the accuracy and recall of phrase recognition. Because the unknown-word learning algorithm directly reuses the annotated samples prepared for target phrase recognition, it is highly feasible in practice.
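The statistical features named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so the definitions below are assumptions based on common practice in new-word discovery: cohesion is taken as the smallest ratio p(w) / (p(a)·p(b)) over all two-way splits of a candidate, and left and right degrees of freedom are taken as the entropy of the characters adjacent to the candidate. The function name `candidate_features` and the parameter `max_len` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def candidate_features(corpus, max_len=4):
    """Compute frequency, cohesion, and left/right freedom for character n-grams.

    Assumed definitions (not taken from the paper):
      cohesion      = min over splits of p(gram) / (p(left part) * p(right part))
      left freedom  = entropy of the characters seen immediately before the gram
      right freedom = entropy of the characters seen immediately after the gram
    """
    freq = Counter()
    left = defaultdict(Counter)
    right = defaultdict(Counter)
    total_chars = 0

    for text in corpus:
        total_chars += len(text)
        for i in range(len(text)):
            for n in range(1, max_len + 1):
                if i + n > len(text):
                    break
                gram = text[i:i + n]
                freq[gram] += 1
                if i > 0:
                    left[gram][text[i - 1]] += 1
                if i + n < len(text):
                    right[gram][text[i + n]] += 1

    def entropy(counter):
        total = sum(counter.values())
        if total == 0:
            return 0.0
        return -sum(c / total * math.log(c / total) for c in counter.values())

    features = {}
    for gram, count in freq.items():
        if len(gram) < 2:
            continue  # single characters cannot be split, so cohesion is undefined
        p_gram = count / total_chars
        cohesion = min(
            p_gram / ((freq[gram[:k]] / total_chars) * (freq[gram[k:]] / total_chars))
            for k in range(1, len(gram))
        )
        features[gram] = {
            "freq": count,
            "cohesion": cohesion,
            "left_freedom": entropy(left[gram]),
            "right_freedom": entropy(right[gram]),
        }
    return features


if __name__ == "__main__":
    feats = candidate_features(["给力啊", "给力吧", "很给力"])
    print(feats["给力"])
```

A feature vector like this could be fed to a classifier trained on annotated samples, or filtered with thresholds, to decide which candidates are unknown words; which of these the paper does, and how the result is passed to the CRFs model, is not stated in the abstract.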

Key words

Microblog (微博) / Target phrase (评价短语) / Unknown words (未登录词) / Statistical feature (统计特征) / CRFs

Classification

Information Technology and Security Science

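Cite this article

Wang Longqing, Zhang Chao, Song Hui, Liu Zhenyu. A method of target phrase extraction from microblog based on unknown words recognition [J]. Computer Applications and Software, 2017, 34(6): 284-291, 8.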

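Journal: Computer Applications and Software (计算机应用与软件)

Indexing: OA, Peking University Core Journals (北大核心), CSTPCD

ISSN: 1000-386X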
