Named Entity Recognition in Chinese Social Media Based on a Unified Model (基于联合模型的中文社交媒体命名实体识别)

易黎 黄鹏 彭艳兵 程光

Computer & Digital Engineering (计算机与数字工程), 2017, Vol. 45, Issue (12): 2402-2406, 2433, 6. DOI: 10.3969/j.issn.1672-9722.2017.12.017

Named Entity Recognition in Chinese Social Media Based on the Unified Model

易黎 1, 黄鹏 1, 彭艳兵 2, 程光 1

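Author information

  • 1. Nanjing FiberHome Software Technology Co., Ltd., Nanjing 210019
  • 2. Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074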

Abstract

Named entity recognition (NER) in Chinese social media has grown in importance with the development of the Internet. Previous methods focus on in-domain supervised learning, which is limited by the scarcity of annotated data. However, there are ample corpora in formal domains and massive amounts of in-domain unannotated text that can be used to improve the task. A unified model that can learn from out-of-domain corpora and in-domain unannotated texts is proposed. The unified model contains two major functions: one for cross-domain learning and the other for semi-supervised learning. The cross-domain learning function learns out-of-domain information based on domain similarity. The semi-supervised learning function learns in-domain unannotated information by self-training. Both learning functions outperform existing methods for NER in Chinese social media. Experiments with the unified model yield better results and reduce the workload of manually tagging corpora.
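
The self-training idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not specify the tagger or the confidence criterion, so the character-vote tagger, the names `train`, `predict` and `self_train`, the `threshold` and `max_rounds` parameters, and the toy words below are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict

def train(examples):
    """Toy tagger (hypothetical): for each character, count the tags of words containing it."""
    model = defaultdict(Counter)
    for word, tag in examples:
        for ch in set(word):
            model[ch][tag] += 1
    return model

def predict(model, word):
    """Return (tag, confidence), or (None, 0.0) when no character of the word has been seen."""
    votes = Counter()
    for ch in set(word):
        votes.update(model.get(ch, {}))
    if not votes:
        return None, 0.0
    tag, n = votes.most_common(1)[0]
    return tag, n / sum(votes.values())

def self_train(labeled, unlabeled, threshold=0.9, max_rounds=5):
    """Generic self-training loop: retrain, pseudo-label confident items, repeat."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, remaining = [], []
        for word in pool:
            tag, conf = predict(model, word)
            if tag is not None and conf >= threshold:
                confident.append((word, tag))   # accept as a pseudo-label
            else:
                remaining.append(word)          # keep for a later round
        if not confident:
            break                               # nothing new was learned
        labeled.extend(confident)
        pool = remaining
    return train(labeled), labeled, pool

# Toy data (assumed): two annotated seeds and four unannotated words.
seed = [("北京市", "LOC"), ("张伟", "PER")]
raw = ["南京市", "南昌", "张强", "微博"]
model, pseudo_labeled, leftover = self_train(seed, raw)
```

In this toy run, a word that shares no character with the seeds can still be labeled in a later round once an earlier pseudo-label has introduced one of its characters, which is the propagation effect self-training relies on. Words that never reach the threshold stay in `leftover`.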

Keywords (Chinese)

named entity recognition/social media/cross-domain learning/domain similarity/semi-supervised learning/active learning

Key words

named entity recognition/social media/cross-domain learning/domain similarity/semi-supervised learning/self-training

Classification

Information Technology and Security Science

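Cite this article

易黎, 黄鹏, 彭艳兵, 程光. Named Entity Recognition in Chinese Social Media Based on a Unified Model [J]. Computer & Digital Engineering (计算机与数字工程), 2017, 45(12): 2402-2406, 2433, 6.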

Funding

National High Technology Research and Development Program of China (863 Program) (No. 2015AA015603)

Supported by the National Natural Science Foundation of China (No. 61602114).

Computer & Digital Engineering (计算机与数字工程)

OA, CSTPCD

ISSN 1672-9722
