Computer & Digital Engineering (计算机与数字工程), 2017, Vol. 45, Issue 12: 2402-2406, 2433, 6. DOI: 10.3969/j.issn.1672-9722.2017.12.017
Named Entity Recognition in Chinese Social Media Based on a Unified Model
Abstract
Named entity recognition (NER) in Chinese social media has grown in importance with the development of the internet. Previous methods focus on in-domain supervised learning, which is limited by the scarcity of annotated data. However, there are ample corpora in formal domains and massive amounts of in-domain unannotated text that can be used to improve the task. A unified model that can learn from out-of-domain corpora and in-domain unannotated text is proposed. The unified model contains two major functions: one for cross-domain learning and the other for semi-supervised learning. The cross-domain learning function learns out-of-domain information based on domain similarity. The semi-supervised learning function learns in-domain unannotated information by self-training. Both learning functions outperform existing methods for NER in Chinese social media. Experiments with the unified model achieve better results and reduce the workload of manually tagging corpora.
Key words
named entity recognition / social media / cross-domain learning / domain similarity / semi-supervised learning / self-training (the Chinese keyword list gives 主动学习, "active learning", as its final term)
Classification
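Information Technology and Security Science
Cite this article
Yi Li, Huang Peng, Peng Yanbing, Cheng Guang. Named Entity Recognition in Chinese Social Media Based on a Unified Model [J]. Computer & Digital Engineering (计算机与数字工程), 2017, 45(12): 2402-2406, 2433, 6.
Funding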
National High Technology Research and Development Program of China (863 Program) (No. 2015AA015603)
National Natural Science Foundation of China (No. 61602114)