Acta Electronica Sinica, Issue (8): 1481-1487, 7. DOI: 10.3969/j.issn.0372-2112.2015.08.003
中英命名实体识别及对齐中的中文分词优化
Optimization of Chinese Word Segmentation in Named Entity Recognition and Word Alignment
Abstract
Bilingual named entity recognition and alignment are important for many natural language processing tasks. Named entity translation can substantially improve the performance of systems such as statistical machine translation and cross-language information retrieval. The quality of Chinese word segmentation has a large impact on named entity (NE) recognition and bilingual NE extraction. Bilingual alignment information, in turn, provides cues for NE recognition and word segmentation. Accordingly, based on the characteristics of NE recognition, NE alignment, and word segmentation, this paper proposes an optimization algorithm for Chinese word segmentation. By correcting word segmentation errors and adjusting word segmentation granularity, the algorithm improves the extraction of Chinese-English NE translations and the performance of statistical machine translation. Experimental results on a Chinese-English news corpus show the effectiveness of the algorithm.
Keywords (Chinese)
分词 (word segmentation)/命名实体识别 (named entity recognition)/双语对齐 (bilingual alignment)/机器翻译 (machine translation)
Key words
word segmentation/named-entity recognition/alignment/machine translation
Classification
Information Technology and Security Science
Cite this article
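Yin Cunyan, Huang Shujian, Dai Xinyu, Chen Jiajun. Optimization of Chinese Word Segmentation in Named Entity Recognition and Word Alignment [J]. Acta Electronica Sinica, 2015, (8): 1481-1487, 7.
Funding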
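National Natural Science Foundation of China, Research Project for Excellent State Key Laboratories (No. 61223003); National Natural Science Foundation of China (Young Scientists Fund)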