Computer Technology and Development (计算机技术与发展), Issue (2): 175-180, 6. DOI: 10.3969/j.issn.1673-629X.2015.02.040
Study on Chinese Word Segmentation and Part-of-Speech Tagging
Abstract
Word segmentation and part-of-speech (POS) tagging are basic tasks of Chinese language processing (CLP) and are widely applied in semantic understanding, machine translation, information retrieval, and other fields. Drawing on current research and application results, this paper analyzes and classifies the basic methods of Chinese word segmentation (CWS) and POS tagging. First, for word segmentation, dictionary-based and statistics-based methods are introduced in detail, and some results from word segmentation competitions are listed. Second, for POS tagging, rule-based and statistics-based methods are described. Next, the main methods for building joint models of CWS and POS tagging are presented. Finally, the advantages and disadvantages of these methods are analyzed, and suggestions for further development are put forward.
Keywords
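Chinese word segmentation/active learning/POS tagging/CLP/joint model
Classification
Information technology and security science
Cite this article
LIANG Xitao (梁喜涛), GU Lei (顾磊). Study on Chinese Word Segmentation and Part-of-Speech Tagging [J]. Computer Technology and Development, 2015, (2): 175-180, 6.
Funding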
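National Natural Science Foundation of China (61302157)
Ministry of Education Humanities and Social Sciences Youth Fund (12YJC870008)
Jiangsu Provincial Department of Education Philosophy and Social Science Fund for Universities (2013SJB870004)
Jiangsu Social Science Research Cultural Excellence Project (12SWC-030)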