Journal of Qingdao University (Natural Science Edition), 2011, Vol. 24, Issue 3: 53-58, 6. DOI: 10.3969/j.issn.1006-1037.2011.08.012
一种基于LUCENE的中文分词算法研究
Research on a Chinese Word Segmentation Algorithm Based on LUCENE
Abstract
A new Chinese word segmentation algorithm is proposed to improve Lucene's Chinese analyzers, because ChineseAnalyzer and CJKAnalyzer cannot meet the requirements of full-text search applications. The algorithm is based on string-matching rules and combines forward and reverse maximum matching by character addition. The improved analyzer is compared with the two Lucene analyzers in simulation experiments. The results indicate that the improved analyzer is more effective than the other two, and that its recall and precision meet users' requirements.
Keywords
Full-text retrieval / Lucene / Chinese word segmentation
Classification
Information Technology and Security Science
Cite this article
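Dai Hong, Jiang Jing, Fan Cheng, Yu Xueli (戴洪, 蒋静, 樊程, 于雪丽). 一种基于LUCENE的中文分词算法研究 [Research on a Chinese word segmentation algorithm based on LUCENE] [J]. Journal of Qingdao University (Natural Science Edition), 2011, 24(3): 53-58, 6.
Funding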
National Support Program project (2006BA111B07)
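
The abstract says the proposed analyzer combines forward and reverse maximum matching. The paper's own algorithm is not reproduced on this page, so what follows is only a minimal sketch of generic dictionary-based bidirectional maximum matching, not the authors' implementation. The function names, the toy dictionary, and the tie-breaking rule (fewer tokens, then fewer single-character tokens, then the reverse result) are all assumptions made for illustration.

```python
def forward_max_match(text, vocab, max_len):
    """Scan left to right, taking the longest dictionary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def reverse_max_match(text, vocab, max_len):
    """Scan right to left, taking the longest dictionary word ending at each position."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


def count_single_chars(tokens):
    """Number of one-character tokens, used as a rough ambiguity signal."""
    return sum(len(t) == 1 for t in tokens)


def bidirectional_max_match(text, vocab):
    """Run both scans and pick one result with a simple (assumed) heuristic."""
    max_len = max(map(len, vocab), default=1)
    fwd = forward_max_match(text, vocab, max_len)
    rev = reverse_max_match(text, vocab, max_len)
    if len(fwd) != len(rev):
        return min(fwd, rev, key=len)  # fewer tokens wins
    # Same token count: prefer fewer single characters, else the reverse scan.
    return fwd if count_single_chars(fwd) < count_single_chars(rev) else rev


# Hypothetical toy dictionary, for illustration only.
TOY_VOCAB = {"研究", "研究生", "生命", "的", "起源"}
segments = bidirectional_max_match("研究生命的起源", TOY_VOCAB)
```

With this toy dictionary the forward scan greedily takes "研究生" and strands "命" as a single character, while the reverse scan yields "研究 / 生命 / 的 / 起源". The tie-break therefore selects the reverse result, which illustrates why combining both directions can resolve ambiguities that a single scan would miss.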