
Lucene.net中文分词算法分析

周拴龙

郑州大学学报(理学版), 2011, Vol. 43, Issue 3: 73-77, 5.


Analysis on Chinese Segmentation Algorithm of Lucene.net

周拴龙1

Author Information

  • 1. Department of Information Management, Zhengzhou University, Zhengzhou, Henan 450001, China

Abstract

Chinese word segmentation in Lucene.Net relies on the Analyzer class. An analysis of the five built-in analyzers of Lucene.Net, KeywordAnalyzer, StandardAnalyzer, StopAnalyzer, SimpleAnalyzer and WhitespaceAnalyzer, shows that they all segment Chinese text into single characters. For better Chinese information processing, an external segmentation kit can be added. Tests of three typical kits, ChineseAnalyzer, CJKAnalyzer and IKAnalyzer, show that IKAnalyzer, which uses dictionary-based segmentation with a bidirectional (forward and backward) maximum-matching search, performs best.
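The core idea behind the dictionary-based, bidirectional search the abstract attributes to IKAnalyzer can be sketched as follows. This is a minimal illustration of forward and backward maximum matching in Python, not IKAnalyzer's actual implementation; the toy dictionary and the fewest-tokens tie-breaking heuristic are assumptions made for the example.

```python
def max_match(text, dictionary, max_len, reverse=False):
    """Greedy maximum matching; scans right-to-left when reverse=True."""
    tokens = []
    s = text[::-1] if reverse else text  # work on the reversed string for backward matching
    i = 0
    while i < len(s):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for size in range(min(max_len, len(s) - i), 0, -1):
            piece = s[i:i + size]
            cand = piece[::-1] if reverse else piece  # un-reverse before the lookup
            if size == 1 or cand in dictionary:  # single characters always pass through
                tokens.append(cand)
                i += size
                break
    if reverse:
        tokens.reverse()  # restore original word order
    return tokens


def bidirectional_match(text, dictionary):
    """Run both directions and pick one result with a simple heuristic."""
    max_len = max((len(w) for w in dictionary), default=1)
    fwd = max_match(text, dictionary, max_len)
    bwd = max_match(text, dictionary, max_len, reverse=True)
    # Common heuristic: prefer the segmentation with fewer tokens,
    # falling back to the backward result on a tie.
    return fwd if len(fwd) < len(bwd) else bwd
```

For the classic example "研究生命起源" with a dictionary containing 研究, 研究生, 生命 and 起源, forward matching greedily takes 研究生 and produces ["研究生", "命", "起源"], while backward matching yields ["研究", "生命", "起源"]; the tie-break above prefers the backward result, which is the linguistically correct one.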

Key words

Lucene; Chinese word segmentation; Analyzer class

Classification

Information Technology and Security Science

Cite this article

周拴龙. Lucene.net中文分词算法分析[J]. 郑州大学学报(理学版), 2011, 43(3): 73-77, 5.

郑州大学学报(理学版)

OA | 北大核心 | CSTPCD

ISSN 1671-6841
