Computer Engineering (计算机工程), 2011, Vol. 37, Issue 15: 174-176, 3. DOI: 10.3969/j.issn.1000-3428.2011.15.055
Research of Khmer Word Segmentation Based on Improved Viterbi Algorithm (基于Viterbi改进算法的高棉语分词研究)
Abstract
The maximum matching algorithm achieves relatively low accuracy on Khmer word segmentation and has difficulty recognizing words that are not in its dictionary. To address this problem, an improved Viterbi algorithm is proposed. In this approach, an automaton is used for syllable segmentation, optimized selection and pruning are used to improve segmentation efficiency, and a statistical language model is adopted to smooth the data for unknown words. Experimental results indicate that the improved Viterbi algorithm achieves higher accuracy and efficiency.
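As a rough illustration of the kind of search the abstract describes, the following Python sketch runs a Viterbi-style dynamic program over a sequence of syllables. It is not the authors' algorithm. The unigram language model, the add-one smoothing used for unknown words, the `max_word_len` limit, and the pruning rule that discards multi-syllable candidates missing from the dictionary are all assumptions made for this example. Syllable segmentation is assumed to have been done beforehand, and Latin letters stand in for Khmer syllables.

```python
import math


def viterbi_segment(syllables, word_counts, max_word_len=4):
    """Segment a list of syllables into words using a unigram Viterbi search.

    word_counts maps dictionary words to corpus frequencies (hypothetical data).
    """
    total = sum(word_counts.values())
    vocab = len(word_counts)

    def log_prob(word):
        # Add-one smoothing with one extra slot for unseen words, so an
        # unknown word still receives a small nonzero probability.
        return math.log((word_counts.get(word, 0) + 1) / (total + vocab + 1))

    n = len(syllables)
    # best[i] = (log score, start index of the last word) for the best
    # segmentation of the first i syllables.
    best = [(-math.inf, -1)] * (n + 1)
    best[0] = (0.0, -1)

    for end in range(1, n + 1):
        candidates = []
        for start in range(max(0, end - max_word_len), end):
            word = "".join(syllables[start:end])
            # Pruning (assumed rule): drop multi-syllable candidates that are
            # not in the dictionary; single syllables are always allowed so
            # that unknown words can still be emitted.
            if end - start > 1 and word not in word_counts:
                continue
            candidates.append((best[start][0] + log_prob(word), start))
        best[end] = max(candidates)

    # Follow the back-pointers from the end to recover the word sequence.
    words = []
    end = n
    while end > 0:
        start = best[end][1]
        words.append("".join(syllables[start:end]))
        end = start
    return words[::-1]


counts = {"ab": 5, "c": 2, "cd": 4, "a": 1}
print(viterbi_segment(["a", "b", "c", "d"], counts))  # ['ab', 'cd']
```

Because every single syllable remains a valid candidate, the search always finds a complete segmentation, and a syllable absent from the dictionary (for example `"x"` in `["a", "b", "x"]`) is emitted as a one-syllable unknown word.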
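Key words
Viterbi algorithm/maximum matching algorithm/word segmentation/Khmer/pruning/statistical language model
Classification
Information technology and security science
Cite this article
Jiang Yanrong (蒋艳荣), Liu Xiwen (刘习文), Chen Gengtao (陈耿涛). Research of Khmer Word Segmentation Based on Improved Viterbi Algorithm [J]. Computer Engineering, 2011, 37(15): 174-176, 3.
Funding
Supported by the Natural Science Foundation of Guangdong Province (8151009001000041)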