Chinese Word Segmentation via Word-position Tagging Based on Maximum Entropy Model (基于最大熵模型的词位标注汉语分词)

Yu Jiangde, Wang Xijie, Fan Xiaozhong

Journal of Zhengzhou University (Natural Science Edition) [郑州大学学报(理学版)], 2011, Vol. 43, Issue (1): 70-74, 5.

Yu Jiangde¹, Wang Xijie¹, Fan Xiaozhong²

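Author information

  • 1. School of Computer and Information Engineering, Anyang Normal University, Anyang, Henan 455002
  • 2. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081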

Abstract

In recent years, word-position-based approaches have greatly improved the performance of Chinese word segmentation. These approaches treat Chinese word segmentation as a word-position tagging problem. Supported by powerful sequence tagging models, the word-position-based method quickly became a mainstream technique in this field. The choice of feature templates and of the word-position tag set is crucial to this method. This paper studies the technique using different word-position tag sets with a maximum entropy model. Closed evaluations were performed on corpora from the Second International Chinese Word Segmentation Bakeoff (Bakeoff-2005), and comparative experiments were carried out on different tag sets and feature templates. The experimental results show that the feature template set TMPT-6 combined with the six-tag word-position set performed much better than the other configurations.
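
To make the idea of word-position tagging concrete, the following minimal Python sketch converts a segmented sentence into one tag per character and back again. The abstract does not list the tag sets the paper actually uses, so the two shown here are assumptions based on conventions common in the segmentation literature: a four-tag set (B, M, E, S) and a six-tag set (B, B2, B3, M, E, S). The function names `words_to_tags` and `tags_to_words` are hypothetical. In a full system, a maximum entropy classifier would predict these tags from character features; that step is not shown.

```python
from typing import List

# Illustrative tag sets (assumed, not taken from the paper):
#   4-tag: B (begin), M (middle), E (end), S (single-character word)
#   6-tag: B, B2, B3, M, E, S, where B2/B3 mark the 2nd/3rd character
#          of words that are at least 3 or 4 characters long


def words_to_tags(words: List[str], six_tag: bool = False) -> List[str]:
    """Convert a segmented sentence into one word-position tag per character."""
    tags: List[str] = []
    for word in words:
        n = len(word)
        if n == 1:
            tags.append("S")
            continue
        inner: List[str] = []
        for i in range(1, n - 1):  # positions strictly between first and last
            if six_tag and i == 1:
                inner.append("B2")
            elif six_tag and i == 2:
                inner.append("B3")
            else:
                inner.append("M")
        tags.extend(["B"] + inner + ["E"])
    return tags


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Recover the segmentation: a word ends at every E or S tag."""
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # tolerate an ill-formed tag sequence that ends mid-word
        words.append(current)
    return words


words = ["基于", "最大熵", "模型"]
print(words_to_tags(words))                # ['B', 'E', 'B', 'M', 'E', 'B', 'E']
print(words_to_tags(words, six_tag=True))  # ['B', 'E', 'B', 'B2', 'E', 'B', 'E']
print(tags_to_words("".join(words), words_to_tags(words)))  # ['基于', '最大熵', '模型']
```

Because a word boundary follows every E or S tag in both schemes, the same decoder works for either tag set; the richer six-tag set only changes what the classifier has to predict inside longer words.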

Keywords

Chinese word segmentation; word-position tagging; maximum entropy model; word-position tag sets; feature templates

Classification

Information Technology and Security Science

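Cite this article

Yu Jiangde, Wang Xijie, Fan Xiaozhong. Chinese Word Segmentation via Word-position Tagging Based on Maximum Entropy Model [J]. Journal of Zhengzhou University (Natural Science Edition), 2011, 43(1): 70-74, 5.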

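Funding

Doctoral Program Project of Higher Education Institutions, No. 20050007023

Young Backbone Teachers Program of Henan Higher Education Institutions, No. 2009GGJS-108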

Journal of Zhengzhou University (Natural Science Edition)

Indexing: OA; Peking University Core Journal (北大核心); CSTPCD

ISSN 1671-6841
