Applying rough sets in word segmentation disambiguation based on maximum entropy model

Journal of Harbin Institute of Technology (English Edition), 2006, Vol. 13, Issue 1: 94-98, 5.

Author information

  • School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
  • University of Technology Sydney, Sydney

Abstract

To address complicated feature extraction and long-distance dependencies in Word Segmentation Disambiguation (WSD), this paper proposes applying rough sets to WSD within a Maximum Entropy model. First, rough set theory is applied to extract complicated features and long-distance features, even from a noisy or inconsistent corpus. Second, these features are added to the Maximum Entropy model, so that the feature weights are assigned according to the performance of the whole disambiguation model. Finally, a semantic lexicon is adopted to build class-based rough set features to overcome data sparseness. Experiments indicated that the method performed better than previous models, which ranked first in WSD in the 863 Evaluation in 2003. The system ranked first and second, respectively, in the MSR and PKU open tests of the Second International Chinese Word Segmentation Bakeoff, held in 2005.
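
The abstract does not spell out the extraction procedure, so the following is only a minimal sketch of the general rough-set idea behind the first step, not the authors' method. It assumes a toy decision table whose attributes (`prev` and `next` word classes) and decisions (`split`, `join`) are hypothetical. Contexts that always lead to the same decision fall in the lower approximation and could serve as reliable candidate features; contexts with conflicting decisions, as in an inconsistent corpus, fall in the boundary region.

```python
from collections import defaultdict

# Hypothetical toy decision table: ((prev word class, next word class), decision).
ROWS = [
    (("noun", "verb"), "split"),
    (("noun", "verb"), "split"),
    (("adj", "noun"), "join"),
    (("adj", "noun"), "split"),  # conflicts with the row above (inconsistent data)
    (("verb", "noun"), "join"),
]


def approximations(rows, target):
    """Return (lower, upper) approximations of the rows whose decision is `target`.

    Rows with identical condition attributes form one equivalence class.
    """
    classes = defaultdict(set)
    for idx, (cond, _) in enumerate(rows):
        classes[cond].add(idx)

    target_set = {i for i, (_, decision) in enumerate(rows) if decision == target}

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target_set:  # class lies entirely inside the target set
            lower |= members
        if members & target_set:   # class overlaps the target set
            upper |= members
    return lower, upper


def certain_contexts(rows, target):
    """Condition tuples that always imply `target` (candidate features)."""
    lower, _ = approximations(rows, target)
    return sorted({rows[i][0] for i in lower})


lower, upper = approximations(ROWS, "split")
boundary = upper - lower  # rows whose context gives conflicting decisions
```

In a Maximum Entropy setting, each context returned by `certain_contexts` could become a binary feature whose weight is then learned jointly with all other features, which is the role the abstract assigns to the second step.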

Key words

word segmentation/feature extraction/rough sets/maximum entropy

Classification

Information Technology and Security Science

Cite this article

Applying rough sets in word segmentation disambiguation based on maximum entropy model[J]. Journal of Harbin Institute of Technology (English Edition), 2006, 13(1): 94-98, 5.

Funding

Sponsored by the Key Program Projects of the National Natural Science Foundation of China (Grant No. 60435020) and the National 863 Program (Grant No. 2002AA117010-90).

Journal of Harbin Institute of Technology (English Edition)

ISSN 1005-9113