Journal of Harbin Institute of Technology (English Edition), 2006, Vol. 13, Issue (1): 94-98, 5.
Applying rough sets in word segmentation disambiguation based on maximum entropy model
Abstract
To address complicated feature extraction and the long-distance dependency problem in Word Segmentation Disambiguation (WSD), this paper proposes applying rough sets to WSD based on the Maximum Entropy model. First, rough set theory is applied to extract complicated features and long-distance features, even from a noisy or inconsistent corpus. Second, these features are added to the Maximum Entropy model, so that the feature weights are assigned according to the performance of the whole disambiguation model. Finally, a semantic lexicon is adopted to build class-based rough set features to overcome data sparseness. Experiments indicated that our method performed better than previous models, which ranked first in WSD in the 863 Evaluation in 2003. This system ranked first and second, respectively, in the MSR and PKU open tests of the Second International Chinese Word Segmentation Bakeoff held in 2005.
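As a rough illustration of the first step only, the sketch below shows one common way rough set theory yields usable features from an inconsistent corpus: samples are grouped into equivalence classes by their attribute values, and only classes whose members all agree on the decision (the lower approximation) are kept as certain rules. This is not the authors' implementation. The function name `lower_approximation_rules`, the attribute names `prev` and `next`, and the toy decision table are all invented for illustration.

```python
from collections import defaultdict

def lower_approximation_rules(samples, attrs):
    """Return {attribute-value tuple: label} for equivalence classes whose
    members all share one label (the rough-set lower approximation)."""
    classes = defaultdict(set)
    for features, label in samples:
        # Samples with identical values on `attrs` are indiscernible.
        key = tuple(features[a] for a in attrs)
        classes[key].add(label)
    # Keep consistent classes; inconsistent ones lie in the boundary region.
    return {key: next(iter(labels))
            for key, labels in classes.items()
            if len(labels) == 1}

# Hypothetical decision table for one ambiguous string with two candidate
# segmentations, "AB/C" and "A/BC". Context values are placeholders.
samples = [
    ({"prev": "x", "next": "y"}, "AB/C"),
    ({"prev": "x", "next": "y"}, "AB/C"),
    ({"prev": "x", "next": "z"}, "A/BC"),
    ({"prev": "w", "next": "z"}, "A/BC"),
    ({"prev": "w", "next": "y"}, "AB/C"),
    ({"prev": "w", "next": "y"}, "A/BC"),  # contradicts the row above
]

rules = lower_approximation_rules(samples, ("prev", "next"))
for condition, decision in sorted(rules.items()):
    print(condition, "->", decision)
```

The contradictory context `("w", "y")` produces no rule, which is how inconsistent samples are kept out of the extracted features in this sketch. In a pipeline like the one the abstract describes, each surviving rule could then serve as a binary feature whose weight is learned by the Maximum Entropy model rather than fixed by the rule extractor.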
Key words
word segmentation / feature extraction / rough sets / maximum entropy
Classification
Information Technology and Security Science
Cite this article
..Applying rough sets in word segmentation disambiguation based on maximum entropy model[J]. Journal of Harbin Institute of Technology (English Edition), 2006, 13(1): 94-98, 5.
Funding
Sponsored by the Key Program Projects of National Natural Science Foundation of China (Grant No. 60435020) and National 863 Program (Grant No. 2002AA117010-90).