Computer Applications and Software (计算机应用与软件), 2016, Vol. 33, Issue 3: 71-75, 82, 6. DOI: 10.3969/j.issn.1000-386x.2016.03.015
基于层叠条件随机场的哈语树库构建技术研究
RESEARCH ON THE TECHNOLOGY OF BUILDING KAZAKH TREEBANK BASED ON CASCADED CONDITIONAL RANDOM FIELD
Abstract
To improve the performance of statistical Kazakh syntactic parsing, this paper proposes a method for constructing a Kazakh treebank through human-computer interaction. In the automatic syntactic annotation stage, the method uses a cascaded conditional random field model. Between the low-level and high-level models it inserts an improved transformation-based error-driven learning algorithm, so that simple sentences are annotated and then corrected automatically. Finally, special annotation errors that remain are proofread manually, yielding a phrase-structure-based Kazakh treebank. Experimental results show that the method substantially reduces the human and material resources required and improves both parsing accuracy and overall processing efficiency. It also lays a foundation for subsequent syntax-based Kazakh machine translation and text mining.
Keywords
Kazakh treebank/Human-computer interaction/Cascaded conditional random fields/Error-driven learning algorithm
Classification
Information technology and security science
Cite this article
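于智娟, 古丽拉·阿东别克. Research on the technology of building Kazakh treebank based on cascaded conditional random field [J]. Computer Applications and Software, 2016, 33(3): 71-75, 82, 6.
Funding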
National Natural Science Foundation of China (61063025, 61363062).