Application Research of Computers (计算机应用研究), 2017, Vol. 34, Issue 6: 1631-1635, 5. DOI: 10.3969/j.issn.1001-3695.2017.06.007
Personal social relation extraction in Chinese based on feature selection of CHI, verb and noun
(Chinese title: 基于动词名词和CHI特征选择的中文人物社会关系抽取)
Abstract
Because labeled Chinese corpora of social relations are scarce and existing classifications of personal social relations are coarse, this paper uses a simple method to label eight main types of personal social relation. To avoid the curse of dimensionality and to remove noisy features, both of which limit the accuracy of relation extraction, the dimension of the feature vector must be reduced effectively. This paper therefore proposes a feature selection method that combines the chi-square statistic (CHI) with the selection of verbs and nouns, and uses TF-IDF to weight the feature items. After feature selection, the method was tested with an SVM classifier, and both F-score and accuracy improved. To make full use of the data set when evaluating the feature selection method, its validity was tested with k-fold cross-validation. Experimental results show that the resulting classification model has high discriminative power and generalization ability.
Keywords
social relation extraction / social relation labeling / feature selection / CHI / SVM classifier
Classification
Information Technology and Security Science
Cite this article
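Zeng Hui, Tang Jiali, Xiong Liyan, Huang Xiaohui. Personal social relation extraction in Chinese based on feature selection of CHI, verb and noun [J]. Application Research of Computers, 2017, 34(6): 1631-1635, 5.
Funding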
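National Natural Science Foundation of China (61363072, 61562027)
Jiangxi Provincial Department of Science and Technology, Scientific and Technological Achievements Transfer and Transformation Program (20161BBI90032)
Jiangxi Social Science "Twelfth Five-Year Plan" Project (15XW12)
Jiangxi Provincial Department of Education funded project (150494)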
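
Illustrative sketch
The CHI scoring step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. It uses the standard chi-square statistic for text feature selection, computed from a 2x2 term/class contingency table, and assumes the documents have already been segmented and filtered down to verbs and nouns. The function names, the toy tokens, and the class labels "spouse" and "teacher" are hypothetical; the abstract does not name the eight relation types.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Chi-square score of one term for one class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that lack the term
    d: documents outside the class that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # term or class is degenerate; no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def chi_scores(docs, labels, target):
    """Score every term in `docs` against the class `target`.

    docs: list of token lists (assumed to hold only verbs and nouns)
    labels: list of class labels, one per document
    """
    n_docs = len(docs)
    n_target = sum(1 for y in labels if y == target)
    df_all = Counter()     # document frequency over all documents
    df_target = Counter()  # document frequency inside the target class
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df_all[term] += 1
            if y == target:
                df_target[term] += 1
    scores = {}
    for term, df in df_all.items():
        a = df_target[term]
        b = df - a
        c = n_target - a
        d = n_docs - n_target - b
        scores[term] = chi_square(a, b, c, d)
    return scores


def top_k_terms(scores, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus, for illustration only.
docs = [
    ["marry", "wife"],
    ["marry", "husband"],
    ["teach", "student"],
    ["teach", "class"],
]
labels = ["spouse", "spouse", "teacher", "teacher"]
scores = chi_scores(docs, labels, "spouse")
selected = top_k_terms(scores, 2)
```

Note that the statistic is symmetric: a term that never occurs in the target class ("teach" here) scores as high as one that occurs only in it ("marry"). In a fuller pipeline the selected terms would then be TF-IDF weighted and passed to an SVM evaluated with k-fold cross-validation, as the abstract outlines.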