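Computer Engineering and Applications, Issue (8): 32-36, 5. DOI: 10.3778/j.issn.1002-8331.1208-0490
A Study of Syntactic Complex Networks as a Knowledge Source for Style Classification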
Abstract
This paper builds six dependency syntactic networks from six treebanks of different styles and compares the global properties of these networks: the number of edges, the number of nodes, the average degree, the clustering coefficient, the average path length, the centralization, the diameter, the power-law exponent, and the coefficient of determination. The paper then uses these properties as variables to cluster the networks with the "shortest distance" method under Euclidean distance. The results show that a few main network parameters, namely the number of nodes, the clustering coefficient, the average path length, the centralization, and the power-law exponent, can be used to cluster texts. Compared with traditional text clustering, the results are easier to interpret from a linguistic point of view.
Keywords
style/text clustering/network characteristics/language networks
Classification
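Information Technology and Security Science
Cite this article
CHEN Xinying, LIU Haitao. A study of syntactic complex networks as a knowledge source for style classification [J]. Computer Engineering and Applications, 2013, (8): 32-36, 5.
Funding
Major Program of the National Social Science Fund of China (No. 11&ZD188)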
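The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes that the "shortest distance" method means single-linkage agglomerative clustering, and that each network is represented by the five parameters the abstract singles out. The network labels, the feature values, and the z-score standardization are all hypothetical and chosen for illustration only; they are not taken from the paper.

```python
import math

# Hypothetical toy feature vectors, NOT data from the paper. Columns:
# nodes, clustering coefficient, average path length,
# centralization, power-law exponent.
labels = ["news_A", "news_B", "spoken_A", "spoken_B"]
features = [
    [4000, 0.12, 3.4, 0.20, 1.30],
    [4200, 0.13, 3.3, 0.21, 1.28],
    [2000, 0.25, 2.9, 0.35, 1.10],
    [2100, 0.24, 3.0, 0.33, 1.12],
]

def zscore_columns(rows):
    """Standardize each column (an assumption; the abstract does not say so)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A zero standard deviation is replaced by 1.0 to avoid division by zero.
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points, names):
    """Repeatedly merge the two clusters whose closest members are nearest.

    Returns the merge history as (left_names, right_names, distance) tuples.
    """
    clusters = [[i] for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((sorted(names[k] for k in clusters[i]),
                        sorted(names[k] for k in clusters[j]), d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return history

merges = single_linkage(zscore_columns(features), labels)
for left, right, dist in merges:
    print(left, "+", right, "at distance", round(dist, 3))
```

With these toy values the two "news" networks and the two "spoken" networks merge within their own group before the groups are joined. Standardizing first keeps the node count, which is on a much larger scale, from dominating the Euclidean distance.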