Journal of Beijing Normal University (Natural Science) (北京师范大学学报(自然科学版)), 2009, Vol. 45, Issue 3: 247-249, 3.
一种基于SVM的网页层次分类算法
A HIERARCHY CATEGORIZATION ALGORITHM BASED ON SVM
Abstract
Web page classification plays an important role in information retrieval and social networks. Addressing the diverse types and large scale of web pages, this paper presents a hierarchical categorization algorithm based on statistical classification. Once a category hierarchy has been built for automatic classification, a web page to be categorized is first matched to a category starting from the root node and is then classified recursively downward until the lowest matching subclass is found. The model uses support vector machines as the classifier and applies a category-balancing approach to address data sparseness. After training on a large-scale corpus of web pages, it performs better than general solutions.
Keywords
hierarchy categorization / support vector machines / web page classification
Classification
Information Technology and Security Science
Cite this article
马乐, 翁智生, 罗军. 一种基于SVM的网页层次分类算法 [J]. 北京师范大学学报(自然科学版), 2009, 45(3): 247-249, 3.
Funding
Supported by the Natural Science Foundation of Shandong Province (Y2007G19)
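Illustrative sketch
The top-down procedure described in the abstract (start at the root, pick a category, then recurse until the lowest subclass is reached) might be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy taxonomy, the training texts, the bag-of-words features, and the names `featurize`, `LinearSVM`, `train_hierarchy`, and `classify` are all assumptions made for the example. The node classifier is a small one-vs-rest linear SVM trained by subgradient descent on the hinge loss, and the category-balancing step mentioned in the abstract is not attempted because the abstract does not describe it.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words term counts (an assumed stand-in for the paper's features)."""
    return Counter(text.lower().split())

class LinearSVM:
    """Binary linear SVM: subgradient descent on the L2-regularized hinge loss."""
    def __init__(self, lam=0.01, eta=0.1, epochs=50):
        self.lam, self.eta, self.epochs = lam, eta, epochs
        self.w, self.b = {}, 0.0

    def score(self, x):
        # Terms never seen in training contribute zero.
        return sum(self.w.get(t, 0.0) * v for t, v in x.items()) + self.b

    def fit(self, samples, labels):
        """samples: list of term-count dicts; labels: +1 or -1 per sample."""
        for _ in range(self.epochs):
            for x, y in zip(samples, labels):
                violated = y * self.score(x) < 1      # inside the margin?
                for t in self.w:                      # L2 shrinkage of the weights
                    self.w[t] *= 1 - self.eta * self.lam
                if violated:                          # hinge-loss subgradient step
                    for t, v in x.items():
                        self.w[t] = self.w.get(t, 0.0) + self.eta * y * v
                    self.b += self.eta * y
        return self

def train_hierarchy(taxonomy, docs, root="root"):
    """Train one-vs-rest SVMs at every internal node of the category tree."""
    classifiers = {}
    for node, children in taxonomy.items():
        # Keep documents routed through this node, labeled by the next child.
        local = []
        for text, path in docs:
            full = [root] + path
            if node in full[:-1]:
                local.append((featurize(text), full[full.index(node) + 1]))
        xs = [x for x, _ in local]
        classifiers[node] = {
            child: LinearSVM().fit(xs, [1 if c == child else -1 for _, c in local])
            for child in children
        }
    return classifiers

def classify(classifiers, text, root="root"):
    """Descend from the root, taking the best-scoring child at each level."""
    x, node, path = featurize(text), root, []
    while node in classifiers:                        # stop at a leaf category
        node = max(classifiers[node], key=lambda c: classifiers[node][c].score(x))
        path.append(node)
    return path

# Hypothetical two-level taxonomy and training pages, for illustration only.
TAXONOMY = {
    "root": ["sports", "tech"],
    "sports": ["football", "tennis"],
    "tech": ["hardware", "software"],
}
DOCS = [
    ("football goal striker league", ["sports", "football"]),
    ("tennis racket serve court", ["sports", "tennis"]),
    ("cpu chip memory motherboard", ["tech", "hardware"]),
    ("software compiler code bug", ["tech", "software"]),
]

model = train_hierarchy(TAXONOMY, DOCS)
print(classify(model, "the striker scored a goal"))
```

Each internal node only has to separate its own children, so every classifier faces a smaller problem than a single flat classifier over all leaf categories would. In practice the hand-rolled `LinearSVM` could be replaced by a library implementation such as scikit-learn's `LinearSVC`.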