结合邻居辅助策略的两阶段层次文本分类模型

Gu Ping (古平), Wang Chunyuan (王春元)

Computer Engineering and Applications (计算机工程与应用), 2017, Vol. 53, Issue 9: 97-102 (6 pages). DOI: 10.3778/j.issn.1002-8331.1601-0405

Two-stage hierarchical text classification model based on neighbor-assistant strategy

Gu Ping¹, Wang Chunyuan¹

Author information

  • 1. College of Computer Science, Chongqing University, Chongqing 400044, China

Abstract

The traditional Two-stage Hierarchical Text Classification model (THTC model) is an effective method for large-scale hierarchical text classification, but it still suffers from low classification accuracy. To alleviate this problem, a new Two-stage Hierarchical Text Classification model based on a Neighbor-Assistant strategy (THTC-NA model) is proposed. The THTC-NA model consists of two stages: search and classification. In the search stage, a flat strategy selects the categories related to a given document from all leaf categories; the categories are ranked and the most related ones are taken as candidates. A large-scale hierarchy is thus pruned into a much smaller, focused one. In the classification stage, the classification result for each candidate is computed by combining the results of the candidate's ancestor categories and sibling categories. Finally, the results of the search stage and the classification stage are fused to determine the target category for the document. Experiments on the Newsgroups-18828 data set show that the THTC-NA model achieves higher classification accuracy than the THTC model.
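
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the toy hierarchy `PARENT`, the scores in `FLAT` and `LOCAL`, the combination rule in `classify_stage` (own score plus mean ancestor score minus the strongest sibling score), and the weights `w_anc`, `w_sib`, and `alpha` are all assumptions chosen for illustration.

```python
# Toy sketch of a two-stage (search, then classify) hierarchical classifier.
# Every scoring rule, weight, and number here is an illustrative assumption.

# Hypothetical hierarchy: child -> parent (None marks a top-level category).
PARENT = {
    "comp": None, "rec": None,
    "comp.graphics": "comp", "comp.windows": "comp",
    "rec.autos": "rec", "rec.hockey": "rec",
}

def ancestors(cat):
    """Return the ancestors of cat, nearest first."""
    chain = []
    parent = PARENT[cat]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain

def siblings(cat):
    """Return the other categories that share cat's parent."""
    return [c for c, p in PARENT.items() if p == PARENT[cat] and c != cat]

def search_stage(flat_scores, k):
    """Rank all leaf categories by a flat score and keep the top k as candidates."""
    ranked = sorted(flat_scores, key=flat_scores.get, reverse=True)
    return ranked[:k]

def classify_stage(cat, local_scores, w_anc=0.3, w_sib=0.2):
    """Assumed rule: own score, plus ancestor support, minus sibling competition."""
    anc = ancestors(cat)
    sib = siblings(cat)
    anc_score = sum(local_scores[a] for a in anc) / len(anc) if anc else 0.0
    sib_score = max(local_scores[s] for s in sib) if sib else 0.0
    return local_scores[cat] + w_anc * anc_score - w_sib * sib_score

def predict(flat_scores, local_scores, k=2, alpha=0.5):
    """Fuse the search-stage and classification-stage scores over the candidates."""
    candidates = search_stage(flat_scores, k)
    fused = {
        c: alpha * flat_scores[c] + (1 - alpha) * classify_stage(c, local_scores)
        for c in candidates
    }
    return max(fused, key=fused.get)

# Made-up scores for one document: flat scores over leaves, local scores per node.
FLAT = {"comp.graphics": 0.60, "comp.windows": 0.55, "rec.autos": 0.58, "rec.hockey": 0.10}
LOCAL = {"comp": 0.4, "rec": 0.9, "comp.graphics": 0.6,
         "comp.windows": 0.5, "rec.autos": 0.6, "rec.hockey": 0.1}

print(search_stage(FLAT, 2))   # ['comp.graphics', 'rec.autos']
print(predict(FLAT, LOCAL))    # rec.autos
```

In this made-up example the flat ranking alone would pick `comp.graphics`, while the fused score picks `rec.autos` because its ancestor `rec` scores strongly and its sibling is weak. This shows how neighbor information can change the decision; it says nothing about the paper's actual weighting.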

Key words

two-stage / hierarchical text classification / neighbor-assistant strategy / class hierarchy

Classification

Information Technology and Security Science

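Cite this article

Gu Ping, Wang Chunyuan. Two-stage hierarchical text classification model based on neighbor-assistant strategy [J]. Computer Engineering and Applications, 2017, 53(9): 97-102.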

Funding

Natural Science Foundation Project of Chongqing (No. cstc2012jjA40002)

Fundamental Research Funds for the Central Universities (No. 106112013CD-JZR180014)

Computer Engineering and Applications (计算机工程与应用)

Indexing: OA, Peking University Core Journals (北大核心), CSCD, CSTPCD

ISSN 1002-8331
