A Feature Transfer Method for Text Classification (一种面向文本分类的特征迁移方法)

Zhao Shichen (赵世琛), Wang Wenjian (王文剑)

Journal of Data Acquisition and Processing (数据采集与处理), 2017, Vol. 32, Issue 3: 516-522 (7 pages). DOI: 10.16337/j.1004-9037.2017.03.010

Feature Transfer Learning for Text Categorization

Author information

  • 1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China (affiliation of both authors)

Abstract

Traditional text classification methods assume that feature words in the training set and the test set follow the same probability distribution. In practical applications, however, the two distributions deviate, which can degrade the final classification results. To address this problem, a feature transfer learning algorithm for text categorization is proposed. By calculating a transfer volume and using it to amend the vector space model of the training set, the algorithm reconciles the probability distributions of feature words in the training set and the test set. Experiments on Chinese spam filtering and web page classification data sets demonstrate that the proposed method can eliminate the dissimilarity between the feature-word distributions and markedly improve the classification metrics on the test set.
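
The abstract does not give the formula for the transfer volume, so the sketch below does not reproduce the proposed algorithm. It only illustrates the mismatch the abstract describes, under the assumption that a feature word's distribution can be approximated by its document frequency. The function names `doc_frequency` and `distribution_gap` and the toy documents are hypothetical.

```python
from collections import Counter


def doc_frequency(docs):
    """Fraction of documents in which each term occurs at least once."""
    counts = Counter()
    for doc in docs:
        # Count each term once per document (whitespace tokenization is an assumption).
        counts.update(set(doc.split()))
    n = len(docs)
    return {term: c / n for term, c in counts.items()}


def distribution_gap(train_docs, test_docs):
    """Per-term difference (test minus train) in document frequency.

    This is an illustrative stand-in, not the paper's transfer volume.
    """
    train_df = doc_frequency(train_docs)
    test_df = doc_frequency(test_docs)
    terms = set(train_df) | set(test_df)
    return {t: test_df.get(t, 0.0) - train_df.get(t, 0.0) for t in terms}


# Toy corpora (hypothetical data, for illustration only).
train = ["free prize now", "meeting agenda today", "free offer today"]
test = ["free prize", "prize draw now", "prize offer"]

gap = distribution_gap(train, test)
# Terms with the largest absolute gap are the ones whose distribution shifted most.
shifted = sorted(gap, key=lambda t: abs(gap[t]), reverse=True)
```

A term with a gap near zero is distributed similarly in both sets, while a large positive or negative gap marks a feature word whose weight in the training-set vector space model would be a candidate for adjustment.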

Keywords

text categorization / transfer learning / transfer volume / vector space model

Classification

Information Technology and Security Science

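Cite this article

Zhao Shichen, Wang Wenjian. Feature Transfer Learning for Text Categorization [J]. Journal of Data Acquisition and Processing, 2017, 32(3): 516-522 (7 pages).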

Funding

Supported by the National Natural Science Foundation of China (60975035, 61273291).

Supported by the Shanxi Province Research Fund for Returned Overseas Scholars (2012008).

Journal of Data Acquisition and Processing

Indexed in: OA / Peking University Core Journals (北大核心) / CSCD / CSTPCD

ISSN 1004-9037
