A fuzzy method to learn text classifier from labeled and unlabeled examples

刘宏 黄上腾

Journal of Harbin Institute of Technology (English Edition), 2004, Vol. 11, Issue 1: 98-102, 5.

Author information

  • 1. Dept. of Computer Science, Shanghai Jiaotong University, Shanghai 200030, China

Abstract

In text classification, labeling documents is tedious and costly because it consumes a great deal of expert time. Unlabeled documents, by contrast, are usually easy to obtain in large numbers with tools such as digital libraries, crawler programs, and search engines. A novel fuzzy method is proposed for learning a text classifier from labeled and unlabeled examples. First, a Seeded Fuzzy c-means Clustering algorithm is proposed to learn fuzzy clusters from a set of labeled and unlabeled examples. Second, based on the resulting fuzzy clusters, examples with high confidence are selected to construct a training data set. Finally, the constructed training data set is used to train a Fuzzy Support Vector Machine (FSVM), which yields the text classifier. Empirical results on two benchmark datasets indicate that, by incorporating unlabeled examples into the learning process, the method performs significantly better than an FSVM trained on only a small number of labeled examples. The proposed method also performs at least as well as a related method, EM with Naive Bayes. One advantage of the proposed method is that it does not rely on any parametric assumptions about the data, unlike the generative methods widely used in semi-supervised learning.
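The abstract does not spell out the algorithm, so the following is only a rough sketch of how the first two steps (seeded fuzzy clustering, then high-confidence selection) might look. It uses the standard fuzzy c-means update rules and makes several assumptions that are not from the paper: cluster centers are initialised from the labeled examples, labeled examples keep a fixed membership of 1 in their own class, every class has at least one labeled example, and the function names, the fuzzifier `m`, and the confidence `threshold` are hypothetical.

```python
import math


def seeded_fuzzy_cmeans(X, seeds, n_classes, m=2.0, n_iter=50, eps=1e-9):
    """Fuzzy c-means whose centers are seeded by labeled examples.

    X      : list of feature vectors (lists of floats)
    seeds  : dict mapping example index -> class label (the labeled examples)
    Returns (U, centers), where U[i][c] is the membership of example i in class c.
    Assumption (not from the paper): labeled memberships are clamped to 0/1.
    """
    dim = len(X[0])
    # Initialise each center as the mean of that class's labeled examples.
    centers = []
    for c in range(n_classes):
        members = [X[i] for i, y in seeds.items() if y == c]
        centers.append([sum(col) / len(members) for col in zip(*members)])

    U = []
    for _ in range(n_iter):
        # Membership update: standard FCM rule for unlabeled examples.
        U = []
        for i, x in enumerate(X):
            if i in seeds:
                U.append([1.0 if c == seeds[i] else 0.0 for c in range(n_classes)])
                continue
            inv = [(math.dist(x, ctr) + eps) ** (-2.0 / (m - 1.0)) for ctr in centers]
            total = sum(inv)
            U.append([v / total for v in inv])
        # Center update: mean of all examples weighted by membership ** m.
        for c in range(n_classes):
            w = [u[c] ** m for u in U]
            total = sum(w)
            centers[c] = [
                sum(wi * x[k] for wi, x in zip(w, X)) / total for k in range(dim)
            ]
    return U, centers


def select_confident(U, seeds, threshold=0.9):
    """Keep labeled examples plus unlabeled ones whose top membership >= threshold.

    Returns a list of (index, label, membership) tuples.
    """
    selected = []
    for i, u in enumerate(U):
        label = max(range(len(u)), key=u.__getitem__)
        if i in seeds or u[label] >= threshold:
            selected.append((i, label, u[label]))
    return selected
```

The membership returned with each selected example could then serve as a per-example weight when training the final classifier. One possible stand-in for an FSVM, again an assumption rather than the authors' setup, is to pass these weights as `sample_weight` to scikit-learn's `SVC.fit`.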

Key words

text categorization/fuzzy/clustering

Classification

Information Technology and Security Science

Cite this article

刘宏, 黄上腾. A fuzzy method to learn text classifier from labeled and unlabeled examples[J]. Journal of Harbin Institute of Technology (English Edition), 2004, 11(1): 98-102, 5.

Journal of Harbin Institute of Technology (English Edition)

ISSN 1005-9113
