A Grammatical Category-combined Short-text Similarity Algorithm and Its Application in Text Categorization

Huang Xianying (黄贤英), Li Qindong (李沁东), Liu Yingtao (刘英涛)

Telecommunication Engineering (电讯技术), 2017, Vol. 57, Issue 1: 78-82. DOI: 10.3969/j.issn.1001-893x.2017.01.013

Author information

  • 1. College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China

Abstract

To address the low categorization accuracy of HowNet-based short-text similarity calculation when applied to short texts, a grammatical category-combined short-text similarity algorithm (GCSSA) is proposed. Building on the HowNet-based semantic similarity calculation for short texts and combining it with category feature words, the method adds a grammatical category (part-of-speech) analysis of keywords. Using the category feature words and the grammatical category of each keyword, it assigns different weights to different keywords, so as to differentiate how much each term contributes to the similarity calculation between short texts. Experiments show that, compared with the HowNet-based short-text categorization algorithm, the proposed method improves the macro-average and micro-average accuracy by 4% in short-text categorization, effectively improving short-text categorization accuracy.
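The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weight values in `POS_WEIGHTS`, the `FEATURE_BOOST` factor, the best-match aggregation, and the `exact_match_sim` stand-in for a HowNet-based word similarity are all assumptions made for illustration, since the abstract gives no formulas or parameter values.

```python
from typing import Callable, List, Set, Tuple

Token = Tuple[str, str]  # (word, part-of-speech tag)

# Hypothetical part-of-speech weights; the abstract does not state real values.
POS_WEIGHTS = {"n": 1.0, "v": 0.8, "a": 0.5}
DEFAULT_POS_WEIGHT = 0.3
# Hypothetical multiplier for words that are category feature words.
FEATURE_BOOST = 1.5


def token_weight(token: Token, feature_words: Set[str]) -> float:
    """Weight a keyword by its part of speech, boosted if it is a feature word."""
    word, pos = token
    weight = POS_WEIGHTS.get(pos, DEFAULT_POS_WEIGHT)
    if word in feature_words:
        weight *= FEATURE_BOOST
    return weight


def directed_similarity(src: List[Token], dst: List[Token],
                        word_sim: Callable[[str, str], float],
                        feature_words: Set[str]) -> float:
    """Weighted average of each src word's best match among the dst words."""
    if not src or not dst:
        return 0.0
    weighted_sum = 0.0
    total_weight = 0.0
    for token in src:
        weight = token_weight(token, feature_words)
        best = max(word_sim(token[0], other[0]) for other in dst)
        weighted_sum += weight * best
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0


def text_similarity(a: List[Token], b: List[Token],
                    word_sim: Callable[[str, str], float],
                    feature_words: Set[str]) -> float:
    """Symmetric similarity: mean of the two directed scores."""
    return 0.5 * (directed_similarity(a, b, word_sim, feature_words)
                  + directed_similarity(b, a, word_sim, feature_words))


def exact_match_sim(w1: str, w2: str) -> float:
    """Placeholder word similarity; a HowNet-based measure would go here."""
    return 1.0 if w1 == w2 else 0.0


if __name__ == "__main__":
    text_a = [("phone", "n"), ("price", "n"), ("drops", "v")]
    text_b = [("phone", "n"), ("price", "n"), ("rises", "v")]
    print(text_similarity(text_a, text_b, exact_match_sim, {"phone"}))
```

In this sketch a mismatched verb lowers the score less than a mismatched noun or feature word would, which is the kind of differentiated contribution the abstract describes. A short text could then be assigned to the category of its most similar labeled texts.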

Key words

short-text categorization / short-text similarity / grammatical category / HowNet semantics / categorization accuracy

Classification

Information Technology and Security Science

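Cite this article

Huang Xianying, Li Qindong, Liu Yingtao. A Grammatical Category-combined Short-text Similarity Algorithm and Its Application in Text Categorization [J]. Telecommunication Engineering, 2017, 57(1): 78-82.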

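Funding

National Natural Science Foundation of China (11547148)

Science and Technology Program of the Chongqing Municipal Education Commission (16SKGH133)

Chongqing Social Science Planning Doctoral Project (2015BS059)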

Telecommunication Engineering (电讯技术)

Open access; indexed in the PKU Core Journals list (北大核心) and CSTPCD

ISSN 1001-893X
