Computer & Digital Engineering (计算机与数字工程), 2025, Vol. 53, Issue 1: 164-169, 6. DOI: 10.3969/j.issn.1672-9722.2025.01.031
Research on Improvement of Short Text Classification Algorithm Based on MRMR and SVM (基于MRMR和SVM的短文本分类算法改进研究)
Zhang Qichao (章启超)¹, Zhou Lianying (周莲英)¹, Ding Lachun (丁腊春)²
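Author information
- 1. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013
- 2. The Fourth People's Hospital of Zhenjiang, Jiangsu Province, Zhenjiang 212001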
Abstract
The quality of the feature set and the performance of the classifier are two important factors affecting short text classification. The maximum relevance and minimum redundancy (MRMR) algorithm is a commonly used feature dimensionality reduction algorithm. This paper improves the algorithm with an adjusting factor based on word distribution frequency. When the feature mutual information value is calculated, the adjusting factor reduces the weight of low-frequency feature words, which addresses the problem of high dependence between low-frequency words and class labels. Then, taking the support vector machine (SVM) as the base classifier, a firefly algorithm with a variable step size factor is used to optimize its parameters. The adaptive variable step size factor mitigates oscillation and related phenomena in the firefly algorithm. Finally, several SVM base classifiers with different weights are iteratively trained within the Adaboost framework and integrated into a strong classifier with better performance. The method is verified on a short text data set obtained by a web crawler. With precision (P), recall (R), and F1 value as the evaluation criteria, the optimized algorithm improves precision by 8%, recall by 10%, and F1 value by 9% compared with the original algorithm. The experimental results therefore show that the optimized algorithm is more effective.
Keywords
short text classification / feature dimensionality reduction / MRMR algorithm / support vector machine / Adaboost
Classification
Information Technology and Security Science
Cite this article
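Zhang Qichao, Zhou Lianying, Ding Lachun. Research on Improvement of Short Text Classification Algorithm Based on MRMR and SVM [J]. Computer & Digital Engineering, 2025, 53(1): 164-169, 6.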