Journal of Chongqing University, 2026, Vol. 49, Issue (2): 105-115, 11. DOI: 10.11835/j.issn.1000-582X.2025.216
Classification and type-feature analysis of safety hazard texts based on SVM and a normalized entropy model
Classifications and characterization of safety hazard texts
Abstract
To improve the efficiency of organizing and retrieving safety hazard information and to support more complex information processing tasks, effective technical methods for automatic text classification and type analysis are required. Support vector machines (SVMs) can automatically classify unstructured text. However, their underlying principle focuses on identifying optimal classification boundaries within the training set and does not facilitate the extraction of representative features for each text category. To address this limitation, a normalized entropy model is proposed to search for typical category features, thereby improving on the traditional term frequency-inverse document frequency (TF-IDF) based feature recognition method. Using 2,534 law enforcement inspection records from a government emergency management bureau as a case study, SVM was used for automatic text classification and achieved an accuracy of up to 97%. Meanwhile, the normalized entropy model was used to extract representative features for each category, providing decision support for formulating targeted rectification strategies in hazard investigation. Experimental results show that the combined use of SVM and the normalized entropy model effectively addresses both text classification and category feature recognition tasks.
Keywords
text mining/data mining/hazard investigation/support vector machine/entropy
Classification
Resources and environment
Cite this article
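乔剑锋, 刘萱, 艾莉莎, 张丽玮, 王汀. Classification and type-feature analysis of safety hazard texts based on SVM and a normalized entropy model[J]. Journal of Chongqing University, 2026, 49(2): 105-115, 11.
Funding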
Supported by the Special Fund Project of the Society of China University Journals (CUJS2024-GJ-A01).
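Illustrative sketch
The abstract does not define the normalized entropy model, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes that a term is "typical" of a category when it appears in many documents of that category and its distribution across categories has low Shannon entropy, with the entropy divided by log K (K categories) so that it lies in [0, 1]. The function names, the scoring rule, and the toy records are all hypothetical, and the SVM classification step is omitted.

```python
import math
from collections import Counter, defaultdict


def normalized_entropy(weights):
    """Shannon entropy of a non-negative weight vector, divided by log(K).

    Returns a value in [0, 1]: 0 when all weight sits in one slot,
    1 when the weight is spread evenly over all K slots.
    """
    total = sum(weights)
    k = len(weights)
    if total == 0 or k < 2:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return entropy / math.log(k)


def typical_terms(docs, labels, top_n=3):
    """Rank terms per category (assumed scoring rule, for illustration only).

    score(term, cat) = share of cat's documents containing the term
                       * (1 - normalized entropy of the term across categories)
    """
    categories = sorted(set(labels))
    n_docs = Counter(labels)

    # term -> category -> number of documents in that category containing it
    doc_freq = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            doc_freq[term][label] += 1

    ranked = {}
    for idx, cat in enumerate(categories):
        scores = {}
        for term, per_cat in doc_freq.items():
            # Per-category document rates, so large categories do not dominate.
            rates = [per_cat[c] / n_docs[c] for c in categories]
            scores[term] = rates[idx] * (1.0 - normalized_entropy(rates))
        # Highest score first; ties broken alphabetically for determinism.
        ranked[cat] = sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
    return ranked


# Hypothetical, pre-tokenized inspection records (not data from the paper).
docs = [
    ["fire", "extinguisher", "expired", "missing"],
    ["fire", "exit", "blocked"],
    ["wire", "exposed", "missing"],
    ["wire", "socket", "damaged"],
]
labels = ["fire_safety", "fire_safety", "electrical", "electrical"]

ranked = typical_terms(docs, labels, top_n=1)
print(ranked)  # {'electrical': ['wire'], 'fire_safety': ['fire']}
```

In this toy example "missing" occurs equally often in both categories, so its normalized entropy is 1 and its score is 0. A ranking based on frequency alone would not separate it from category-specific terms in this way.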