Journal of Jilin University (Information Science Edition), 2024, Vol. 42, Issue 5: 894-900, 7.
基于改进ID3算法的非结构化大数据分类优化方法
Optimization Method for Unstructured Big Data Classification Based on Improved ID3 Algorithm
Abstract
Unstructured big data sets contain a large amount of redundant data; if this redundancy is not cleaned in a timely manner, classification accuracy suffers. To improve classification performance, an optimization method for unstructured big data classification based on an improved ID3 (Iterative Dichotomiser 3) algorithm is proposed. The method addresses the excessive redundancy and complex dimensionality of unstructured big data sets: it first cleans the data and then combines supervised identification matrices to reduce the data dimensionality. Based on the dimensionality-reduction results, the improved ID3 algorithm is used to build a decision tree classification model, and this model is then used to classify the unstructured big data accurately. Experimental results show that the method classifies unstructured big data well and with high accuracy.
Keywords
improved Iterative Dichotomiser 3 (ID3) algorithm / data cleaning / data dimensionality reduction / unstructured big data / data classification methods
Classification
Information Technology and Security Science
Cite this article
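Tang Kailing (唐锴令), Zheng Hao (郑皓). Optimization Method for Unstructured Big Data Classification Based on Improved ID3 Algorithm [J]. Journal of Jilin University (Information Science Edition), 2024, 42(5): 894-900, 7.
Funding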
Project supported by the Natural Science Foundation of Hunan Province (2022JK60058)
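
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of the classic ID3 procedure: choose the attribute with the highest information gain, split on it, and recurse. It is not the authors' improved variant, since the abstract does not specify the modification. The duplicate-removal helper is only a stand-in for the data-cleaning step, and the supervised dimensionality reduction is omitted. All function names and the toy records below are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def drop_duplicates(rows, labels):
    """Stand-in for data cleaning (assumption): keep the first copy of each record."""
    seen, clean_rows, clean_labels = set(), [], []
    for row, label in zip(rows, labels):
        key = (tuple(sorted(row.items())), label)
        if key not in seen:
            seen.add(key)
            clean_rows.append(row)
            clean_labels.append(label)
    return clean_rows, clean_labels


def build_tree(rows, labels, features):
    """Classic ID3: return a nested dict {feature: {value: subtree}} or a leaf label."""
    if len(set(labels)) == 1:          # pure node: stop
        return labels[0]
    if not features:                   # no attributes left: majority vote
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        tree[best][value] = build_tree(
            [r for r, _ in subset], [l for _, l in subset], remaining
        )
    return tree


def classify(tree, row, default=None):
    """Walk the tree; return `default` for an attribute value unseen in training."""
    while isinstance(tree, dict):
        feature = next(iter(tree))
        tree = tree[feature].get(row.get(feature), default)
    return tree


# Hypothetical toy records; the last one is a redundant duplicate.
rows = [
    {"format": "text", "size": "small"},
    {"format": "text", "size": "large"},
    {"format": "image", "size": "small"},
    {"format": "image", "size": "large"},
    {"format": "text", "size": "small"},
]
labels = ["doc", "doc", "media", "media", "doc"]

clean_rows, clean_labels = drop_duplicates(rows, labels)
tree = build_tree(clean_rows, clean_labels, ["format", "size"])
prediction = classify(tree, {"format": "image", "size": "small"})
```

In this toy set the `format` attribute separates the two classes perfectly, so ID3 splits on it once and stops. Published ID3 improvements typically change the attribute-selection criterion, for example to counter the bias of information gain toward many-valued attributes; whether the paper does so cannot be determined from the abstract.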