Chinese High Technology Letters (高技术通讯), 2018, Vol. 28, Issue 1: 39-51, 13. DOI: 10.3772/j.issn.1002-0470.2018.01.006
The data visualization and pattern recognition method based on the fusion of incremental learning and Lasso
Abstract
A data visualization and pattern recognition method based on the fusion of incremental learning and least absolute shrinkage and selection operator (Lasso) feature selection is proposed. The method applies a first-order Lasso to the normalized data to select features and reduce the dimensionality. After the continuous data are granulated using the Gini index, the data are sent to the incremental learning system. A second-order Lasso feature selection is used to deal with the increasing dimensions, and an attribute partial order structure diagram is generated to visualize the relevant rules. Five databases from UCI and five classifiers (1NN, 3NN, SVM, Adaboost, and Random Forest) are selected to compare precision with the proposed method. The results show that the precision of the method is generally higher than that of the other algorithms, and that the attribute partial order structure diagram has clear layers and structure. An incremental learning experiment is designed to examine how the precision and the updating of the diagram's structure vary with the incremental learning proportion. When the proportion reaches 40%, the precision on the Pima Indians Diabetes database (77.66%) exceeds that of Adaboost (75.32%), SVM (77.27%), 1NN (59.74%), and 3NN (75.97%) trained on all of the data. The results show that the proposed method is an effective tool for visualization and pattern recognition.
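The abstract states that continuous data are granulated using the Gini index but does not give the procedure. As a minimal sketch, assuming a CART-style binary cut (pick the threshold on one feature that minimizes the weighted Gini impurity of the two resulting groups), the step might look like the following. The function names `gini`, `best_gini_split`, and `granulate` are hypothetical and are not taken from the paper; the authors' actual granulation may differ.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_split(values, labels):
    """Find the binary cut on one continuous feature with the lowest weighted Gini.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    Returns (threshold, weighted_gini); threshold is None when the feature has
    only one distinct value and therefore cannot be split.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    best_threshold, best_score = None, gini(labels)
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no cut point between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best_threshold is None or score < best_score:
            best_threshold, best_score = (low + high) / 2, score
    return best_threshold, best_score


def granulate(values, threshold):
    """Map a continuous feature to two granules: 0 if value <= threshold, else 1."""
    return [0 if v <= threshold else 1 for v in values]


# Toy data (illustrative only, not from the paper).
values = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
labels = [0, 0, 0, 1, 1, 1]
threshold, score = best_gini_split(values, labels)
print(threshold, score)              # 5.5 0.0
print(granulate(values, threshold))  # [0, 0, 0, 1, 1, 1]
```

A single binary cut is the simplest possible granulation; applying the same search recursively within each granule would give finer intervals, at the cost of more attributes for the later Lasso stage to prune.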
Key words
incremental learning / least absolute shrinkage and selection operator (Lasso) / attribute partial order structure diagram / visualization / pattern recognition / granulation
Cite this article
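Liang Huaixin (梁怀新), Hao Lianwang (郝连旺), Song Jialin (宋佳霖), Zheng Cunfang (郑存芳), Hong Wenxue (洪文学). The data visualization and pattern recognition method based on the fusion of incremental learning and Lasso [J]. Chinese High Technology Letters (高技术通讯), 2018, 28(1): 39-51, 13.
Funding
Supported by the National Natural Science Foundation of China (61273019, 81373767, 61501397, 61201111) and the Natural Science Foundation of Hebei Province (F2016203443).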