Interpretable small sphere and large margin support vector machine with integrated data mining knowledge

Open access; indexed in the Peking University Core Journals list (北大核心) and CSTPCD

The small sphere and large margin support vector machine (SSLM) is a typical black-box model: it requires no examination of the internal structure or mechanism of the object under study and relies only on the object's input and output data to capture its function and mechanism of action. As a result, the SSLM responds quickly and has strong real-time performance, but it lacks interpretability and transparency. To address this, this paper studies how to add prior knowledge at the input end of the SSLM black-box model to enhance its interpretability. We develop a data-based algorithm for mining nonlinear circular knowledge and an algorithm for discretizing that knowledge; the discretized points include not only the original data points that generated the knowledge but also new data points. By integrating the mined circular knowledge into the SSLM in the form of inequality constraints, we construct an interpretable SSLM model (i-SSLM). During training, the model must classify the data points covered by the knowledge constraints correctly, so its results are predictable to a certain degree, which indicates that the model is interpretable. At the same time, because discretizing the knowledge adds new data information, the model can achieve higher accuracy. The effectiveness of the i-SSLM model is verified on 10 public datasets and 2 real blast furnace datasets.

Lu Sijie (陆思洁); Fan Di (范頔); Jian Ling (渐令); Gao Chuanhou (郜传厚)

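School of Mathematical Sciences, Zhejiang University, Hangzhou, Zhejiang 310027, China; School of Economics and Management, China University of Petroleum (East China), Qingdao, Shandong 266580, China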

black box model; interpretability; small sphere and large margin support vector machine; prior knowledge; unbalanced data

Control Theory & Applications (控制理论与应用), 2024, No. 3

Dynamical behavior analysis of chemical reaction network systems and its applications

pp. 375-384 (10 pages)

Supported by the National Natural Science Foundation of China (12320101001, 12071428, 62111530247) and the Key Project of the Zhejiang Provincial Natural Science Foundation (LZ20A010002).

10.7641/CTA.2023.20832
