Application Research of Computers (计算机应用研究), 2024, Vol. 41, Issue 6: 1693-1698. DOI: 10.19734/j.issn.1001-3695.2023.10.0496
Knowledge distillation algorithm based on spatial attention map (基于空间注意力图的知识蒸馏算法)
Abstract
Knowledge distillation algorithms are highly effective for streamlining deep neural networks. Current feature-based knowledge distillation algorithms either focus on improving a single part while ignoring other beneficial parts, or fail to provide effective guidance on the parts that the small model should focus on, which leaves the distillation effect insufficient. To make full use of the beneficial information in the large model and process it to improve the knowledge conversion rate of the small model, this paper proposed a new distillation algorithm. Firstly, it used a conditional probability distribution to fit the feature spatial distribution of the large model's middle layers, and then extracted the spatial attention maps, which tended to be similar after fitting, together with other beneficial information. Finally, it used a small convolutional layer to narrow the gap between the models and transmit the transformed information to the small model, achieving distillation. Experimental results show that the algorithm is applicable to multiple teacher-student combinations and generalizes across multiple datasets; compared with current more advanced distillation algorithms, performance improves by about 1.19% and training time is shortened by 0.16 h. The algorithm has important engineering significance and broad application prospects for the optimization of large networks and the application of deep learning on low-resource devices.
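As a rough illustration of the spatial attention maps the abstract refers to, the sketch below shows one common way to derive such a map from a convolutional feature tensor: sum the squared activations over channels, then normalize so teacher and student maps are comparable in scale, and penalize their distance. This is a minimal NumPy sketch under assumed conventions (the function names, the power-2 pooling, and the L2 normalization are illustrative choices, not the paper's exact method, which also involves conditional-probability fitting and a bridging convolutional layer).

```python
import numpy as np

def spatial_attention_map(feat: np.ndarray, p: int = 2) -> np.ndarray:
    """Collapse a (C, H, W) feature tensor into an (H, W) attention map.

    Sums |activation|^p over the channel axis, then L2-normalizes the
    flattened map so maps from layers of different widths share a scale.
    """
    amap = np.abs(feat) ** p          # emphasize strongly activated positions
    amap = amap.sum(axis=0)           # (C, H, W) -> (H, W)
    norm = np.linalg.norm(amap.ravel()) + 1e-12
    return amap / norm

def attention_transfer_loss(f_teacher: np.ndarray, f_student: np.ndarray) -> float:
    """Squared L2 distance between teacher and student attention maps.

    The channel counts may differ; only the spatial sizes must match,
    since channels are pooled away before comparison.
    """
    diff = spatial_attention_map(f_teacher) - spatial_attention_map(f_student)
    return float((diff ** 2).sum())

# Toy teacher/student feature maps with matching 8x8 spatial size.
rng = np.random.default_rng(0)
f_t = rng.standard_normal((64, 8, 8))   # wide teacher layer
f_s = rng.standard_normal((16, 8, 8))   # narrow student layer
print(attention_transfer_loss(f_t, f_s))
```

Because the channel dimension is pooled away, this distillation term sidesteps the teacher/student width mismatch; in practice the loss would be added, with a weighting coefficient, to the student's ordinary task loss.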
Keywords: knowledge distillation / knowledge transfer / model compression / deep learning / image classification
Classification: Information Technology and Security Science
Citation: 王礼乐, 刘渊. Knowledge distillation algorithm based on spatial attention map [J]. Application Research of Computers, 2024, 41(6): 1693-1698.
Funding: National Natural Science Foundation of China (61972182)