Two-Stage Approach for Targeted Knowledge Transfer in Self-Knowledge Distillation
Knowledge distillation (KD) enhances student network generalization by transferring dark knowledge from a complex teacher network. To optimize computational expenditure and memory utilization, self-knowledge distillation (SKD) extracts dark knowledge from the model itself rather than an external teacher network. However, previous SKD methods performed distillation indiscriminately on full datasets, overlooking the analysis of representative samples. In this work, we…
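For context, the "dark knowledge transfer" the abstract refers to is conventionally realized as a KL divergence between temperature-softened teacher and student logits (the generic Hinton-style KD objective, not the paper's two-stage method). Below is a minimal sketch assuming PyTorch; the `temperature` value is illustrative, and in the SKD setting described here the teacher logits would come from the model itself (e.g., an earlier snapshot or auxiliary branch) rather than an external network.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits: torch.Tensor,
            teacher_logits: torch.Tensor,
            temperature: float = 4.0) -> torch.Tensor:
    """Generic soft-target distillation loss.

    The softened teacher distribution carries the 'dark knowledge':
    relative probabilities over non-target classes.
    """
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 so gradient magnitudes stay comparable to the
    # hard-label cross-entropy term it is typically combined with.
    return F.kl_div(log_soft_student, soft_teacher,
                    reduction="batchmean") * (t * t)
```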
Zimo Yin; Jian Pu; Yijie Zhou; Xiangyang Xue
School of Computer Science, Fudan University, Shanghai 200433, China; Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
Cluster-based regularization; iterative prediction refinement; model-agnostic framework; self-knowledge distillation (SKD); two-stage knowledge transfer
IEEE/CAA Journal of Automatica Sinica, 2024 (11)
Pages 2270-2283 (14 pages)
This work was supported by the National Natural Science Foundation of China (62176061).