Journal of Sichuan University (Natural Science Edition), 2018, Vol. 55, Issue 3: 469-476, 8. DOI: 10.3969/j.issn.0490-6756.2018.03.009
A malware variant clustering method based on fuzzy hash
Abstract
Internet security companies collect tens of millions of new malware variants each year, and VirusShare, an online malware repository, stores more than 27 million unlabeled malware samples. Clustering malware variants according to their behavior patterns not only makes new attacks easier to detect, but also helps to track malware trends in time and take the corresponding preventive measures. This paper therefore proposes a malware variant clustering method that uses dynamic analysis to extract malware features, including import and export function names, strings, system resource records and system calls, converts these features into fuzzy hashes, and finally clusters the malware samples with the CFSFDP clustering algorithm. The number of clusters, precision, recall, F-score and entropy are used as external criteria, and intra-cluster cohesion and inter-cluster separation as internal criteria. The experimental results show that, compared with Symantec and ESET-NOD32, the F-score obtained by the proposed method is higher by 11.632% and 2.41%, respectively, and its number of clusters is the closest to that of the manual labeling.
Key words
Malware family/Clustering/Fuzzy hash/Feature extraction
Classification
Information technology and security science
Cite this article
XIAO Jinqi, WANG Junfeng. A malware variant clustering method based on fuzzy hash [J]. Journal of Sichuan University (Natural Science Edition), 2018, 55(3): 469-476, 8.
Funding
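National Key Research and Development Program (2016YFB0800605)
National Natural Science Foundation of China (91338107, 91438119, 91438120)
Doctoral Program Foundation of the Ministry of Education of China (20130181110095)
Sichuan Province Key Science and Technology Research and Development Project (2016ZR0087)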
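Illustrative sketch

The abstract names CFSFDP (clustering by fast search and find of density peaks) as the clustering step but gives no parameters, so the following is only a minimal sketch of the general technique, not the authors' implementation. Several parts are assumptions made for illustration: the `distance` function uses Python's `difflib` as a stand-in for a fuzzy hash comparison score, the toy feature strings are invented, and the cutoff `d_c` and the fixed number of centers `n_clusters` are hypothetical choices.

```python
from difflib import SequenceMatcher


def distance(a: str, b: str) -> float:
    """Stand-in dissimilarity in [0, 1]; NOT a real fuzzy hash comparison."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def density_peaks(dist, d_c, n_clusters):
    """Density-peaks clustering on a symmetric distance matrix (list of lists)."""
    n = len(dist)
    # Local density rho: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    rank = {i: pos for pos, i in enumerate(order)}
    delta = [0.0] * n            # distance to the nearest denser point
    nearest_denser = [-1] * n    # index of that point
    for pos, i in enumerate(order):
        if pos == 0:
            # Densest point: delta is its largest distance to any point.
            delta[i] = max(dist[i])
            continue
        j = min(order[:pos], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        nearest_denser[i] = j
    # Centers: points with the largest gamma = rho * delta.
    # (Assumption: a fixed number of centers is requested by the caller.)
    centers = sorted(range(n),
                     key=lambda i: (-rho[i] * delta[i], rank[i]))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Every other point inherits the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return labels


# Invented toy "feature strings" for two hypothetical malware families.
samples = [
    "CreateFileA;WriteFile;RegSetValueExA;connect",
    "CreateFileA;WriteFile;RegSetValueExA;send",
    "CreateFileW;WriteFile;RegSetValueExA;connect",
    "VirtualAlloc;GetProcAddress;CreateRemoteThread",
    "VirtualAlloc;GetProcAddress;CreateRemoteThread;Sleep",
    "VirtualAllocEx;GetProcAddress;CreateRemoteThread",
]
dist = [[distance(a, b) for b in samples] for a in samples]
labels = density_peaks(dist, d_c=0.3, n_clusters=2)
print(labels)
```

On this toy input the first three strings and the last three strings should fall into two separate clusters. In a real pipeline the distance matrix would instead be derived from comparison scores between fuzzy hashes of the extracted features, and the centers would typically be chosen by inspecting the density/distance decision graph rather than by fixing their number in advance.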