Chinese Journal of Network and Information Security (网络与信息安全学报), 2024, Vol. 10, Issue 1: 112-122, 11. DOI: 10.11959/j.issn.2096-109x.2024008
Research on multi-granularity password analysis based on LLM
Abstract
Password-based authentication is widely used as the primary authentication mechanism. However, occasional large-scale password leaks have highlighted the vulnerability of passwords to risks such as guessing and theft. In recent years, research on password analysis using natural language processing techniques has progressed by treating passwords as a special form of natural language. Nevertheless, few studies have investigated how the granularity of password text segmentation affects the effectiveness of password analysis with large language models. A multi-granularity password-analyzing framework based on a large language model was proposed; it follows the pre-training paradigm and autonomously learns prior knowledge of password distribution from large unlabelled datasets. The framework comprised three modules: the synchronization network, the backbone network, and the tail network. The synchronization network module implemented char-level, template-level, and chunk-level password segmentation, extracting knowledge of character distribution, structure, word chunk composition, and other password features. The backbone network module constructed a generic password model to learn the rules governing password composition. The tail network module generated candidate passwords for guessing and analyzing target databases. Experimental evaluations were conducted on eight password databases, including Tianya and Twitter, to analyze and summarize the effectiveness of the proposed framework under different language environments and word segmentation granularities. The results indicate that in Chinese user scenarios, the frameworks based on char-level and chunk-level segmentation perform comparably, and both significantly outperform the framework based on template-level segmentation. In English user scenarios, the framework based on chunk-level segmentation demonstrates the best password-analyzing performance.
Keywords
large language model/password analysis/natural language processing/word segmentation
Classification
Information technology and security science
Cite this article
Hong Meng, Qiu Weidong, Wang Yangde. Research on multi-granularity password analysis based on LLM[J]. Chinese Journal of Network and Information Security, 2024, 10(1): 112-122, 11.
Funding
The National Natural Science Foundation of China (61972249)
The National Key R&D Program of China (2023YFB3106501)
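Illustrative note on segmentation granularity
The abstract names three segmentation granularities (char-level, template-level, chunk-level) but does not say how each is computed. The Python sketch below is only an illustration under stated assumptions, not the authors' method: chunks are approximated as maximal runs of one character class (letters, digits, symbols), and templates use a PCFG-style class-plus-length notation such as L4 or D4. The function names are hypothetical.

```python
def char_level(password: str) -> list[str]:
    """Char-level: one token per character."""
    return list(password)


def _char_class(ch: str) -> str:
    """Map a character to a coarse class: L (letter), D (digit), S (other)."""
    if ch.isalpha():
        return "L"
    if ch.isdigit():
        return "D"
    return "S"


def chunk_level(password: str) -> list[str]:
    """Chunk-level (assumed stand-in): maximal runs of one character class.

    A real system might instead learn chunks from data; this rule is
    only a simple approximation for illustration.
    """
    chunks: list[str] = []
    for ch in password:
        if chunks and _char_class(chunks[-1][-1]) == _char_class(ch):
            chunks[-1] += ch  # extend the current run
        else:
            chunks.append(ch)  # start a new run
    return chunks


def template_level(password: str) -> list[str]:
    """Template-level (assumed notation): class and length of each run."""
    return [f"{_char_class(c[0])}{len(c)}" for c in chunk_level(password)]


if __name__ == "__main__":
    pw = "love1314!"
    print(char_level(pw))      # ['l', 'o', 'v', 'e', '1', '3', '1', '4', '!']
    print(chunk_level(pw))     # ['love', '1314', '!']
    print(template_level(pw))  # ['L4', 'D4', 'S1']
```

Under these assumptions, the same password yields nine tokens at char level but only three at chunk or template level, and the template tokens discard the concrete characters. That difference in sequence length and retained information is the kind of trade-off a comparison across granularities would probe.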