Computer Engineering and Applications, 2026, Vol. 62, Issue (6): 27-50, 24. DOI: 10.3778/j.issn.1002-8331.2509-0303
Overview of Security Research on Jailbreak Attacks Against Generative Large Models
Abstract
In recent years, generative large models (GLMs) have been widely deployed in key scenarios such as text generation, conversational interaction, and content creation. However, jailbreak attacks are emerging as a threat to these models: they bypass the models' built-in safety mechanisms and induce them to produce harmful outputs, raising security challenges such as ethical risks, privacy leakage, and model abuse. To address this threat, this paper comprehensively reviews recent research progress on jailbreak attacks against the two mainstream classes of generative models: large language models and multimodal large language models. The review focuses on three aspects: jailbreak attack types, defense strategies, and security assessment frameworks. It details the basic principles, implementation methods, and research conclusions of current jailbreak attack methods, providing valuable insights for future research. Building on this analysis, the paper further summarizes the current deficiencies in jailbreak security research for these two model classes and identifies key challenges and development opportunities for future research on the security of generative large models. This review provides guidance for researchers working on the complex applications and security of generative large models.
Key words
generative large models (GLMs) / jailbreak attack / security challenge / defense strategy / security research
Classification
Information Technology and Security Science
Cite this article
LI Yan, WANG Gang, WANG Hao. Overview of Security Research on Jailbreak Attacks Against Generative Large Models[J]. Computer Engineering and Applications, 2026, 62(6): 27-50, 24.
Fund
Special Project of the Inner Mongolia Autonomous Region University Network Security and Education Management Informatization Engineering Research Center (RZ2200000611).