| 注册
首页|期刊导航|西安交通大学学报(社会科学版)|大语言模型安全的技术治理:对抗测试与评估审计

大语言模型安全的技术治理:对抗测试与评估审计

周辉 郭烘佑

西安交通大学学报(社会科学版)2025,Vol.45Issue(2):78-88,11.
西安交通大学学报(社会科学版)2025,Vol.45Issue(2):78-88,11.DOI:10.15896/j.xjtuskxb.202502007

大语言模型安全的技术治理:对抗测试与评估审计

Technical Governance of Large Language Models Security:Red Teaming and Evaluation Audits

周辉 1郭烘佑2

作者信息

  • 1. 中国社会科学院 法学研究所,北京 100720
  • 2. 中国社会科学院大学 法学院,北京 102488
  • 折叠

摘要

Abstract

While providing the ability to generalize across tasks and domains,AI large language models have triggered multiple risks due to their huge data requirements and complex technical architectures,which not only increase the security threats faced by enterprises and individuals,but also bring about a series of ethical and legal issues. To effectively identify and mitigate these security vulnerabilities and risks,adversarial testing and assessment auditing,as the core of technical governance,provide critical safeguards for the secure application of large language models.Potential security threats and vulnerabilities can be detected at an early stage,thus preventing losses and risks.However,the implementation of these technical governance measures currently faces multiple difficulties,including insufficient computational resources,lack of uniformity in technical governance processes and standards,and the vulnerability of platform technical governance to commercial interests.These factors not only hinder the effective implementation of technical governance programs,but also limit the widespread application of security measures. Therefore,it is necessary to optimize the technical governance framework from multiple perspectives,starting with improving the effectiveness of security governance by encouraging technological innovation,for example,by developing more advanced algorithms and security technologies to strengthen the defensive capabilities of large language models.In addition,it is crucial to clarify the processes and standards of technical governance,which should be based on a comprehensive assessment of risk and compliance needs to ensure the legitimacy and ethics of technology applications.Establishing a multi-party oversight mechanism is also key to improving security technical governance,including broad participation from government,industry organizations,academia and the public.Through these comprehensive measures,the security and stability of large language models can be effectively enhanced to ensure stable and secure operation.

关键词

人工智能/大语言模型/安全风险/技术治理/对抗测试/评估审计

Key words

artificial intelligence/large language models/security risk/technical governance/red teaming/evaluation audits

分类

计算机与自动化

引用本文复制引用

周辉,郭烘佑..大语言模型安全的技术治理:对抗测试与评估审计[J].西安交通大学学报(社会科学版),2025,45(2):78-88,11.

基金项目

中国社会科学院学科建设"登峰战略"资助计划项目(DF2023XXJC07). (DF2023XXJC07)

西安交通大学学报(社会科学版)

OA北大核心

1008-245X

访问量0
|
下载量0
段落导航相关论文