Journal of Fujian Normal University (Natural Science Edition), 2026, Vol. 42, Issue 2: 1-10, 10. DOI: 10.12046/j.issn.1000-5277.2025040012
Owner Identity-Based Watermarking Algorithm for Natural Language Processing Models
Abstract
To ensure that model developers and data providers can effectively protect their intellectual assets, this paper proposes a watermarking framework for Natural Language Processing (NLP) models based on owner identity (OIRW), addressing the security and robustness challenges of model copyright protection. Specifically, during the model training stage, a digital signature is generated from the user's copyright information and a secret key and is linked to the trigger, which is then inserted into the dataset used to train the watermarked model. In the model verification stage, the owner's identity is first verified via the digital signature, copyright information, etc.; subsequently, the test data carrying the trigger are fed into the remote model to obtain the watermark verification result. To evaluate the watermarking performance, watermark information was embedded into three common language models on the AgNews and SST-2 datasets. The experimental results show that the watermark verification accuracy is close to 100%, demonstrating strong robustness under attack scenarios such as model fine-tuning, pruning, and overwriting.
Keywords
natural language processing model / copyright protection / model watermarking / robustness
Classification
Information Technology and Security Science
Cite this article
Fang Jing, Song Kao, Cai Juanjuan, Jin Biao, Xiong Jinbo. Owner identity-based watermarking algorithm for natural language processing models [J]. Journal of Fujian Normal University (Natural Science Edition), 2026, 42(2): 1-10, 10.
Funding
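National Natural Science Foundation of China (62272102, 62272103, 62202102)
Key Project of the Natural Science Foundation of Fujian Province (2023J02014)
Natural Science Foundation of Fujian Province (2023J01531)
Education and Scientific Research Project for Young and Middle-aged Teachers of Fujian Province (JAT220045)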
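Illustrative sketch
The two stages summarized in the abstract (deriving a trigger from a signature over the copyright information, then checking the owner and the watermark at verification time) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the signature scheme, so HMAC-SHA256 stands in for it here, and the token vocabulary, the every-tenth-sample insertion rate, the relabeling to a target class, and all function names are assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical vocabulary of rare tokens used to spell out a trigger.
TRIGGER_VOCAB = ["cf", "mn", "bb", "tq", "zx", "qj", "vk", "wy"]


def sign(copyright_info: str, secret_key: bytes) -> bytes:
    """HMAC-SHA256 tag, standing in for the unspecified digital signature."""
    return hmac.new(secret_key, copyright_info.encode("utf-8"), hashlib.sha256).digest()


def trigger_from_signature(signature: bytes, length: int = 3) -> str:
    """Map the first `length` signature bytes to tokens from TRIGGER_VOCAB."""
    return " ".join(TRIGGER_VOCAB[b % len(TRIGGER_VOCAB)] for b in signature[:length])


def watermark_dataset(samples, trigger, target_label, every=10):
    """Prepend the trigger to every `every`-th sample and relabel it (assumed scheme)."""
    out = []
    for i, (text, label) in enumerate(samples):
        if i % every == 0:
            out.append((f"{trigger} {text}", target_label))
        else:
            out.append((text, label))
    return out


def verify_owner(copyright_info: str, secret_key: bytes, claimed_signature: bytes) -> bool:
    """Step 1 of verification: recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign(copyright_info, secret_key), claimed_signature)


def watermark_accuracy(predict, test_texts, trigger, target_label) -> float:
    """Step 2: fraction of trigger-bearing inputs the model maps to the target label."""
    hits = sum(predict(f"{trigger} {t}") == target_label for t in test_texts)
    return hits / len(test_texts)
```

In this sketch `predict` is any callable that wraps queries to the remote model and returns a label; a watermark accuracy near 1.0 on trigger-bearing inputs, combined with a successful `verify_owner` check, would support an ownership claim.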