Dual-stream enhancement encoder and attention optimization decoder for image manipulation localization
Open Access; PKU Core Journal (北大核心); CSTPCD
Mainstream image manipulation localization methods usually fuse the inconsistent features of different streams through simple operations, which leads to feature redundancy and to misdetected pixels in tampered regions. To address this, we propose an image manipulation localization framework built on a dual-stream enhancement encoder and an attention optimization decoder. First, the dual-stream enhancement encoder applies self-enhancement based on noise and channel attention to the image and frequency-domain features separately, and then applies interactive enhancement using cross-attention weights derived from the feature maps, so that the different tampering cues complement each other and the network attends more closely to tampering features. Next, a multi-level receptive field strategy is introduced to explore multi-scale context information, and an adjacent-level feature aggregation module is designed to fuse adjacent multi-scale features. Finally, tampered and genuine regions are used cooperatively to strengthen localization: the attention optimization decoder module removes the erroneous edge-pixel predictions in the initial tampered-region prediction and refines the localization step by step. We compare against mainstream manipulation localization methods on four mainstream public benchmark datasets (NIST16, Coverage, Columbia, and CASIA) and two realistic challenge datasets (IMD20 and Wild). Under both the non-fine-tuned and fine-tuned settings, the proposed method achieves the best performance on all six datasets, which demonstrates that it makes full use of multiple forgery clues, localizes tampered regions effectively across different datasets, and offers higher localization accuracy and stronger robustness.
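The encoder step described in the abstract (self-enhancement within each stream, followed by interactive enhancement across streams) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no layer definitions, so the parameter-free sigmoid gates, the residual form, the omission of the noise branch, and every function name below are assumptions chosen only to show the data flow.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here as a parameter-free attention gate."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_self_enhance(feat):
    """Hypothetical self-enhancement: gate each channel by its global mean.

    feat has shape (C, H, W). A trained model would use learned layers;
    this squeeze-and-gate form is an assumption.
    """
    squeeze = feat.mean(axis=(1, 2))       # (C,) one descriptor per channel
    gate = sigmoid(squeeze)                # (C,) weights in (0, 1)
    return feat * gate[:, None, None]      # rescale every channel


def cross_enhance(target, source):
    """Hypothetical interactive enhancement: reweight `target` with a
    spatial attention map computed from the other stream, `source`."""
    attn = sigmoid(source.mean(axis=0))    # (H, W) spatial weights
    return target + target * attn[None, :, :]   # residual reweighting


def dual_stream_enhance(rgb_feat, freq_feat):
    """Self-enhance both streams, then let each stream attend to the other."""
    rgb_self = channel_self_enhance(rgb_feat)
    freq_self = channel_self_enhance(freq_feat)
    rgb_out = cross_enhance(rgb_self, freq_self)
    freq_out = cross_enhance(freq_self, rgb_self)
    return rgb_out, freq_out


# Toy features standing in for one scale of the image and frequency streams.
rng = np.random.default_rng(0)
rgb = rng.standard_normal((4, 8, 8))
freq = rng.standard_normal((4, 8, 8))
rgb_out, freq_out = dual_stream_enhance(rgb, freq)
print(rgb_out.shape, freq_out.shape)
```

Both outputs keep the input shape, so in a multi-scale setting the same step could be repeated at each feature level before aggregation; how the paper actually parameterizes these gates is not stated in the abstract.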
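ZHU Ye; ZHAO Xiaoxiang; YU Yang
School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401, China
Computer Science and Automation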
Keywords: image manipulation localization; dual-stream enhancement encoder; attention optimization decoder; adjacent feature aggregation module
Chinese Journal of Liquid Crystals and Displays (《液晶与显示》), 2024, No. 8
Pages 1103-1115 (13 pages)
Supported by Youth Program of National Natural Science Foundation of China (No. 62102129); Natural Science Foundation of Hebei Province (No. F2021202030); Basic Research Project of Hebei Universities in Shijiazhuang (No. 241790817A)