CLIP prompt optimization algorithm based on domain-specific features
Indexed in: OA; Peking University Core Journal List; CSTPCD
When test data follow a different distribution from the training data, neural networks suffer from domain shift. Domain generalization (DG) addresses this problem by learning a general model that can handle unseen domains. Previous methods extract domain-invariant features through data augmentation or feature-space alignment, but the extraction process itself introduces new domain-specific features, degrading the model's generalization performance. To address this, a simple and effective framework, ERCLIP (extracting and removing domain-specific features for CLIP), is proposed to apply the large-scale pre-trained model CLIP to DG. ERCLIP actively extracts domain-specific features and incorporates them into text prompts, enabling precise semantic descriptions of images. A text prompt optimizer is also proposed to dynamically optimize the prompt vectors. Experiments on the public datasets OfficeHome, VLCS, and PACS show that ERCLIP achieves the best results among all compared algorithms, with average accuracies of 83.4% on OfficeHome, 83.5% on VLCS, and 96.5% on PACS.
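The abstract describes optimizing learnable text-prompt vectors for CLIP rather than using fixed hand-written prompts. A minimal sketch of that general technique (CoOp-style prompt learning) is given below; all class names, dimensions, and the mean-pooled stand-in for CLIP's text encoder are illustrative assumptions, not the paper's actual ERCLIP implementation.

```python
# Hypothetical sketch of learnable text prompts for a CLIP-style model.
# The context vectors are trained while the backbone stays frozen.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptLearner(nn.Module):
    """Learnable context vectors prepended to frozen class-name embeddings."""
    def __init__(self, n_classes: int, n_ctx: int = 4, dim: int = 512):
        super().__init__()
        # Shared learnable context (replaces a fixed "a photo of a ..." prompt).
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        # Stand-in for frozen class-name token embeddings (not trained).
        self.register_buffer("cls_emb", torch.randn(n_classes, 1, dim))

    def forward(self) -> torch.Tensor:
        # Returns (n_classes, n_ctx + 1, dim): context tokens + class token.
        ctx = self.ctx.unsqueeze(0).expand(self.cls_emb.size(0), -1, -1)
        return torch.cat([ctx, self.cls_emb], dim=1)

def classify(image_feat: torch.Tensor, text_feat: torch.Tensor) -> torch.Tensor:
    # Cosine-similarity logits, as in CLIP's zero-shot classification head.
    img = F.normalize(image_feat, dim=-1)
    txt = F.normalize(text_feat, dim=-1)
    return img @ txt.t()

# Toy usage: mean-pool prompt tokens as a stand-in for CLIP's text encoder.
learner = PromptLearner(n_classes=7, dim=512)
prompts = learner()                      # (7, 5, 512)
text_feat = prompts.mean(dim=1)          # (7, 512)
logits = classify(torch.randn(2, 512), text_feat)
print(logits.shape)                      # torch.Size([2, 7])
```

Because `ctx` is an `nn.Parameter` while the class embeddings are a buffer, gradient descent on a classification loss updates only the prompt vectors, which is the core idea behind dynamically optimizing prompts.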
ZHANG Yuewen; WANG Jiuhang; QIN Ronghua
Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 201800, China; University of Chinese Academy of Sciences, Beijing 100049, China
Electronic Information Engineering
Keywords: domain-invariant feature; ERCLIP; domain generalization; neural network; feature extraction; text prompt
Modern Electronics Technique (《现代电子技术》), 2024, No. 18
Pages 41-46 (6 pages)