Dual semantic correlation graph convolutional networks for cross-modal retrieval
OA; Peking University Core Journals (北大核心); CSTPCD
With the continued development of deep neural networks, cross-modal retrieval models have made significant progress. Cross-modal retrieval methods based on graph convolutional networks (GCN) capture the semantic correlations in data well and have therefore attracted increasing attention. However, most existing work incorporates only the correlations between labels and between samples into cross-modal representations, and neglects how correlations between label sets affect retrieval performance. In multi-label scenarios, the correlations between label sets (multi-label correlations) can effectively describe the semantic relationships between the corresponding samples. Discovering these multi-label correlations and integrating them into cross-modal representations is therefore important for improving the performance of cross-modal retrieval models. This paper proposes a cross-modal retrieval method based on dual semantic correlation graph convolutional networks (DSCGCN). The method uses GCN to adaptively discover the semantic correlations between labels and between multi-labels, and integrates this dual semantic correlation into the common representations of samples. In addition, a multi-label similarity loss is proposed to bring the similarities between the generated common representations closer to the semantic similarities. Experiments on the NUS-WIDE, MIRFlickr-25K, and MS-COCO datasets show that, owing to the introduction of multi-label semantic correlations, DSCGCN achieves satisfactory retrieval performance.
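To make the idea of multi-label correlation concrete, the following is a minimal NumPy sketch, not the DSCGCN formulation itself, which the abstract does not spell out. It assumes cosine similarity between multi-hot label vectors as the label-set correlation, a standard symmetric-normalized graph convolution step, and a mean-squared similarity loss. The function names `label_set_similarity`, `gcn_layer`, and `similarity_loss` are hypothetical, and the correlation matrix here is fixed rather than learned adaptively as in the paper.

```python
import numpy as np


def label_set_similarity(Y):
    """Cosine similarity between multi-hot label sets (rows of Y).

    Assumption: cosine similarity stands in for the paper's
    multi-label correlation, which the abstract does not define.
    """
    Y = np.asarray(Y, dtype=float)
    norms = np.linalg.norm(Y, axis=1, keepdims=True)
    Yn = Y / np.maximum(norms, 1e-12)  # guard against empty label sets
    return Yn @ Yn.T


def gcn_layer(A, H, W):
    """One graph convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)                   # node degrees (positive for A >= 0)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)


def similarity_loss(Z, S):
    """Mean squared gap between cosine similarities of Z and the target S."""
    Z = np.asarray(Z, dtype=float)
    Zn = Z / np.maximum(np.linalg.norm(Z, axis=1, keepdims=True), 1e-12)
    return float(np.mean((Zn @ Zn.T - S) ** 2))


if __name__ == "__main__":
    # Three samples, three labels; samples 0 and 1 share the same label set.
    Y = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1]])
    S = label_set_similarity(Y)

    rng = np.random.default_rng(0)
    H = rng.normal(size=(3, 4))   # hypothetical input features per sample
    W = rng.normal(size=(4, 2))   # hypothetical layer weights
    Z = gcn_layer(S, H, W)        # propagate features over the label-set graph
    print("loss:", similarity_loss(Z, S))
```

In a trainable version, `W` (and, following the abstract, the correlation matrix itself) would be updated by gradient descent so that the loss decreases; this sketch only evaluates the quantities once.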
Liu Jianan (刘佳楠); Fan Jingjing (范晶晶); Zhao Jianguang (赵建光); Zhu Jie (朱杰)
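School of Information Engineering, Hebei University of Architecture, Zhangjiakou, Hebei 075000; School of Mathematics and Information Science, Hebei University, Baoding, Hebei 071002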
Computer Science and Automation
semantic correlation; adaptive correlation matrix; graph convolutional network (GCN); cross-modal retrieval
Application Research of Computers (《计算机应用研究》), 2024, No. 4
pp. 1239-1246 (8 pages)
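Natural Science Foundation of Hebei Province (F2022511001); Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZC2022070); High-Level Talent Research Start-up Project of Hebei University (521100223212); Zhangjiakou Municipal Science and Technology Plan Project with Financial Support (2311010A); Zhangjiakou 2022 Basic Research Special Project (2221008A); Hebei University of Architecture 2024 University-Level Graduate Innovation Fund (XY2024068)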