Computer Engineering, 2025, Vol. 51, Issue 2: 312-321, 10. DOI: 10.19678/j.issn.1000-3428.0068905
Estimation of Local Illumination Consistency Based on Improved Vision Transformer
Abstract
Illumination consistency is a key factor in achieving the organic fusion of virtual and real elements in Augmented Reality (AR) systems. Owing to the constraints of capture perspectives and the complexity of scene illumination, developers often overlook local illumination consistency when estimating panoramic lighting information, which degrades the final rendering quality. To address this issue, this study proposes ViTLight, a local illumination consistency estimation framework based on an improved Vision Transformer (ViT) structure. First, the framework uses a ViT encoder to extract feature vectors and regress Spherical Harmonic (SH) coefficients, from which the illumination information is recovered. Second, the ViT encoder structure is enhanced with a multi-head self-attention interaction mechanism, in which a convolution operation guides the interaction between attention heads. Additionally, a local perception module is integrated that scans each image block and performs a weighted summation over local pixels to capture region-specific features. This design balances global contextual features against local illumination information and thereby improves the precision of illumination estimation. ViTLight is compared with mainstream feature extraction networks and four classical illumination estimation frameworks on public datasets. The experimental results and analysis indicate that ViTLight is superior to existing frameworks in terms of image rendering accuracy: its Root Mean Square Error (RMSE) and Structural Dissimilarity (DSSIM) index reach 0.1296 and 0.0426, respectively, which verifies its effectiveness and correctness.
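The step of recovering illumination from regressed SH coefficients can be illustrated with a minimal sketch. It assumes second-order real SH (9 coefficients per RGB channel) and an equirectangular output map; the abstract states neither the SH order nor the output parameterization, so both are assumptions, and the names sh_basis, sh_to_envmap, and rmse are hypothetical helpers rather than parts of ViTLight.

```python
import numpy as np


def sh_basis(dirs):
    """Real SH basis up to order 2 (assumed order).

    dirs: unit direction vectors, shape (..., 3).
    Returns basis values of shape (..., 9).
    """
    x, y, z = dirs[..., 0], dirs[..., 1], dirs[..., 2]
    return np.stack(
        [
            0.282095 * np.ones_like(x),      # l=0
            0.488603 * y,                    # l=1, m=-1
            0.488603 * z,                    # l=1, m=0
            0.488603 * x,                    # l=1, m=1
            1.092548 * x * y,                # l=2, m=-2
            1.092548 * y * z,                # l=2, m=-1
            0.315392 * (3.0 * z * z - 1.0),  # l=2, m=0
            1.092548 * x * z,                # l=2, m=1
            0.546274 * (x * x - y * y),      # l=2, m=2
        ],
        axis=-1,
    )


def sh_to_envmap(coeffs, height=16, width=32):
    """Expand SH coefficients of shape (9, 3) into an equirectangular RGB map."""
    theta = (np.arange(height) + 0.5) / height * np.pi     # polar angle
    phi = (np.arange(width) + 0.5) / width * 2.0 * np.pi   # azimuth
    theta, phi = np.meshgrid(theta, phi, indexing="ij")
    dirs = np.stack(
        [
            np.sin(theta) * np.cos(phi),
            np.sin(theta) * np.sin(phi),
            np.cos(theta),
        ],
        axis=-1,
    )
    # (H, W, 9) @ (9, 3) -> (H, W, 3)
    return sh_basis(dirs) @ coeffs


def rmse(a, b):
    """Root mean square error between two arrays of equal shape."""
    return float(np.sqrt(np.mean((a - b) ** 2)))
```

In this sketch a network would supply the (9, 3) coefficient array; an RMSE of the kind reported in the abstract would then be computed between an image rendered under the predicted lighting and one rendered under the ground-truth lighting.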
Keywords
Augmented Reality (AR) / illumination estimation / Spherical Harmonics (SH) coefficient / Vision Transformer (ViT) / multi-head self-attention
Classification
Computer and Automation
Cite this article
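WANG Yang, SONG Shijia, WANG Heqin, YUAN Zhenyu, ZHAO Lijun, WU Qilin. Estimation of Local Illumination Consistency Based on Improved Vision Transformer [J]. Computer Engineering, 2025, 51(2): 312-321, 10.
Funding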
National Natural Science Foundation of China (61871412)
Key Project of the Natural Science Foundation of Anhui Province (KJ2019A0938, KJ2021A1314, KJ2019A0979)
Key Natural Science Project of Anhui Higher Education Institutions (2022AH052899, KJ2019A0979, KJ2019A0511, 2023AH052757)
Open Project of the Anhui Provincial Key Laboratory of Machine Vision Inspection (KLMVI-2023-HIT-11)
Academic Project for Top Talents in Disciplines (Majors) of Anhui Higher Education Institutions (gxbjZD2022147).