Multi-task face age estimation in vision-language multimodality

Open access (OA); Peking University Core Journal (北大核心); CSTPCD

Existing age estimation methods rely on face images alone and cannot fully exploit the linguistic context behind those images. They also tend to optimize a single age estimation task, ignoring information from closely related tasks that could improve model performance. To address these problems, a multi-task face age estimation method based on vision-language multimodality is proposed. The method uses prompt text to supply richer, more accurate image understanding and prior knowledge for age estimation. It also introduces multi-task learning, exploiting the complementarity between tasks to combine an age classification task with an ordinal regression task for better performance. Finally, to obtain reliable predictions, two methods for fusing the multi-task results are studied, weighted averaging and task regression, and ablation experiments on the weighting coefficients of the weighted averaging method are conducted to find a suitable set of weights. The results show that, compared with other state-of-the-art methods, the proposed method reduces the mean absolute error (MAE) by 7.32% on the UTK-FACE dataset; on the Morph Ⅱ dataset it reduces the MAE by 1.20% and improves the cumulative score (CS) by 0.11%.
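The weighted-averaging fusion and the two reported metrics can be illustrated with a minimal sketch. The abstract does not specify how each task head is decoded, what weights were chosen, or which CS tolerance was used, so everything here is an assumption: the function names are hypothetical, the classification head is decoded as an expectation over per-age probabilities, the ordinal head is decoded by counting binary "older than k" outputs above 0.5, and CS uses a 5-year tolerance.

```python
from typing import Sequence


def expected_age(class_probs: Sequence[float], min_age: int = 0) -> float:
    """Decode a classification head (assumed): expectation over per-age probabilities."""
    return sum((min_age + k) * p for k, p in enumerate(class_probs))


def ordinal_age(rank_probs: Sequence[float], min_age: int = 0,
                threshold: float = 0.5) -> int:
    """Decode an ordinal-regression head (assumed): count 'age > k' outputs above threshold."""
    return min_age + sum(1 for p in rank_probs if p > threshold)


def fuse_weighted(age_cls: float, age_ord: float, w_cls: float = 0.5) -> float:
    """Weighted average of the two task predictions; w_cls is a hypothetical weight."""
    if not 0.0 <= w_cls <= 1.0:
        raise ValueError("w_cls must lie in [0, 1]")
    return w_cls * age_cls + (1.0 - w_cls) * age_ord


def mean_absolute_error(pred: Sequence[float], true: Sequence[float]) -> float:
    """MAE: mean of |predicted age - true age| over all samples."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def cumulative_score(pred: Sequence[float], true: Sequence[float],
                     tolerance: float = 5.0) -> float:
    """CS: percentage of samples whose absolute error is within the tolerance (in years)."""
    hits = sum(1 for p, t in zip(pred, true) if abs(p - t) <= tolerance)
    return 100.0 * hits / len(true)
```

An ablation over the weighting coefficient, as the abstract describes, would amount to sweeping `w_cls` over a grid and comparing `mean_absolute_error` on a validation set. The task-regression alternative, in which a learned regressor combines the two outputs, is not sketched here because the abstract gives no details of it.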

He Jiang; Chi Jing; Chi Jiaji; Gao Song

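School of Information and Electrical Engineering, Hebei University of Engineering, Handan, Hebei 056038, China; School of Electrical Engineering, Lappeenranta University of Technology, Lappeenranta, South Karelia 53850, Finland; Handan No. 3 Construction Engineering Co., Ltd., Handan, Hebei 056001, China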

Electronic information engineering

age estimation; visual-language multimodality; multitask learning; weighted average method; prompt text; task regressor

Modern Electronics Technique (《现代电子技术》), 2024, No. 14

Pages 171-176 (6 pages)

Handan Science and Technology Research and Development Program (21422031252)

10.16652/j.issn.1004-373x.2024.14.026
