
Lightweight Face Recognition Algorithm Combining Transformer and CNN

Open access (OA); Peking University Core Journal (北大核心); CSTPCD


With the development of deep learning, convolutional neural networks (CNNs), which enlarge the receptive field by stacking convolutional layers to fuse local features, have become the mainstream approach to face recognition (FR). This approach has two problems. It neglects the global semantic information of the face and pays too little attention to key facial features, which limits recognition accuracy. In addition, stacking many parameter-heavy layers makes the network hard to deploy on resource-constrained devices. To address these problems, an extremely lightweight FR algorithm that combines a Transformer with a CNN, gcsamTfaceNet, is proposed. First, depthwise separable convolutions are used to build the backbone network, which reduces the parameter count. Second, a channel-spatial attention mechanism is introduced to select features in both the channel and spatial domains, which increases the attention given to key facial regions. On this basis, a Transformer module is integrated to capture the global semantic information of the feature maps, overcoming the limitations of CNNs in modeling long-range semantic dependencies and improving the algorithm's perception of global features. With only 6.5 × 10⁵ parameters, gcsamTfaceNet is evaluated on nine validation sets (LFW, CA-LFW, CP-LFW, CFP-FP, CFP-FF, AgeDB-30, VGG2-FP, IJB-B, and IJB-C) and achieves average accuracies of 99.67%, 95.60%, 89.32%, 93.67%, 99.65%, 96.35%, 93.36%, 89.43%, and 91.38%, respectively, a good trade-off between parameter count and performance.
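To illustrate why a backbone built from depthwise separable convolutions needs fewer parameters, the following minimal sketch compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1 × 1 pointwise convolution. The function names and the layer sizes (64 input channels, 128 output channels, 3 × 3 kernel) are assumptions chosen for illustration only; they are not taken from the paper and do not describe the actual gcsamTfaceNet layers.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Weight count of a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in + (c_in if bias else 0)   # one k x k filter per input channel
    pointwise = c_in * c_out + (c_out if bias else 0)  # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes for illustration; not from the paper.
c_in, c_out, k = 64, 128, 3
standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard

print(standard, separable, round(ratio, 3))  # 73728 8768 0.119
```

Without bias terms the ratio equals 1/c_out + 1/k², so for a 3 × 3 kernel and a wide output layer the separable form needs roughly one ninth of the weights of the standard convolution.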

Li Ming (李明); Dang Qingxia (党青霞)

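Hubei Provincial Engineering Research Center for Garment Informatization, Wuhan Textile University, Wuhan 430200; Hubei Key Laboratory of Digital Textile Equipment, Wuhan Textile University, Wuhan 430200
Hubei Provincial Engineering Research Center for Garment Informatization, Wuhan Textile University, Wuhan 430200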

Computer science and automation

lightweight face recognition; convolutional neural network; Transformer; attention mechanism

Computer Engineering and Applications (《计算机工程与应用》), 2024, No. 14, pp. 96-104 (9 pages)

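Supported by the Open Fund of the Hubei Key Laboratory of Digital Textile Equipment (DTL2018021) and the Open Fund of the Hubei Provincial Engineering Research Center for Garment Informatization (184084004).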

10.3778/j.issn.1002-8331.2311-0276
