Computer Technology and Development, 2024, Vol. 34, Issue 10: 69-76, 8. DOI: 10.20165/j.cnki.ISSN1673-629X.2024.0197
DCT-YOLOv5: Designing Object Detection Algorithms from a Frequency Perspective
Abstract
The discrete cosine transform (DCT) is one of the core steps of the JPEG compression algorithm; it converts pixel data in the spatial domain of an image into coefficients in the frequency domain. Algorithms that combine the DCT with deep learning are common, but they do not analyze convolutional structures from a frequency perspective. To further improve object detection performance, we propose an improved algorithm that addresses this gap: DCT-YOLOv5. First, it is shown that convolutional neural networks (CNNs), Transformers, and MLP architectures all implicitly model the frequency domain, which validates two established model design principles: the effective receptive field is always smaller than the theoretical receptive field, and multiple small convolution kernels are preferred to one large convolution kernel. Second, the number of input channels and the convolution kernel size are used to choose a reasonable number of output channels so that the transformation is approximately lossless; the number of channels changes only at the down-sampling stage. Finally, a comparison of the DCT with a convolution that uses fixed parameters shows that the difference between the two stays within ±0.8%. To minimize computation, grouped convolution with a fixed number of channels per group is introduced. The model is benchmarked against YOLOv5, and extensive experiments on the COCO2017 dataset validate the effectiveness of the proposed method. The results show a detection speed of 277.8 FPS and a mAP@.5 of 28.9%, a relative improvement of 1.3% over the benchmark model. The test results indicate that the enhanced model has significantly improved accuracy and can run on platforms with lower computing power.
Keywords
discrete cosine transform / convolutional neural networks / down-sampling / fixed parameter / YOLOv5
Classification
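Information Technology and Security Science
Cite this article
WANG Tao, ZHANG Duzhen. DCT-YOLOv5: Designing Object Detection Algorithms from a Frequency Perspective [J]. Computer Technology and Development, 2024, 34(10): 69-76, 8.
Funding
General Program of Natural Science Research of Jiangsu Higher Education Institutions (19KJB520032)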
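
The abstract describes treating the DCT as a convolution with fixed parameters and choosing the output channel count so that down-sampling is approximately lossless. The following is a minimal sketch of that idea, not the paper's implementation. It assumes a single-channel image, non-overlapping N×N blocks (stride N), and an orthonormal DCT-II basis; all function names are hypothetical. Under these assumptions, one input channel maps to N×N output channels at 1/N resolution, and the original pixels can be recovered exactly up to floating-point error. The paper's grouped convolution and actual channel settings are not specified in the abstract and are not modeled here.

```python
import math
import random


def dct_matrix(n):
    """Orthonormal DCT-II matrix: D[k][i] = a_k * cos(pi * (2i + 1) * k / (2n))."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct_kernels(n):
    """n*n fixed 2-D kernels; kernel (u, v) is the outer product of DCT rows u and v."""
    d = dct_matrix(n)
    return [[[d[u][y] * d[v][x] for x in range(n)] for y in range(n)]
            for u in range(n) for v in range(n)]


def strided_conv(image, kernels, n):
    """Cross-correlate one channel with each kernel at stride n, without padding.

    Returns out[c][by][bx]: one output channel per kernel at 1/n resolution.
    """
    h, w = len(image), len(image[0])
    out = []
    for kern in kernels:
        plane = []
        for by in range(0, h - n + 1, n):
            row = []
            for bx in range(0, w - n + 1, n):
                row.append(sum(kern[y][x] * image[by + y][bx + x]
                               for y in range(n) for x in range(n)))
            plane.append(row)
        out.append(plane)
    return out


def reconstruct(coeffs, kernels, n):
    """Invert the transform: each block is the coefficient-weighted sum of the kernels."""
    bh, bw = len(coeffs[0]), len(coeffs[0][0])
    image = [[0.0] * (bw * n) for _ in range(bh * n)]
    for c, kern in enumerate(kernels):
        for by in range(bh):
            for bx in range(bw):
                coef = coeffs[c][by][bx]
                for y in range(n):
                    for x in range(n):
                        image[by * n + y][bx * n + x] += coef * kern[y][x]
    return image


def max_abs_error(a, b):
    """Largest absolute element-wise difference between two 2-D lists."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


random.seed(0)
N = 4  # block size and stride (an illustrative choice, not taken from the paper)
img = [[random.random() for _ in range(8)] for _ in range(8)]
kernels = dct_kernels(N)
coeffs = strided_conv(img, kernels, N)   # 1 channel at 8x8 -> 16 channels at 2x2
restored = reconstruct(coeffs, kernels, N)
error = max_abs_error(img, restored)
print(len(coeffs), len(coeffs[0]), len(coeffs[0][0]))
print(error < 1e-9)
```

Because the basis is orthonormal, the 16 output channels carry exactly the same information as each 4×4 input block, which is one way to read the abstract's "approximate lossless transformation" at the down-sampling stage. Using fewer output channels than N×N per input channel would discard some frequency components.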