Classification Method of Feeding Intensity of Sea Bass Based on Self-Attention-DSC-CNN6 and Multi-modal Fusion
Open access; listed in the Peking University Core Journals index
Recognizing and classifying feeding intensity is an important step toward precise feeding in aquaculture. Existing feeding practices rely heavily on manual experience, dose feed imprecisely, and waste considerable feed. Classifying fish feeding intensity with multi-modal fusion can combine different types of data (e.g., video, sound, and water quality parameters) and so provide a more comprehensive and accurate basis for feeding decisions. A multi-modal fusion framework that integrates video and audio data is therefore proposed to improve the classification of sea bass feeding intensity. Preprocessed Mel spectrograms and video frame images are each fed into a Self-Attention-DSC-CNN6 (self-attention, depthwise separable convolution, CNN6) optimized model for high-level feature extraction; the extracted features are concatenated, and the fused features are passed to a classifier. The Self-Attention-DSC-CNN6 model improves on the CNN6 algorithm by replacing the traditional convolutional layers with depthwise separable convolution (DSC) to reduce computational complexity, and by introducing a self-attention mechanism to strengthen feature extraction. Experimental results show that the proposed multi-modal fusion framework reaches an accuracy of 90.24% in sea bass feeding intensity classification. The model makes effective use of information from different data sources, improves the understanding of fish behavior in complex environments, strengthens decision-making, and supports timely and accurate feeding strategies, thereby reducing feed waste. The work provides technical support for the intelligent management of aquaculture and lays a foundation for the development of intelligent feeding systems.
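The two ideas the abstract leans on, cheaper convolutions through DSC and feature-level fusion by concatenation, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The channel counts, kernel size, feature lengths, number of intensity classes, and all function names below are hypothetical, since the abstract does not give them, and the self-attention step is omitted.

```python
import math


def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dsc_params(c_in, c_out, k):
    """Weight count of a depthwise separable convolution (bias ignored).

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    return k * k * c_in + c_in * c_out


def concat_fusion(audio_feat, video_feat):
    """Feature-level fusion: join the two feature vectors end to end."""
    return list(audio_feat) + list(video_feat)


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """Linear classifier head: one weight row and one bias per class."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


if __name__ == "__main__":
    # Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
    standard = conv_params(64, 128, 3)
    separable = dsc_params(64, 128, 3)
    print(standard, separable, separable / standard)

    # Hypothetical feature vectors from the audio and video branches.
    fused = concat_fusion([0.2, 0.7], [0.1, 0.9, 0.4])
    # Hypothetical 4-class head with untrained (all-zero) weights.
    weights = [[0.0] * len(fused) for _ in range(4)]
    print(classify(fused, weights, [0.0] * 4))
```

For a k x k kernel, the ratio of separable to standard weights is 1/c_out + 1/k^2, so the saving grows with the number of output channels and with the kernel size. Concatenation leaves each branch's features unchanged and lets the classifier weigh the two modalities jointly.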
LI Daoliang (李道亮); LI Wanchao (李万超); DU Zhuangzhuang (杜壮壮)
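College of Information and Electrical Engineering, China Agricultural University, Beijing 100083 || National Innovation Center for Digital Fishery, Beijing 100083
College of Computer and Information Engineering, Tianjin Agricultural University, Tianjin 300392
College of Information and Electrical Engineering, China Agricultural University, Beijing 100083 || National Innovation Center for Digital Fishery, Beijing 100083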
Fisheries science
sea bass; classification of feeding intensity; multi-modal fusion; Self-Attention-DSC-CNN6
Transactions of the Chinese Society for Agricultural Machinery (《农业机械学报》), 2025 (1)
Pages 16-24 (9 pages)
National Key Research and Development Program of China (2022YFD2001703)