Journal of Nanjing University of Aeronautics and Astronautics (English Edition), 2025, Vol. 42, Issue 3: 322-336 (15 pages). DOI: 10.16356/j.1005-1120.2025.03.005
Pyramid Pooling-Based Vision Transformer for Tool Condition Recognition
Abstract
This study addresses data-driven tool condition recognition, with the aim of making computerized numerical control (CNC) machining more intelligent and improving tool utilization efficiency. Traditional tool monitoring methods, which rely on empirical knowledge or limited mathematical models, struggle to adapt to complex and dynamic machining environments. To address this, we implement real-time tool condition recognition using deep learning. To overcome insufficient recognition accuracy, we propose a pyramid pooling-based vision Transformer network (P2ViT-Net) for tool condition recognition. Using images as input effectively mitigates the issue of low-dimensional signal features. We enhance the vision Transformer (ViT) framework for image classification by developing the P2ViT model and adapt it to tool condition recognition. Experimental results show that the improved P2ViT model achieves 94.4% recognition accuracy, a 10% improvement over the conventional ViT, and outperforms all of the convolutional neural network models used for comparison.
Key words
tool condition recognition / Transformer / pyramid pooling / deep convolutional neural network
Classification
Information technology and security science
Cite this article
ZHENG Kun, LI Yonglin, GU Xinyan, DING Zhiying, ZHU Haihua. Pyramid Pooling-Based Vision Transformer for Tool Condition Recognition[J]. Journal of Nanjing University of Aeronautics and Astronautics (English Edition), 2025, 42(3): 322-336.
Funding
This work was supported by the China Postdoctoral Science Foundation (No. 2024M754122), the Postdoctoral Fellowship Program of CPSF (No. GZB20240972), the Jiangsu Funding Program for Excellent Postdoctoral Talent (No. 2024ZB194), the Natural Science Foundation of Jiangsu Province (No. BK20241389), the Basic Science Research Fund of China (No. JCKY2023203C026), and the 2024 Jiangsu Province Talent Programme Qinglan Project.