Abstract
Objective The aircraft ground maintenance job card is the essential reference for maintenance operations and records. Its handwritten content must be identified from images and stored digitally. Owing to the constraints of civil aviation maintenance rules and manuals, Chinese and English words are often mixed in complex sentences on the same job card, which makes character extraction difficult and reduces recognition precision. This study applies a new multi-branch convolution method, the Re-parameterized and Multi-branch Convolution Algorithm (RMCA), to enhance the recognition of complex structures and similar words, improving mean average precision (MAP) and identification efficiency. The study addresses several problems in handwritten word identification. First, the number of layers in a deep convolutional network affects the results of feature extraction. Second, features extracted from different layers represent different dimensions in the feature matrix. Third, Chinese words vary in structural complexity.
Methods Identification precision is measured as the mean average precision of Top1 and Top5, and identification efficiency is expressed as memory access cost (MAC). Because MAC is difficult to calculate during model training, it is replaced with the number of images processed per second. The improved RMCA algorithm uses the strengths of deep convolution to extract image features related to boundaries and fine details. Deep convolution layers are known for learning features at different abstraction levels, while lower layers capture more localized details. In addition, the size of the convolutional kernel determines the receptive field and the local features captured within a layer. The core of the handwritten word identification model lies in the added convolutional channels and layers, along with an adaptive identification algorithm designed for identical and similar words in handwritten images. Higher identification precision and efficiency serve as reference indices for evaluating the model. Unlike the original re-parameterized structure, the improved RMCA algorithm applies four branches in the initial layers. The kernel sizes in the four branches are set to 1×1 and 7×7, equivalent to a variable-size kernel of 7×7. A subsequent fully connected layer can lose boundary or layer-specific features, making it difficult to meet identification requirements for words with both complex and simple structures. Hence, the improved RMCA algorithm uses spatial features: the fully connected layer is replaced with fully convolutional layers, and the spatial features from the fourth layer are passed to the classifier. This design enables the improved model to adapt automatically to word structures of varying complexity. The improved model comprises four functional components. The innovation of this study lies in two aspects. First, the enhanced re-parameterized structure across multiple stages and branches achieves an effect equivalent to variable convolution. Second, the refined classifier with fully convolutional layers combines features from specific intermediate layers with the output layer, improving precision for complex and similar words and enhancing feature extraction performance. A comparison with the feature outputs of the fourth and fifth layers of traditional models confirms this conclusion: the features of simple Chinese words are more abundant in the fourth layer than in the fifth, whereas complex handwritten Chinese words contain similarly detailed features in both layers.
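The claim that several parallel branches are equivalent to a single 7×7 kernel can be illustrated with a minimal sketch. It rests on assumptions that are not stated in the abstract: the four branch sizes are taken to be 1×1, 3×3, 5×5 and 7×7 (the abstract names only 1×1 and 7×7), the convolution is single-channel with stride 1 and zero padding, and batch normalization and bias terms, which re-parameterized networks normally fold in as well, are omitted. All function names are hypothetical.

```python
import random

def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation, stride 1, zero padding ('same' size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out

def pad_kernel(kernel, size):
    """Embed an odd-sized kernel in the centre of a size x size zero kernel."""
    k = len(kernel)
    off = (size - k) // 2
    padded = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            padded[off + i][off + j] = kernel[i][j]
    return padded

def merge_branches(kernels, size):
    """Sum the zero-padded branch kernels into one equivalent kernel."""
    merged = [[0.0] * size for _ in range(size)]
    for kern in kernels:
        padded = pad_kernel(kern, size)
        for i in range(size):
            for j in range(size):
                merged[i][j] += padded[i][j]
    return merged

rng = random.Random(0)
side = 12
image = [[rng.random() for _ in range(side)] for _ in range(side)]

# Assumed branch sizes (hypothetical; the abstract names only 1x1 and 7x7).
branch_sizes = (1, 3, 5, 7)
branches = [[[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(k)]
            for k in branch_sizes]

# Training-time view: run every branch separately and add the outputs.
branch_outputs = [conv2d_same(image, kern) for kern in branches]
multi = [[sum(out[i][j] for out in branch_outputs) for j in range(side)]
         for i in range(side)]

# Inference-time view: a single merged 7x7 kernel.
merged = merge_branches(branches, 7)
single = conv2d_same(image, merged)

max_diff = max(abs(multi[i][j] - single[i][j])
               for i in range(side) for j in range(side))
print(f"max abs difference: {max_diff:.2e}")
```

Because convolution is linear, the sum of the branch outputs matches the output of the merged kernel up to floating-point rounding, which is why the extra branches add no cost at inference time under these assumptions.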
Results and Discussions The training image datasets for the experiments consist of two groups. Group 1 comprises HWDB1.0-1.1, which covers 3 755 classes of Chinese words and totals 2.68 million images; its test dataset is ICDAR-2013, containing 224 thousand Chinese word images. Group 2 extends Group 1 with English uppercase and lowercase letters (52 classes) and the digits 0 to 9 (10 classes); its test dataset expands ICDAR-2013 with additional images of English letters and digits. The experimental results show improvements in the evaluation indices over other models, including the pre-improvement model. The experiments are divided into two categories: ablation and comparative. The ablation experiment shows that the Top1 and Top5 precision indices improve to 97.50% and 99.79%, respectively. Specifically, altering the 7×7 kernel in the first layer increases precision by 0.3%, while modifying the classifier yields a 0.6% gain. Group 2 achieves a Top1 recognition accuracy of 97.23%. The comparative experiment, based on ten traditional models, shows that the proposed model occupies 69.1 MB of storage, more than the 48.34 MB of the lightweight MobileNetV2, but achieves superior precision (97.50%) and identification speed (1 410 cards per second). Compared with the original RepVGG and ResNet50, the improved model increases precision by 6.90% and 8.43%, and identification speed by 8.8% and 17.2%, respectively.
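For readers unfamiliar with the Top1 and Top5 indices, the following minimal sketch shows how a top-k score is commonly computed. It assumes the indices are read as top-k accuracy, that is, the fraction of test images whose true class is among the k highest-scoring predictions; the abstract does not give the exact formula. The function name and the toy scores are hypothetical and are not taken from the experiments.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Toy classifier outputs for four samples and four classes (illustrative only).
scores = [
    [0.10, 0.70, 0.20, 0.00],  # true class 1: ranked first
    [0.50, 0.10, 0.30, 0.10],  # true class 2: ranked second
    [0.05, 0.15, 0.20, 0.60],  # true class 0: ranked last
    [0.60, 0.25, 0.10, 0.05],  # true class 0: ranked first
]
labels = [1, 2, 0, 0]

top1 = top_k_accuracy(scores, labels, 1)
top2 = top_k_accuracy(scores, labels, 2)
print(top1, top2)
```

With 3 755 or more classes, as in the datasets above, k = 5 plays the role that k = 2 plays in this toy example.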
Conclusions These results confirm that the proposed method enhances word recognition precision and efficiency. Experiments on similar word identification yield consistent findings. In the long term, the improved model is applicable to aircraft maintenance job cards and to other specialized fields that require handwritten word identification.
Key words
offline handwritten Chinese character recognition/fully convolutional network/re-parameterized structure/spatial feature fusion/re-parameterized multi-branch convolutional algorithm
Classification
Computer and automation