智能城市 (Intelligent City), 2026, Vol. 12, Issue 2: 97-100, 4. DOI: 10.19301/j.cnki.zncs.2026.02.021
基于TrOCR的中文手写文本识别研究
Research on Chinese handwritten text recognition based on TrOCR
郑栋方 1, 拥措 1
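Author information
- 1. School of Information Science and Technology, Tibet University, Lhasa, Tibet 850000; Tibet Autonomous Region Key Laboratory of Tibetan Information Technology and Artificial Intelligence, Lhasa, Tibet 850000; Engineering Research Center of Tibetan Information Technology, Ministry of Education, Lhasa, Tibet 850000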
Abstract
Chinese handwritten text recognition has important application value in document digitization. TrOCR, an end-to-end recognition model based on the Transformer, performs well on handwritten text recognition tasks, but its English vocabulary cannot directly represent Chinese characters. This article focuses on the Chinese handwriting recognition task and rebuilds the decoder's vocabulary for Chinese on the CASIA-HWDB2 dataset, adapting TrOCR to Chinese recognition while retaining the feature extraction ability of the pre-trained visual encoder. On this basis, comparative experiments systematically evaluate the impact of data augmentation and Dropout regularization on performance. The experimental results show that the baseline built on the pre-trained model reaches a character accuracy of 98.11% and a sequence accuracy of 81.20%; after data augmentation and Dropout regularization are introduced, the final model reaches a character accuracy of 98.89% and a sequence accuracy of 84.80%. The results provide a basis for TrOCR adaptation and training strategy optimization in Chinese handwritten text recognition.
Key words
handwritten text recognition / TrOCR / data augmentation / regularization
Classification
Information technology and security science
Cite this article
郑栋方, 拥措. 基于TrOCR的中文手写文本识别研究[J]. 智能城市, 2026, 12(2): 97-100, 4.
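The abstract reports two metrics, character accuracy and sequence accuracy, without stating how they are computed. The sketch below shows one common way to compute them, under two assumptions that are not confirmed by the source: character accuracy is taken as one minus the total character-level edit distance divided by the total number of reference characters, and sequence accuracy is taken as the fraction of text lines recognized exactly. The function names `edit_distance` and `accuracies` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def accuracies(refs: List[str], hyps: List[str]) -> Tuple[float, float]:
    """Return (character accuracy, sequence accuracy) over paired text lines.

    Assumed definitions (not taken from the paper):
      character accuracy = 1 - total edit distance / total reference characters
      sequence accuracy  = exactly matching lines / number of lines
    """
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same length")
    total_chars = sum(len(r) for r in refs)
    if total_chars == 0:
        raise ValueError("references must contain at least one character")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    exact_matches = sum(r == h for r, h in zip(refs, hyps))
    return 1.0 - total_errors / total_chars, exact_matches / len(refs)


if __name__ == "__main__":
    # Toy data: the second line has a single wrong character.
    refs = ["手写文本识别", "数据增强"]
    hyps = ["手写文本识别", "数据增弹"]
    char_acc, seq_acc = accuracies(refs, hyps)
    print(f"character accuracy: {char_acc:.2%}, sequence accuracy: {seq_acc:.2%}")
```

Under these definitions a single wrong character costs little character accuracy but fails the whole line, which is consistent with the reported sequence accuracy sitting well below the character accuracy.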