数字图书馆论坛 (Digital Library Forum), 2025, Vol. 21, Issue 10: 12-22, 11. DOI: 10.3772/j.issn.1673-2286.2025.10.002
基于改进图像多标签分类的古籍异体字部件识别研究
Component Recognition of Variant Characters in Ancient Books Based on Improved Image Multi-Label Classification
Abstract
To address the problem that many variant characters in ancient books are stored only as images and cannot be retrieved directly in digital form, this study proposes ECA-HFF-DTA, a model for recognizing variant characters in ancient books based on multi-label classification. The model comprises three modules: an enhanced channel attention (ECA) module, a hierarchical feature fusion classification head (HFF-Head), and a dynamic threshold adjustment (DTA) mechanism. Together they automatically identify the constituent components of a variant character and generate multi-label classification results. Experiments show that the model achieves an F1-score of 0.5783 on the test set, a significant improvement over the baseline, and supports component-based querying of variant characters. However, its recognition performance on variant characters with complex nested structures still needs improvement. The ECA-HFF-DTA model improves component recognition for variant characters in ancient books and offers an approach to the intelligent organization and retrieval of variant characters in large-scale ancient literature.
Keywords (Chinese)
古籍 / 异体字识别 / 多标签分类
Keywords
Ancient Book / Variant Character Recognition / Multi-Label Classification
Classification
Social Sciences
Cite this article
郑澄璐, 王昊, 孟序阳, 杨思敏. 基于改进图像多标签分类的古籍异体字部件识别研究[J]. 数字图书馆论坛, 2025, 21(10): 12-22, 11.
Funding
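This research was supported by the National Natural Science Foundation of China General Program project "Research on Feature Mining and Full Knowledge Graph Construction for Multimodal Intangible Cultural Heritage Data" (No. 72574098); the Fundamental Research Funds for the Central Universities at Nanjing University, AI for HASS project "Research on AI-Empowered Knowledge Graph Construction for Chinese Ancient Historical Texts" (No. 010814370338); the Jiangsu Young Social Science Talents Program; and the Nanjing University Zhongying Young Scholars Program.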