
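Component Recognition of Variant Characters in Ancient Books Based on Improved Image Multi-Label Classification (基于改进图像多标签分类的古籍异体字部件识别研究)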

Zheng Chenglu (郑澄璐), Wang Hao (王昊), Meng Xuyang (孟序阳), Yang Simin (杨思敏)

Digital Library Forum (数字图书馆论坛), 2025, Vol. 21, Issue 10: 12-22, 11. DOI: 10.3772/j.issn.1673-2286.2025.10.002

Component Recognition of Variant Characters in Ancient Books Based on Improved Image Multi-Label Classification

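Zheng Chenglu¹, Wang Hao¹, Meng Xuyang², Yang Simin¹

Author information

  • 1. School of Information Management, Nanjing University, Nanjing 210023; Jiangsu Key Laboratory of Data Engineering and Knowledge Service (Nanjing University), Nanjing 210023
  • 2. School of Information Resource Management, Renmin University of China, Beijing 100872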

Abstract

Many variant characters in ancient books are stored only as images and cannot be retrieved directly by digital search. To address this, the study proposes ECA-HFF-DTA, a model that treats component recognition for variant characters in ancient books as an image multi-label classification task. The model comprises three modules: an enhanced channel attention (ECA) module, a hierarchical feature fusion classification head (HFF-Head), and a dynamic threshold adjustment (DTA) mechanism. Together they automatically identify the constituent components of each variant character and output them as multi-label classification results. Experiments show that the model achieves an F1-score of 0.5783 on the test set, a significant improvement over the baseline, and that it supports component-based querying of variant characters. Recognition performance on variant characters with complex nested structures still needs improvement. Overall, the ECA-HFF-DTA model improves component recognition for variant characters in ancient books and offers ideas for the intelligent organization and retrieval of variant characters in large-scale ancient literature.
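To make the multi-label framing concrete, the following is a minimal sketch of how per-label thresholds turn classifier scores into component sets, how an F1-score can be computed over those sets, and how a component-based query could work. It is not the authors' implementation: the component vocabulary, logits, image ids, and threshold values are all made up for illustration, the fixed per-label thresholds only stand in for the DTA mechanism (whose actual rule is not described in the abstract), and micro-averaging is an assumption since the abstract does not say which F1 variant is reported.

```python
from math import exp

# Hypothetical component vocabulary; the paper's real label set is not given here.
COMPONENTS = ["氵", "木", "口", "日"]


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-x))


def predict_components(logits, thresholds):
    """Return the components whose probability reaches its per-label threshold."""
    return {
        name
        for name, logit, t in zip(COMPONENTS, logits, thresholds)
        if sigmoid(logit) >= t
    }


def micro_f1(predicted, actual):
    """Micro-averaged F1 over parallel lists of predicted and true label sets."""
    tp = sum(len(p & a) for p, a in zip(predicted, actual))
    fp = sum(len(p - a) for p, a in zip(predicted, actual))
    fn = sum(len(a - p) for p, a in zip(predicted, actual))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def query_by_components(index, wanted):
    """Return ids of images whose predicted components include all of `wanted`."""
    return sorted(cid for cid, comps in index.items() if wanted <= comps)


# Made-up logits for three character images (one value per component).
logits = {
    "img_001": [2.0, -1.0, 0.2, -3.0],
    "img_002": [-2.0, 1.5, 1.0, -0.2],
    "img_003": [1.0, 0.5, -2.0, 0.0],
}
# Assumed per-label thresholds; a dynamic scheme would tune these per label.
thresholds = [0.5, 0.5, 0.6, 0.4]

# Made-up ground-truth component sets for the same images.
truth = {
    "img_001": {"氵", "口"},
    "img_002": {"木", "口"},
    "img_003": {"氵", "木", "日"},
}

index = {cid: predict_components(lg, thresholds) for cid, lg in logits.items()}
ids = sorted(index)
score = micro_f1([index[i] for i in ids], [truth[i] for i in ids])

print(index)
print(round(score, 4))
print(query_by_components(index, {"木"}))
```

Using one threshold per label rather than a single global cutoff lets rare or visually ambiguous components be accepted at a different confidence level than common ones, which is the general motivation behind threshold adjustment in multi-label settings.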

Key words

Ancient Book/Variant Character Recognition/Multi-Label Classification

Classification

Social sciences

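Cite this article

Zheng Chenglu, Wang Hao, Meng Xuyang, Yang Simin. Component Recognition of Variant Characters in Ancient Books Based on Improved Image Multi-Label Classification [J]. Digital Library Forum, 2025, 21(10): 12-22, 11.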

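Funding

This research was supported by the General Program of the National Natural Science Foundation of China, "Feature Mining and Full Knowledge Graph Construction for Multimodal Intangible Cultural Heritage Data" (No. 72574098); the Fundamental Research Funds for the Central Universities at Nanjing University, AI for HASS project "AI-Empowered Knowledge Graph Construction for Chinese Ancient Historical Books" (No. 010814370338); the Jiangsu Young Social Science Talents Program; and the Zhongying Young Scholars Program of Nanjing University.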

Digital Library Forum (数字图书馆论坛)

ISSN 1673-2286
