Journal of Modern Information (现代情报), 2025, Vol. 45, Issue 5: 15-23, 98, 10. DOI: 10.3969/j.issn.1008-0821.2025.05.003
Research on an Entity Extraction Method for Changbai Mountain Folk Literature Based on Multi-Task Joint Learning
Multi-Task Learning for Changbai Mountain Folk Literature Entity Extraction
Abstract
[Purpose/Significance] Named entity recognition on folk literature texts supports richer description and presentation of folk literature materials and lays a foundation for building a complete knowledge system of Changbai Mountain's intangible heritage. [Method/Process] This study proposes an entity extraction model for Changbai Mountain intangible heritage folk literature based on BERT-BiGRU-MHA-CRF. A Bidirectional Gated Recurrent Unit (BiGRU) is introduced to better handle long-range dependencies among entities in sentences and to alleviate the vanishing gradient problem. A Multi-head Attention (MHA) mechanism is then added to strengthen the attention weights assigned to key entities, improving entity recognition results. [Result/Conclusion] Compared with the mainstream multi-task joint learning baseline models BERT-CRF and BERT-BiLSTM-CRF, the proposed model achieves the highest precision in named entity recognition of folk literature, at 86.76%. This study provides a preliminary realization of accurate entity recognition for folk literature texts, which supports in-depth analysis and knowledge mining of folk literature materials and helps protect and pass on the cultural memory of Changbai Mountain.
Keywords
digital humanities/multi-task learning/pre-trained model/Changbai Mountain culture/folk literature/named entity recognition
Classification
Computer and Automation
Cite this article
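Zhang Weidong (张卫东), Chen Xipeng (陈希鹏), Li Xinyi (李心怡), Li Fengrui (李奉芮). Research on an Entity Extraction Method for Changbai Mountain Folk Literature Based on Multi-Task Joint Learning [J]. Journal of Modern Information (现代情报), 2025, 45(5): 15-23, 98, 10.
Funding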
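National Social Science Fund of China project "Research on the Organization and Knowledge Discovery of Archival Document Data for Digital Humanities" (Project No. 19BTQ094).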