郑州大学学报(理学版) [Journal of Zhengzhou University (Natural Science Edition)], 2026, Vol. 58, Issue 1: 51-57, 7. DOI: 10.13705/j.issn.1671-6841.2024117
A Chinese Medical Short Text Classification Model with Multi-level Semantic Fusion
Abstract
To address insufficient extraction of key semantic information and reduced robustness in medical short text classification, a text classification model that incorporates multi-level semantic information was proposed. Firstly, preliminary semantic features of the text were captured using a pretrained model. Secondly, critical semantic information was extracted through a capsule network, ensuring that the model could effectively learn the core semantics of short texts. Attention pooling was then applied to focus on document-level information, thereby enhancing the recognition and understanding of medical terminology and concepts. Finally, an adversarial training strategy was introduced to improve the stability and accuracy of the model when faced with ambiguous expressions or perturbed inputs. The effectiveness of the model was validated on three medical text classification datasets: CHIP-CTC, KUAKE_QIC and VSQ. The results showed that, compared with existing models, the F1 values of the proposed model increased on all three datasets, significantly enhancing the classification performance on Chinese medical short texts.
Keywords
Chinese medical data / short text classification / semantic fusion / capsule network / attention pooling
Classification
Information Technology and Safety Science
Cite this article
杨杰, 刘纳, 郑国风, 李晨, 道路. A Chinese Medical Short Text Classification Model with Multi-level Semantic Fusion [J]. Journal of Zhengzhou University (Natural Science Edition), 2026, 58(1): 51-57, 7.
Funding
National Natural Science Foundation of China (62162001)
Natural Science Foundation of Ningxia (2021AAC03224)
University-level Scientific Research Project of North Minzu University (2024XYZJK01)
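
Illustrative note on attention pooling
The abstract names attention pooling as the step that summarizes document-level information but does not give its formulation. The following is only a minimal sketch of one common variant (dot-product scoring followed by a softmax-weighted sum), not the authors' implementation. The function name `attention_pool`, the `query` vector, and the toy inputs are all hypothetical; in a trained model the query would typically be a learned parameter and the token vectors would come from the upstream encoder.

```python
import math


def attention_pool(token_vectors, query):
    """Collapse a sequence of token vectors into one vector.

    Assumption: scores are plain dot products with a single query vector;
    the paper's actual scoring function is not stated in the abstract.
    """
    # Score each token by its dot product with the query vector.
    scores = [sum(t * q for t, q in zip(vec, query)) for vec in token_vectors]
    # Numerically stable softmax turns scores into weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The pooled vector is the weighted sum of the token vectors.
    dim = len(token_vectors[0])
    pooled = [
        sum(w * vec[d] for w, vec in zip(weights, token_vectors))
        for d in range(dim)
    ]
    return pooled, weights


# Toy example: three 2-dimensional "token" vectors (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, 0.5]
pooled, weights = attention_pool(tokens, query)
```

In this toy input the third token has the largest dot product with the query, so it receives the largest weight and contributes most to the pooled vector.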