Journal of Nanjing University of Traditional Chinese Medicine, 2026, Vol. 42, Issue 4: 627-636, 10. DOI: 10.14148/j.issn.1672-0482.2026.0627
Study on TCM Constitution Identification Based on Multi-Level Feature Fusion of TCM Tongue Images
Abstract
OBJECTIVE To integrate multimodal features from tongue images and textual descriptions and construct a hierarchically fused deep learning model for Traditional Chinese Medicine (TCM) constitution identification. METHODS Corresponding tongue diagnosis texts were generated with a large pre-trained language model, forming a multimodal dataset of 945 samples. The proposed TCM-DFM model used ResNet50 to extract image features and BERT to encode text semantics. A gating mechanism in the low-dimensional feature space provided adaptive visual-semantic weighting, and cross-modal attention in the high-dimensional semantic space established pathological feature associations. A dynamic decision fusion mechanism integrated the predictions of the unimodal and multimodal models. On a dataset with six TCM constitution labels, the model was compared with baseline methods such as early fusion and late fusion and evaluated by accuracy, precision, recall, F1-score, and the confusion matrix. RESULTS The TCM-DFM model achieved an accuracy of 84.52%, precision of 82.54%, recall of 84.52%, and F1-score of 83.39%, outperforming all baseline models. Among the multimodal fusion methods compared, the GCAF method reached 83.33% accuracy, a 23.81% gain over the best unimodal model. Ablation tests verified the synergistic effects of the gating and attention mechanisms. Visualization showed that the model concentrated on clinically key tongue regions, aligning with the TCM principle of "inspecting tongue shape". CONCLUSION The proposed model effectively integrates information from tongue images and textual descriptions, overcoming the limitations of unimodal analysis and conventional fusion methods. It significantly improves the accuracy of constitution classification and underscores the essential role of tongue diagnosis in TCM constitution identification.
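The gating step described in METHODS can be illustrated with a minimal sketch. The abstract does not give the gate's formula, so the element-wise sigmoid gate below (fused = g * visual + (1 - g) * textual) is an assumed, commonly used form and not the TCM-DFM implementation. The names `gated_fusion`, `weights`, and `bias` are hypothetical, and the toy vectors merely stand in for projected ResNet50 and BERT features.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_fusion(visual, textual, weights, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Assumed form (not taken from the paper):
        g_i     = sigmoid(weights[i] . [visual; textual] + bias[i])
        fused_i = g_i * visual_i + (1 - g_i) * textual_i
    Returns (fused, gates).
    """
    if len(visual) != len(textual):
        raise ValueError("visual and textual vectors must have the same length")
    joint = list(visual) + list(textual)  # concatenated input to the gate
    fused, gates = [], []
    for i, (v, t) in enumerate(zip(visual, textual)):
        score = sum(w * x for w, x in zip(weights[i], joint)) + bias[i]
        g = sigmoid(score)
        gates.append(g)
        fused.append(g * v + (1.0 - g) * t)
    return fused, gates


# Toy stand-ins for image and text features (hypothetical values).
visual = [1.0, 2.0]
textual = [3.0, 4.0]
# Zero weights and bias give g = 0.5, so the fusion reduces to a plain average.
zero_w = [[0.0] * 4 for _ in range(2)]
fused, gates = gated_fusion(visual, textual, zero_w, [0.0, 0.0])
print(fused, gates)  # [2.0, 3.0] [0.5, 0.5]
```

In this sketch a large positive bias pushes a gate toward 1, so the visual feature dominates that dimension, while a large negative bias pushes it toward 0, so the text feature dominates. In a trained model the weights and bias would be learned rather than set by hand.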
Keywords
TCM constitution identification/multimodal fusion/deep learning/attention mechanism/gating mechanism
Classification
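Medicine and Health
Cite this article
YANG Lei (杨磊), WANG Tianshu (王天舒), YANG Tao (杨涛), HU Kongfa (胡孔法). Study on TCM Constitution Identification Based on Multi-Level Feature Fusion of TCM Tongue Images[J]. Journal of Nanjing University of Traditional Chinese Medicine, 2026, 42(4): 627-636, 10.
Funding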
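National Natural Science Foundation of China, General Program (82575255)
National Science and Technology Innovation 2030 Major Project (2025ZD0544900)
Jiangsu Provincial Frontier Technology Research and Development Program (BF2025076)
Open Project of the Jiangsu Research Institute of Chinese Medicine Schools (JSZYLP2024060)
"Qing Lan Project" of Jiangsu Higher Education Institutions (2024)
Jiangsu Provincial Degree and Graduate Education Teaching Reform Project (JGKT25_B026)
Jiangsu Provincial Postgraduate Research Innovation Program (KYCX25_2268)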