肿瘤预防与治疗, 2026, Vol. 39, Issue 4: 288-297, 10. DOI: 10.3969/j.issn.1674-0904.2026.04.006
基于临床结构化知识的大视觉语言模型在宫颈癌放疗靶区中的勾画及亚组泛化研究
Delineation and Subgroup Generalization of Large Vision-Language Models Based on Clinical Structured Knowledge for Cervical Cancer Radiotherapy Targets
Abstract
Objective: To address the semantic gap between imaging features and clinical guidelines, as well as the insufficient generalization ability of models in automatic target delineation for cervical cancer radiotherapy, a large vision-language delineation model incorporating clinical structured knowledge was developed and evaluated in multi-center and multi-subgroup scenarios.
Methods: Radiotherapy planning CT images and clinical data from 478 cervical cancer patients across 3 medical centers were retrospectively collected. A structured knowledge base incorporating tumor stage, treatment modality, lymph node metastasis risk, and clinical guideline criteria was constructed. Based on the architecture of the large vision model SAM, the K-SAM model was proposed, which aligns imaging features with guideline semantics via a language encoder and a cross-modal attention mechanism. The performance of K-SAM was compared with that of the SAM and U-Net baseline models. Patients were stratified into four subgroups according to clinical characteristics (pelvic early-stage disease, para-aortic involvement, vaginal or vulvar invasion, and para-aortic plus inguinal involvement), and performance was assessed in each subgroup.
Results: The K-SAM model outperformed the baseline models overall, achieving a Dice similarity coefficient (DSC) of 0.89 ± 0.03 and a 95% Hausdorff distance of (5.3 ± 0.9) mm. Model performance improved progressively with the integration of structured knowledge, increasing from a DSC of 0.84 ± 0.04 (imaging only) to 0.89 ± 0.03 (full knowledge integration), with clinical guideline criteria contributing most to the delineation of complex boundaries. In subgroup analyses, K-SAM maintained a stable advantage across all subgroups, with a DSC of 0.91 ± 0.02 in the pelvic early-stage subgroup (PE), 0.87 ± 0.03 in the para-aortic subgroup (PA), 0.89 ± 0.03 in the vaginal or vulvar involvement subgroup (VV), and 0.86 ± 0.04 in the para-aortic plus inguinal involvement subgroup (PI).
Conclusion: By integrating guideline semantics with imaging features, the K-SAM model improves the accuracy and guideline compliance of target delineation and exhibits robust performance in multi-center settings and across complex clinical subgroups, thereby providing reliable technical support for standardized and automated precision radiotherapy.
Keywords
Cervical cancer / Radiotherapy / Automatic target delineation / Vision-language model / Structured medical knowledge
Classification
Medicine and Health
Cite this article
邓佳, 黄登殿, 张盛元, 穆允凤, 丁延慧, 卫未, 李索妮, 赵耀林, 王国庆. 基于临床结构化知识的大视觉语言模型在宫颈癌放疗靶区中的勾画及亚组泛化研究[J]. 肿瘤预防与治疗, 2026, 39(4): 288-297, 10.
Funding
This study was supported by Xi'an Science and Technology Plan Project (No. 24YXJ0224) and Beijing Huakang Public Welfare Foundation (No. EXZL-GX-025).