Chinese Journal of Engineering (工程科学学报), 2025, Vol. 47, Issue 8: 1662-1671 (10 pages). DOI: 10.13374/j.issn2095-9389.2024.10.30.004
Integrating knowledge and data: a cross-modal feature fusion model for few-shot problems
Abstract
The few-shot problem is common in machine learning, particularly in experimental science and medical research. Purely data-driven learning relies heavily on the quality and quantity of data: when data are scarce, a model is prone to overfitting and its generalization ability decreases. Most fields, however, have accumulated extensive experience and knowledge, and a hybrid approach that combines domain knowledge with data can effectively improve model performance. In few-shot settings, though, achieving effective cross-modal feature fusion of knowledge and data remains challenging. This study proposes a knowledge and data cross-modal fusion model (KDFM) to address the few-shot problem. First, numerical modal features are categorized into different feature types and modeled using graphs. For each feature type, the edges of the graph are constructed based on K-means clustering. The different types of numerical features are then processed through multichannel graph convolution. These graphs convert numerical modal features into graph-level features, enhancing their expressiveness. Next, domain knowledge features from the semantic modality are represented by a knowledge graph. Key entities and relationships are extracted from professional books and expert experience. The knowledge graph consists of triples formed by combinations of entities and relationships, which transforms unstructured text features into graph-level features. In this way, textual domain knowledge and experience are organized and converted into a form that the neural network model can use. A graph convolutional neural network and attention mechanisms are employed for cross-modal feature fusion between knowledge and data. The input of the graph convolutional network includes the different graphs constructed from numerical data, feature vectors obtained from the knowledge graph, and numerical vectors from the data. Based on the number of feature types, multichannel graph convolution is applied to achieve deep feature fusion of knowledge and data. The output is a fused multichannel feature vector, computed using the attention mechanism, which serves as the input feature vector for downstream tasks. The proposed model was validated on two small-sample datasets: one for a regression task in the materials field and the other for a classification task in the medical field. Simulation results show that, compared with other data-driven models, the proposed KDFM exhibits excellent performance across the regression and classification tasks. In the regression task, the model achieved the best results in terms of mean squared error, mean absolute error, and R², with R² exceeding that of the second-best model, a multilayer perceptron, by over 7%. In the classification task, the model ranked first on five of seven indicators and second on the remaining two. Additionally, multiple ablation experiments were conducted to verify the effectiveness of the proposed model. By removing the knowledge graph and graph convolutional network modules from the full model, the study confirmed the effectiveness of both the knowledge modeling and the cross-modal fusion mechanism. The proposed model addresses, to some extent, the challenges of weak generalization ability and of integrating the knowledge and data modalities in few-shot problems.
Key words
few-shot/hybrid drive/knowledge graph/graph convolutional network/feature fusion
Classification
Information technology and security science
Cite this article
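柒传江, 王成猛, 邵立珍, 付冬梅, 周珂, 赵志毅. Integrating knowledge and data: a cross-modal feature fusion model for few-shot problems[J]. Chinese Journal of Engineering, 2025, 47(8): 1662-1671, 10.
Funding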
National Key Research and Development Program of China (2021YFB3702402)
Fundamental Research Funds for the Central Universities (FRF-DF-23-003)
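Illustrative sketch

The numerical branch summarized in the abstract (graph edges built from K-means clustering per feature type, one graph-convolution channel per feature type, attention-weighted fusion of the channels) can be illustrated with a small, dependency-free sketch. This is not the authors' implementation, and several details are assumptions made only for illustration: samples are treated as graph nodes, two samples are linked when they fall in the same cluster, the propagation step carries no learned weights, the attention scores are fixed numbers rather than learned ones, and the knowledge-graph branch is omitted. The names `kmeans_labels`, `cluster_adjacency`, `gcn_propagate`, `attention_fuse`, and the toy data are all hypothetical.

```python
import math

def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_labels(points, k, iters=10):
    """Lloyd's k-means; returns one cluster label per point."""
    # Assumption: deterministic farthest-point initialisation, for reproducibility.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_adjacency(labels):
    """Assumption: link two samples iff they share a cluster (self-loops included)."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)] for i in range(n)]

def gcn_propagate(adj, feats):
    """One weight-free propagation step, D^{-1/2} A D^{-1/2} X."""
    deg = [sum(row) for row in adj]  # at least 1 because of self-loops
    n, d = len(feats), len(feats[0])
    out = [[0.0] * d for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                w = adj[i][j] / math.sqrt(deg[i] * deg[j])
                for t in range(d):
                    out[i][t] += w * feats[j][t]
    return out

def attention_fuse(channels, scores):
    """Softmax the per-channel scores and return the weighted sum of channels."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    weights = [e / sum(exps) for e in exps]
    n, d = len(channels[0]), len(channels[0][0])
    fused = [[sum(w * ch[i][t] for w, ch in zip(weights, channels)) for t in range(d)]
             for i in range(n)]
    return fused, weights

# Toy data: 4 samples, 4 numerical features split into two hypothetical feature types.
samples = [
    [0.0, 0.1, 5.0, 5.1],
    [0.1, 0.0, 0.0, 0.2],
    [4.0, 4.1, 5.1, 5.0],
    [4.1, 4.0, 0.1, 0.0],
]
feature_types = {"type_a": [0, 1], "type_b": [2, 3]}

channels = []
for cols in feature_types.values():
    sub = [[row[c] for c in cols] for row in samples]
    adj = cluster_adjacency(kmeans_labels(sub, k=2))  # one graph per feature type
    channels.append(gcn_propagate(adj, samples))      # one channel per graph

# Assumption: equal, fixed scores stand in for learned attention scores.
fused, weights = attention_fuse(channels, scores=[0.0, 0.0])
```

Because each feature type yields its own graph, the same sample can have different neighbours in different channels. The attention step then decides how much each channel contributes to the fused vector passed to a downstream regressor or classifier.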