开放环境下无监督跨模态概念自动提取
Automatic Unsupervised Cross-Modal Concept Extraction in Open Environments
Abstract
As machine learning models grow more complex, the demand for interpretability in their decision-making continues to grow. Concept learning has become increasingly important as a means of enhancing model transparency and comprehensibility, and concepts are now also used to explain the reasoning processes of black-box models such as large language models. Accurate, automatic concept extraction is a crucial step in this interpretability process. To address the shortcomings of existing concept extraction methods, such as reliance on manual annotation, inconsistent granularity, and poor scalability, this paper proposes an automatic cross-modal visual concept extraction framework. The framework employs an inquiry method based on a multimodal large model to automatically extract target objects from images. It integrates the CLIP model with object detection models to establish a region-matching mechanism that enhances extraction accuracy. Additionally, a large language model based on text segmentation is used to extract the attributes and relationships of target objects, ensuring consistency between concepts and data. Furthermore, ConceptNet is incorporated to enrich the semantic information of the extracted concepts, improving the flexibility and applicability of the framework. This paper demonstrates the practical application of the framework in three different scenarios, showcasing its potential for improving the efficiency, accuracy, and scalability of concept extraction.
Keywords
large language model/interpretability/concept extraction/automation/scalability
Classification
Information Technology and Security Science
Cite this article
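HAI Junjia, JING Liping, LIU Huafeng, YU Jian. Automatic Unsupervised Cross-Modal Concept Extraction in Open Environments[J]. Journal of Frontiers of Computer Science and Technology, 2026, 20(1): 154-168, 15.
Funding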
This work was supported by the National Natural Science Foundation of China (62436001, 62406019, 62176020), the Natural Science Foundation of Beijing (4244096), the Talent Foundation of Beijing Jiaotong University (2024XKRC075), the Joint Foundation of the Ministry of Education for Innovation Team (8091B042235), the Fundamental Research Funds for the Central Universities of China (2019JBZ110), and the Project of State Key Laboratory of Rail Traffic Control and Safety of Beijing Jiaotong University (RCS2023K006).