Journal of Frontiers of Computer Science and Technology, 2026, Vol. 20, Issue 1: 154-168, 15. DOI: 10.3778/j.issn.1673-9418.2505051

Automatic Unsupervised Cross-Modal Concept Extraction in Open Environments

Hai Junjia 1, Jing Liping 1, Liu Huafeng 2, Yu Jian 2

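Author information

  • 1. State Key Laboratory of Advanced Rail Autonomous Operation, Beijing 100044; Beijing Key Laboratory of Traffic Data Mining and Embodied Intelligence, Beijing 100044; School of Computer Science and Technology, Beijing Jiaotong University, Beijing 100044
  • 2. Beijing Key Laboratory of Traffic Data Mining and Embodied Intelligence, Beijing 100044; School of Computer Science and Technology, Beijing Jiaotong University, Beijing 100044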

Abstract

As machine learning models become increasingly complex, the demand for interpretability in their decision-making processes continues to grow. Concept learning has become increasingly important in machine learning as a means of enhancing model transparency and comprehensibility. Consequently, the use of concepts to explain the reasoning processes of black-box models, such as large language models, has also developed. Accurate and automatic concept extraction is a crucial step in this interpretability process. To address the shortcomings of existing concept extraction methods, such as reliance on manual annotation, inconsistent granularity, and poor scalability, this paper proposes an automatic cross-modal visual concept extraction framework. The framework employs an inquiry method based on a multimodal large model to automatically extract target objects from images. Integrating the CLIP model with object detection models establishes a region-matching mechanism that enhances extraction accuracy. Additionally, a large language model based on text segmentation is used to extract the attributes and relationships of target objects, ensuring consistency between concepts and data. Furthermore, ConceptNet is incorporated to enrich the semantic information of the extracted concepts, improving the flexibility and applicability of the framework. The paper demonstrates the practical application of the framework in three different scenarios, showcasing its potential for improving the efficiency, accuracy, and scalability of concept extraction.
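The abstract does not specify how the region-matching mechanism works. As a rough illustration only, the sketch below assumes that each detector box and each candidate object name has already been embedded into a shared space (for example by a CLIP-style image and text encoder), and then assigns every concept to its most similar region by cosine similarity. The function names, the `threshold` parameter, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_regions(region_embs, concept_embs, threshold=0.5):
    """Assign each candidate concept to its best-scoring region.

    region_embs:  {region_id: vector}, assumed to be embeddings of detector boxes
    concept_embs: {concept_name: vector}, assumed to be text embeddings of object names
    Returns {concept_name: (region_id, score)}. Concepts whose best score is
    below `threshold` (a hypothetical cutoff) are dropped as unsupported.
    """
    matches = {}
    for name, c_vec in concept_embs.items():
        best_id, best_score = None, -1.0
        for rid, r_vec in region_embs.items():
            score = cosine(c_vec, r_vec)
            if score > best_score:
                best_id, best_score = rid, score
        if best_id is not None and best_score >= threshold:
            matches[name] = (best_id, best_score)
    return matches


# Toy 3-d vectors standing in for real embeddings (made up for illustration).
regions = {"box0": [1.0, 0.0, 0.0], "box1": [0.0, 1.0, 0.0]}
concepts = {
    "dog": [0.9, 0.1, 0.0],
    "ball": [0.1, 0.9, 0.0],
    "car": [0.0, 0.0, 1.0],
}
result = match_regions(regions, concepts)
```

In this toy run, "dog" and "ball" are each paired with the box they most resemble, while "car" is discarded because no region scores above the cutoff. A real pipeline would replace the toy vectors with encoder outputs and would likely tune the cutoff per dataset.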

Key words

large language model/interpretability/concept extraction/automation/scalability

Classification

Information Technology and Security Science

Cite this article

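Hai Junjia, Jing Liping, Liu Huafeng, Yu Jian. Automatic Unsupervised Cross-Modal Concept Extraction in Open Environments[J]. Journal of Frontiers of Computer Science and Technology, 2026, 20(1): 154-168, 15.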

Funding

This work was supported by the National Natural Science Foundation of China (62436001, 62406019, 62176020), the Natural Science Foundation of Beijing (4244096), the Talent Foundation of Beijing Jiaotong University (2024XKRC075), the Joint Foundation of the Ministry of Education for Innovation Team (8091B042235), the Fundamental Research Funds for the Central Universities of China (2019JBZ110), and the Project of State Key Laboratory of Rail Traffic Control and Safety of Beijing Jiaotong University (RCS2023K006).

Journal of Frontiers of Computer Science and Technology

ISSN 1673-9418
