软件导刊 (Software Guide), 2025, Vol. 24, Issue 8: 43-48 (6 pages). DOI: 10.11907/rjdk.241530
Point Cloud Classification Network Based on Multimodal Learning
Abstract
To address the large parameter counts and low training efficiency of 3D point cloud classification networks, which stem from the disorder, sparsity, and complexity of point cloud data, a multimodal-learning-based point cloud classification network model, MPC-CLIP, is proposed. First, the original point cloud is projected from multiple viewpoints into two-dimensional depth maps. Second, the text encoder and image encoder of the pretrained two-dimensional CLIP model are used to extract, respectively, the text-description features of the point cloud and the image features of the projected images, and the correspondence between text features and image features is obtained from the pretrained model. Finally, each image feature is weighted and summed with all text features, and cosine similarity is computed to obtain the final zero-shot classification result. Zero-shot classification and ablation experiments are conducted on the classic point cloud dataset ModelNet40, and the MPC-CLIP model is fused with several classic 3D point cloud classification models. The experimental results show that, after the proposed model is fused with other 3D point cloud classification network models, the average classification accuracy improves by 0.9% to 1.9% and the overall classification accuracy improves by 0.8% to 1.7%, demonstrating the model's effectiveness and robustness.
Keywords
point cloud processing / multimodal learning / CLIP / zero-shot classification
Classification
Information technology and security science
Cite this article
王子澳, 周国鹏, 张建权, 刘煌坤. Point Cloud Classification Network Based on Multimodal Learning [J]. 软件导刊 (Software Guide), 2025, 24(8): 43-48.
Funding
Hubei Province Major Science and Technology Innovation Project (2023EGA023, 2022BBA026, 2021BGD022, 2020BGC028)
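Illustrative sketch

The pipeline summarized in the abstract (multi-view depth projection, then similarity scoring between view features and text features) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The orthographic projection, the depth encoding, the per-view weights, and the function names `rotation_y`, `project_depth_map`, and `zero_shot_logits` are all assumptions made for illustration. The CLIP encoders are not included: the scoring function simply takes precomputed feature arrays, and it follows one plausible reading of the abstract in which per-view cosine similarities are combined by a weighted sum.

```python
import numpy as np


def rotation_y(angle_rad):
    """Rotation matrix about the y axis, used here to pick a viewpoint (assumed)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def project_depth_map(points, rotation, res=32):
    """Orthographic depth map of an (N, 3) point cloud viewed along the z axis.

    Hypothetical encoding: empty pixels are 0, the nearest point maps to 1.0
    and the farthest to 0.1; each pixel keeps its nearest point.
    """
    p = points @ rotation.T
    xy = p[:, :2]
    mins = xy.min(axis=0)
    span = np.maximum(xy.max(axis=0) - mins, 1e-8)
    # Bin normalized x, y coordinates into pixel indices in [0, res - 1].
    pix = np.minimum(((xy - mins) / span * res).astype(int), res - 1)
    z = p[:, 2]
    depth = 1.0 - (z - z.min()) / max(z.max() - z.min(), 1e-8) * 0.9
    img = np.zeros((res, res))
    # Unbuffered maximum so several points in one pixel resolve to the nearest.
    np.maximum.at(img, (pix[:, 1], pix[:, 0]), depth)
    return img


def zero_shot_logits(image_feats, text_feats, view_weights):
    """Weighted sum over views of cosine similarities to each class text feature.

    image_feats: (V, D) one feature per projected view.
    text_feats:  (K, D) one feature per class description.
    view_weights: (V,) assumed per-view weights, normalized to sum to 1.
    Returns a (K,) score vector; argmax gives the predicted class.
    """
    img = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    sims = img @ txt.T  # (V, K) cosine similarities
    w = np.asarray(view_weights, dtype=float)
    return (w / w.sum()) @ sims
```

In practice the depth maps from `project_depth_map` would be passed through a pretrained image encoder and the class descriptions through a text encoder before calling `zero_shot_logits`; those steps are omitted here because the abstract does not give their details.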