Journal of Central South University (Science and Technology), 2016, Vol. 47, Issue 5: 1580-1587, 8 pp. DOI: 10.11817/j.issn.1672-7207.2016.05.018
Heterogeneous multimodal object recognition method based on deep learning
Abstract
A heterogeneous multimodal object recognition method based on deep learning is proposed. First, exploiting the coexistence of video and audio in media data, a heterogeneous multimodal structure is constructed that combines a convolutional neural network (CNN) with a restricted Boltzmann machine (RBM). The audio and video information are processed separately, and a shared feature representation is generated using canonical correlation analysis (CCA). The temporal coherence of video frames is then used to further improve recognition accuracy. Experiments were conducted on a standard audio and face library and on actual movie video fragments. The results show that, for both kinds of video source, the proposed method significantly improves the accuracy of object recognition.
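The abstract does not say how temporal coherence between video frames is exploited. As a minimal sketch of the general idea, assuming the recognizer emits one label per frame and that neighbouring frames usually show the same object, a sliding-window majority vote can suppress isolated misclassifications. The function name `smooth_labels` and the `window` parameter are hypothetical and are not taken from the paper.

```python
from collections import Counter


def smooth_labels(frame_labels, window=5):
    """Replace each frame's label with the majority label in a centred window.

    Illustrative assumption only: the paper's actual use of temporal
    coherence is not described in the abstract.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i, own in enumerate(frame_labels):
        # The window is clipped at the start and end of the sequence.
        lo = max(0, i - half)
        hi = min(len(frame_labels), i + half + 1)
        counts = Counter(frame_labels[lo:hi])
        best = max(counts.values())
        # On a tie, keep the frame's own label rather than switching.
        if counts[own] == best:
            smoothed.append(own)
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


noisy = ["A", "A", "B", "A", "A", "C", "C", "A", "C", "C"]
print(smooth_labels(noisy, window=3))
# ['A', 'A', 'A', 'A', 'A', 'C', 'C', 'C', 'C', 'C']
```

A wider window removes longer runs of errors but also delays the response to a genuine change of object, so the window size would need tuning against the frame rate and shot length of the video.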
Keywords
object recognition / deep learning / convolutional neural network / restricted Boltzmann machine / canonical correlation analysis
Classification
Information technology and security science
Cite this article
MENG Fei, HU Chao, LIU Weirong. Heterogeneous multimodal object recognition method based on deep learning [J]. Journal of Central South University (Science and Technology), 2016, 47(5): 1580-1587, 8.
Funding
Project (XJK014AJC001) supported by the Hunan Provincial Education Science Key Foundation during the 12th Five-Year Plan
Projects (61379111, 61003233, 61202342) supported by the National Natural Science Foundation of China
Project (MCM20121031) supported by the Science Fund of Education Department-China Mobile