现代电子技术 (Modern Electronics Technique), Issue (8): 50-53, 4.
Protein crystallization prediction based on fusion of multi-view features and random forest
Abstract
X-ray crystallography is one of the important methods for determining protein structure; 80%-90% of the known protein structures in the Protein Data Bank (PDB) were obtained with it. However, not all proteins selected for structure determination are crystallizable, which leads to a low success rate for crystallization projects and a serious waste of resources on non-crystallizable proteins. Hence, it is important to develop an accurate and effective method for predicting whether a protein will crystallize. In this study, a new protein crystallization prediction method is proposed that combines the physicochemical characteristics, sequence information, and evolutionary information of a protein. It extracts physicochemical properties, pseudo amino acid composition (PseAAC), and pseudo position-specific scoring matrix (PsePSSM) features, each representing a different view of the protein. A random forest classifier then predicts crystallization from the combined features. Experimental results on a benchmark dataset, under both cross-validation and independent validation tests, show that the proposed method achieves good prediction performance.
Keywords
protein crystallization/pseudo amino acid composition/position-specific scoring matrix/random forest
Classification
Information Technology and Security Science
Cite this article
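Li Qiang (李强), Zheng Yujie (郑宇杰). Protein crystallization prediction based on fusion of multi-view features and random forest [J]. 现代电子技术 (Modern Electronics Technique), 2015, (8): 50-53, 4.
Funding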
Fundamental Research Funds for the Central Universities ()
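
The multi-view feature fusion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which physicochemical properties, PseAAC parameters, or PsePSSM normalization were used. The sketch therefore assumes a single property (the Kyte-Doolittle hydropathy index), Chou's type-1 PseAAC formulation with hypothetical defaults `lam=3` and `w=0.05`, and a PSSM that the caller has already computed and normalized. All function names are illustrative.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy index (assumed property; the paper's choice is unknown).
KD = {"A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
      "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
      "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = list(scale.values())
    mu, sigma = mean(values), pstdev(values)
    return {aa: (v - mu) / sigma for aa, v in scale.items()}


H = standardize(KD)


def physchem_view(seq):
    """Physicochemical view: mean hydropathy and fraction of charged residues."""
    gravy = mean(KD[aa] for aa in seq)
    charged = sum(seq.count(aa) for aa in "DEKR") / len(seq)
    return [gravy, charged]


def pse_aac(seq, lam=3, w=0.05):
    """Sequence view: 20 composition terms plus lam sequence-order terms."""
    n = len(seq)
    if lam >= n:
        raise ValueError("lam must be smaller than the sequence length")
    freqs = [seq.count(aa) / n for aa in AMINO_ACIDS]
    # theta_k: mean squared property difference between residues k positions apart.
    thetas = [mean((H[seq[i]] - H[seq[i + k]]) ** 2 for i in range(n - k))
              for k in range(1, lam + 1)]
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def pse_pssm(pssm, max_lag=2):
    """Evolutionary view from an L x 20 PSSM (assumed already normalized)."""
    n = len(pssm)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the number of rows")
    col_means = [mean(row[j] for row in pssm) for j in range(20)]
    # For each lag and each column: mean squared difference between rows lag apart.
    corr = [mean((pssm[i][j] - pssm[i + g][j]) ** 2 for i in range(n - g))
            for g in range(1, max_lag + 1) for j in range(20)]
    return col_means + corr


def fuse_views(seq, pssm=None, lam=3, max_lag=2):
    """Concatenate the per-view feature vectors into one fused vector."""
    fused = physchem_view(seq) + pse_aac(seq, lam=lam)
    if pssm is not None:
        fused += pse_pssm(pssm, max_lag=max_lag)
    return fused
```

Calling `fuse_views(seq, pssm)` on each protein yields one fixed-length vector per sequence. Those vectors could then be passed to a random forest implementation, for example scikit-learn's `RandomForestClassifier`, although the abstract does not state which implementation or hyperparameters were used.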