Predicting functions of protein receptors through multi-view matrix completion
Protein receptors are an important component of cellular signal transduction and the most important class of drug targets in humans. G protein-coupled receptors (GPCRs) account for the vast majority of them, and about 34% of drugs currently on the market target GPCRs. Accurately annotating the biological functions of GPCR proteins is vital for understanding the physiological processes they are involved in and for targeted drug discovery. Gene Ontology (GO) is the most commonly used way to describe protein function. Both GPCR proteins and GO terms carry information from multiple views, and using this information effectively can improve protein function prediction. This paper therefore proposes MVIMC (Multi-View Inductive Matrix Completion), a multi-view inductive matrix completion method for predicting the GO functions of GPCR proteins. MVIMC exploits view information for both GPCR proteins and GO labels: the GPCR views comprise textual and domain information, and the GO view comprises textual information. Experimental results show that MVIMC achieves prediction probabilities of 68% and 69% for molecular function and biological process, respectively, outperforming the best current matrix completion methods and the methods commonly used in the CAFA protein function prediction competition.
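The abstract does not state the MVIMC objective, so the following is only a rough illustration of generic inductive matrix completion, not the authors' method. It assumes that the protein views (text and domain features) are combined by simple concatenation, that the score matrix is modelled as X W Hᵀ Yᵀ, and that fitting is done by gradient descent with step halving on a masked squared error. All names (`fit_imc`, `objective`), dimensions, and the synthetic data are hypothetical.

```python
import numpy as np

def objective(A, mask, X, Y, W, H, lam):
    """Masked squared error plus L2 penalty for the model A ~ X W H^T Y^T."""
    R = mask * (X @ W @ H.T @ Y.T - A)
    return 0.5 * np.sum(R ** 2) + 0.5 * lam * (np.sum(W ** 2) + np.sum(H ** 2))

def fit_imc(A, mask, X, Y, rank=4, lam=0.1, lr=0.05, steps=200, seed=0):
    """Gradient descent with step halving, so the objective never increases."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((X.shape[1], rank))
    H = 0.1 * rng.standard_normal((Y.shape[1], rank))
    history = [objective(A, mask, X, Y, W, H, lam)]
    for _ in range(steps):
        R = mask * (X @ W @ H.T @ Y.T - A)       # residual on observed entries
        grad_W = X.T @ R @ Y @ H + lam * W
        grad_H = Y.T @ R.T @ X @ W + lam * H
        step = lr
        while step > 1e-10:                      # halve the step until the loss drops
            W_try, H_try = W - step * grad_W, H - step * grad_H
            if objective(A, mask, X, Y, W_try, H_try, lam) < history[-1]:
                W, H = W_try, H_try
                break
            step *= 0.5
        history.append(objective(A, mask, X, Y, W, H, lam))
    return W, H, history

# Synthetic stand-in data (hypothetical sizes, not from the paper).
rng = np.random.default_rng(42)
n_proteins, n_terms = 30, 20
X_text = rng.standard_normal((n_proteins, 5))    # assumed protein text view
X_domain = rng.standard_normal((n_proteins, 3))  # assumed protein domain view
X = np.hstack([X_text, X_domain])                # views combined by concatenation
Y = rng.standard_normal((n_terms, 4))            # assumed GO text view
A = (rng.random((n_proteins, n_terms)) < 0.2).astype(float)     # protein-GO labels
mask = (rng.random((n_proteins, n_terms)) < 0.8).astype(float)  # observed entries

W, H, history = fit_imc(A, mask, X, Y)

# Inductive step: score every GO term for a protein unseen during training.
x_new = rng.standard_normal((1, X.shape[1]))
scores = x_new @ W @ H.T @ Y.T
```

Because the model acts on feature vectors rather than on row and column indices, a new protein can be scored from its features alone, which is what makes the completion "inductive".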
Huang Weixiang (黄玮翔); Ding Ji (丁季); Liu Xiaxu (刘夏栩); Yin Qin (殷勤); Lan Chuangchuang (兰闯闯); Wu Jiansheng (吴建盛)
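School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023; College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210023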
Computer science and automation
G protein-coupled receptors (GPCRs); Gene Ontology; inductive matrix completion; multi-view learning
Journal of Nanjing University (Natural Science), 2024 (001)
1-11 / 11
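National Natural Science Foundation of China (61872198, 61971216); Basic Research Program of the Jiangsu Provincial Department of Science and Technology (BK20201378)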