Application Research of Computers (计算机应用研究), 2023, Vol. 40, Issue 12: 3572-3577, 6. DOI: 10.19734/j.issn.1001-3695.2023.04.0166
基于协同进化信息和深度学习的蛋白质功能预测
Protein function prediction based on coevolutionary information and deep learning
Abstract
Protein function is crucial for understanding the mechanisms of cellular and biological activity and for studying the mechanisms of disease. Traditional experimental and sequence-alignment methods cannot support large-scale protein functional annotation in the face of rapidly growing sequence databases. To address this, this paper proposes the EGNet model, which uses the pre-trained protein language model ESM2 together with one-hot encoding to obtain the protein sequence encoding. The model integrates coevolutionary information between residues, including PI and SPI, through sequence self-attention and physical calculations. The two types of coevolutionary information and the sequence encoding are then used as inputs to a multi-layer cascaded graph convolutional network, which learns the node features of the sequence encoding and achieves end-to-end protein function prediction. Compared with earlier methods, EGNet achieves better performance on the EC category labels in the ENZYME database, reaching an F-score of 0.89 and an AUPR of 0.91. The results indicate that EGNet can achieve good performance in predicting protein function from a single sequence alone, providing a rapid and effective method for protein function annotation.
Key words
protein function / deep learning / coevolutionary information / language model / graph convolutional neural network
Classification
Information technology and security science
Cite this article
WANG Jinlei (王金雷), DING Xueming (丁学明), QIN Qiqi (秦琪琪), PENG Boya (彭博雅). Protein function prediction based on coevolutionary information and deep learning [J]. Application Research of Computers, 2023, 40(12): 3572-3577, 6.
Funding
National Natural Science Foundation of China (11502145)
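Illustrative sketch
The data flow described in the abstract (a per-residue sequence encoding as node features, residue-residue coevolutionary scores as graph edges, stacked graph convolutions, and a pooled sequence-level representation) can be illustrated with a minimal pure-Python sketch. This is not the EGNet implementation. It assumes the generic graph convolution rule ReLU(D^-1/2 (A + I) D^-1/2 H W), uses only the one-hot part of the encoding (ESM2 embeddings are omitted), and the coupling matrix, threshold, weight values, layer sizes, mean pooling, and all function names are hypothetical choices made for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence):
    """Encode each residue as a 20-dimensional one-hot vector."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in sequence:
        row = [0.0] * len(AMINO_ACIDS)
        row[index[aa]] = 1.0
        rows.append(row)
    return rows


def normalized_adjacency(coupling, threshold):
    """Turn a symmetric coupling matrix into edges by thresholding,
    add self-loops, and normalize as D^-1/2 (A + I) D^-1/2."""
    n = len(coupling)
    adj = [[1.0 if (i == j or coupling[i][j] >= threshold) else 0.0
            for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in adj]  # at least 1 because of self-loops
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def gcn_layer(a_hat, features, weights):
    """One graph convolution: ReLU(A_hat @ H @ W)."""
    out = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in out]


def mean_pool(features):
    """Average node features into one sequence-level vector."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]


# Toy inputs (assumptions): a 4-residue sequence and a made-up
# symmetric coupling matrix standing in for coevolutionary scores.
sequence = "MKVA"
coupling = [
    [0.0, 0.8, 0.1, 0.6],
    [0.8, 0.0, 0.7, 0.2],
    [0.1, 0.7, 0.0, 0.9],
    [0.6, 0.2, 0.9, 0.0],
]
# Deterministic placeholder weights; a real model would learn these.
w1 = [[0.05 * ((i + j) % 5) for j in range(8)] for i in range(20)]
w2 = [[0.1 * ((i * j) % 3) for j in range(4)] for i in range(8)]

a_hat = normalized_adjacency(coupling, threshold=0.5)
h = gcn_layer(a_hat, one_hot(sequence), w1)  # 4 nodes x 8 features
h = gcn_layer(a_hat, h, w2)                  # 4 nodes x 4 features
graph_embedding = mean_pool(h)               # one vector per sequence
print(graph_embedding)
```

In a trained model the pooled vector would feed a classification layer over the function labels; that step is left out here because the abstract does not specify it.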