Journal of Xiamen University (Natural Science), 2012, Vol. 51, Issue 1: 139-143, 5.
Bioinformatics Procedure of Large-scale GO Annotation
Abstract
With the rapid development of next-generation sequencing technologies, large volumes of biological data are providing biologists with extensive sequence resources for gene discovery. An important data-mining task is assigning functions to genes, and the most important approach is Gene Ontology (GO) annotation. This study established a large-scale GO annotation pipeline (LSGAP) for EST sequences using bioinformatics methods and software tools. The procedure combines BLAST, B2g4pipe, and WEGO with the Swiss-Prot, InterPro, or Nr protein databases. Users submit EST sequences in FASTA format and obtain visualized GO distribution statistics diagrams, which show how the genes are distributed across different processes. To test the accuracy of LSGAP, the eastern oyster EST sequences published in 2007 were run through the pipeline, and the results indicated that the procedure was accurate and efficient. Compared with other GO annotation software such as Blast2GO (graphical user interface version) and GoBlast, LSGAP has several advantages: it runs BLAST locally, does not require downloading many GO-related databases, and takes less time. Together, these results indicate that LSGAP is an efficient tool for data mining.
Key words
LSGAP/GO annotation/gene function/bioinformatics
Classification
Biological sciences
Citation
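Huang Zixia, Ke Caihuan, Chen Jun. Bioinformatics Procedure of Large-scale GO Annotation [J]. Journal of Xiamen University (Natural Science), 2012, 51(1): 139-143, 5.
Funding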
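National Basic Research Program of China (973 Program) (2010CB126403)
National Natural Science Foundation of China (40976093)
Fujian Province Young Scientific and Technological Talents Innovation Project (2008F3098)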
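
The abstract describes the final step of the pipeline as turning per-sequence GO annotations into GO distribution statistics. As a rough illustration only, the sketch below tallies how many distinct sequences carry each GO term. It assumes a simple tab-separated input (sequence ID followed by its GO IDs); that format, the function name `count_genes_per_go_term`, and the sample IDs are assumptions for this example and are not taken from the LSGAP paper.

```python
from collections import Counter


def count_genes_per_go_term(lines):
    """Count how many distinct sequences are annotated with each GO term.

    Assumed input (hypothetical format): one sequence per line, tab-separated,
    sequence ID first, followed by zero or more GO IDs such as "GO:0005524".
    """
    genes_by_term = {}
    for line in lines:
        fields = line.strip().split("\t")
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank lines and sequences with no GO terms
        gene_id = fields[0]
        for term in fields[1:]:
            if term.startswith("GO:"):
                # A set ensures a sequence is counted once per term,
                # even if the same annotation appears on several lines.
                genes_by_term.setdefault(term, set()).add(gene_id)
    return Counter({term: len(genes) for term, genes in genes_by_term.items()})


if __name__ == "__main__":
    # Made-up example records, not data from the study.
    example = [
        "est1\tGO:0005524\tGO:0006412",
        "est2\tGO:0005524",
        "est3",
    ]
    for term, n in count_genes_per_go_term(example).most_common():
        print(term, n)
```

Counts like these could then be grouped by GO category and plotted. The grouping needs the ontology itself, so it is left out of this sketch.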