Tobacco Science & Technology, 2017, Vol. 50, Issue (10): 1-7, 7.
Assessment of SNP-calling pipelines using tobacco genome resequencing data
Abstract
To select a suitable software pipeline for analyzing large-scale resequencing data of the tobacco genome, nine software pipelines were compared. Three standalone software packages, NGS QC Toolkit, Trimmomatic and ngsShoRT, were used to filter K326 genome sequencing data. The quality-filtered reads were mapped to the Hongda reference genome with two sequence aligners, BWA and Bowtie2. SAMtools, a variant-calling tool, was then used to identify SNPs, and GATK was also used to analyze the results generated by BWA. In total, nine independent VCF files containing SNPs and InDels were obtained. The results showed that the outputs of the nine pipelines differed significantly, and the accuracy rates of the nine SNP-calling pipelines ranged from 55% to 71%. The Trimmomatic_BWA_SAMtools pipeline featured higher efficiency, easier operation and higher precision, and was therefore considered suitable for processing large-scale genomic resequencing data.
Key words
Tobacco/SNP-calling pipeline/Genome/Resequencing
Classification
Biological sciences
Cite this article
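YU Shizhou, CAO Peijian, LI Zefeng, LIN Shifeng, ZHANG Jie, GUO Yushuang, YU Jing, REN Xueliang. Assessment of SNP-calling pipelines using tobacco genome resequencing data[J]. Tobacco Science & Technology, 2017, 50(10): 1-7, 7.
Funding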
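Science and Technology Major Project of China National Tobacco Corporation, "Gene difference and trait association analysis of representative tobacco variety resources" [110201201004(JY-04)]
Key Project of China National Tobacco Corporation, "Research on a drought-resistance evaluation system and breeding of drought-resistant flue-cured tobacco varieties" (110201302004)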
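
The abstract names three read filters, two aligners and two variant callers, but does not list the nine pipelines explicitly. One reading consistent with the text is that SAMtools was run on the output of both aligners while GATK was run only on the BWA output, which gives 3 × 3 = 9 combinations. The sketch below enumerates them under that assumption. The pairing rule and every label other than Trimmomatic_BWA_SAMtools are illustrative, not taken from the paper.

```python
from itertools import product

# Tool names come from the abstract.
FILTERS = ["NGS QC Toolkit", "Trimmomatic", "ngsShoRT"]

# Assumed (aligner, caller) pairings: SAMtools on both aligners,
# GATK only on BWA output, as the abstract seems to imply.
ALIGN_CALL = [("BWA", "SAMtools"), ("Bowtie2", "SAMtools"), ("BWA", "GATK")]


def enumerate_pipelines():
    """Return one label per filter/aligner/caller combination."""
    return [
        # Spaces are stripped so labels follow the Filter_Aligner_Caller
        # pattern seen in "Trimmomatic_BWA_SAMtools".
        "_".join([qc.replace(" ", ""), aligner, caller])
        for qc, (aligner, caller) in product(FILTERS, ALIGN_CALL)
    ]


PIPELINES = enumerate_pipelines()

if __name__ == "__main__":
    for name in PIPELINES:
        print(name)
```

Under these assumptions the list contains nine distinct labels, including the Trimmomatic_BWA_SAMtools pipeline that the abstract recommends.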