Scientia Silvae Sinicae, 2018, Vol. 54, Issue (1): 32-45, 14. DOI: 10.11707/j.1001-7488.20180104
A Method for Identifying Osmanthus fragrans Cultivars Based on the Random Forest Algorithm and SRAP Molecular Markers
Abstract
[Objective] Osmanthus fragrans cultivars are difficult to identify in nursery stock production and landscape application. To address this, the study proposes a classification method based on the random forest algorithm and SRAP molecular markers that identifies cultivars easily, quickly and accurately. [Method] DNA was extracted from 45 O. fragrans cultivars and amplified by PCR with 90 SRAP primer pairs. The fragments were examined by capillary electrophoresis to screen for primer pairs with a high level of polymorphism and stable amplification. The amplification data were used to calculate the polymorphism information content (PIC), the number of patterns, the number of effective patterns, the discriminating power (D), the chi-square value of the pattern distribution (χ²), and the number of pairs of indistinguishable samples (x). Locus data from combinations of primer pairs that discriminate all cultivars were used as training sets to construct classification models based on the random forest algorithm. The models with the best classifying ability were selected according to their generalization ability and classifying quality. [Result] A total of 10 SRAP primer pairs were selected, with a mean PIC of 0.26, a mean number of patterns of 33.9, a mean number of effective patterns of 26.6, a mean D of 0.97, a mean χ² of 21.07 and a mean x of 28.2. Eight classification models (rf1-rf8) were constructed from eight combinations of two primer pairs, each of which discriminates all cultivars. The OOB (out-of-bag) error rates of these models ranged from 0.0044 to 0.0139. Among them, rf5 and rf3 had the strongest generalization ability, while rf8 had the weakest. Model rf1 had the best classifying quality, followed by rf3, rf4, rf5 and rf7, while rf8 had the worst. [Conclusion] Classification models rf1, rf3, rf4, rf5 and rf7 have the strongest classifying ability, with the SRAP primer pair combinations me1/em3 + me9/em6, me4/em5 + me9/em6, me4/em8 + me9/em6, me6/em9 + me9/em6 and me5/em5 + me9/em6, respectively. The weaker the correlation between the selected primer pairs, the stronger the classifying ability of the model. The method proposed in this study can be applied to identify O. fragrans cultivars quickly and accurately.
Key words
Osmanthus fragrans / cultivar identification / classification model / SRAP marker / random forest algorithm
Classification
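Agricultural science and technology
Cite this article
Qiu Shuai (邱帅), Shen Baichun (沈柏春), Li Tingting (李婷婷), Guo Juan (郭娟), Wang Ji (王霁), Sun Lina (孙丽娜), Chen Xuping (陈徐平), Hu Shaoqing (胡绍庆). A Method for Identifying Osmanthus fragrans Cultivars Based on the Random Forest Algorithm and SRAP Molecular Markers [J]. Scientia Silvae Sinicae, 2018, 54(1): 32-45, 14.
Funding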
Zhejiang Province Major Science and Technology Special Project for Breeding of New Agricultural Varieties (2016C02056-12).
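Illustrative sketch
The abstract reports the discriminating power (D) and the number of pairs of indistinguishable samples (x) for each primer pair, but does not give the formulas. The sketch below assumes a common definition: x counts the sample pairs that share an identical banding pattern, and D is one minus x divided by the total number of sample pairs. This assumption is at least numerically consistent with the reported means: 45 cultivars give 990 pairs, and 28.2 / 990 is about 0.028, which leaves a D of about 0.97. The band profiles and the function names are hypothetical and are not taken from the paper.

```python
from collections import Counter


def indistinguishable_pairs(patterns):
    """Count sample pairs that share an identical banding pattern (x)."""
    counts = Counter(patterns)
    return sum(n * (n - 1) // 2 for n in counts.values())


def discriminating_power(patterns):
    """Assumed definition: D = 1 - x / (number of sample pairs)."""
    n = len(patterns)
    total_pairs = n * (n - 1) // 2
    return 1 - indistinguishable_pairs(patterns) / total_pairs


def combine(*primer_patterns):
    """Join the per-sample patterns of several primer pairs into one profile."""
    return [tuple(profile) for profile in zip(*primer_patterns)]


# Hypothetical presence/absence band profiles for five samples.
primer_a = ["101", "101", "110", "011", "011"]
primer_b = ["01", "10", "11", "01", "10"]

x_a = indistinguishable_pairs(primer_a)
d_a = discriminating_power(primer_a)

# A combination is usable as a training set only if it separates all samples.
combined = combine(primer_a, primer_b)
x_ab = indistinguishable_pairs(combined)
d_ab = discriminating_power(combined)

print(f"primer_a: x={x_a}, D={d_a:.2f}")
print(f"primer_a + primer_b: x={x_ab}, D={d_ab:.2f}")
```

In this toy data, neither primer pair separates all five samples on its own, but their combination does, which mirrors the abstract's use of two-primer-pair combinations that discriminate all cultivars.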