Method based on data dividing and integration for predicting signal peptides

Wang Yi, Guo Gongde, Kong Xiangzeng

Computer Engineering and Applications, 2012, Vol. 48, Issue 36: 238-244, 7. DOI: 10.3778/j.issn.1002-8331.1106-0045

Method based on data dividing and integration for predicting signal peptides

Wang Yi (1), Guo Gongde (2), Kong Xiangzeng (1)

Author information

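  • 1. School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 350007
  • 2. Key Laboratory of Network Security and Cryptology, Fujian Normal University, Fuzhou 350007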

Abstract

Because signal peptide sequences vary in length and in amino acid composition, most existing methods for signal peptide prediction use scaling windows to handle these problems, which can lose useful information and produce imbalanced data. To improve prediction performance on the minority class, the data are preprocessed before traditional probabilistic neural networks are used to build classifiers: the majority class is divided into several groups, and each group is combined with the minority samples to form a data subset on which a probabilistic neural network classifier is trained. The ensemble system then combines, by voting, the results of a series of classifiers that operate on two different encodings of the protein sequences. Experiments on the widely used Nielsen dataset show the effectiveness of the proposed algorithm.
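The divide-and-combine step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`kmeans_groups`, `build_subsets`, `pnn_predict`, `ensemble_predict`) are hypothetical, plain k-means stands in for the clustering-based division named in the keywords, the probabilistic neural network is reduced to a per-class Gaussian kernel average with an arbitrary `sigma`, labels are assumed to be 0 (majority) and 1 (minority), and synthetic 2-D points replace encoded protein sequences. Only one feature encoding is used, whereas the paper votes over classifiers built on two encodings.

```python
import numpy as np


def kmeans_groups(x, n_groups, n_iter=20, seed=0):
    """Assign each row of x to one of n_groups clusters (plain k-means)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=n_groups, replace=False)].astype(float)
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for g in range(n_groups):
            if np.any(labels == g):
                centers[g] = x[labels == g].mean(axis=0)
    return labels


def build_subsets(x, y, n_groups):
    """Pair each majority-class group (y == 0) with all minority samples (y == 1)."""
    maj_x, min_x = x[y == 0], x[y == 1]
    labels = kmeans_groups(maj_x, n_groups)
    subsets = []
    for g in range(n_groups):
        group = maj_x[labels == g]
        if len(group) == 0:  # skip a cluster that ended up empty
            continue
        sub_x = np.vstack([group, min_x])
        sub_y = np.concatenate([np.zeros(len(group), int), np.ones(len(min_x), int)])
        subsets.append((sub_x, sub_y))
    return subsets


def pnn_predict(train_x, train_y, test_x, sigma=0.5):
    """Simplified PNN: pick the class with the larger mean Gaussian kernel response."""
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    score0 = kernel[:, train_y == 0].mean(axis=1)
    score1 = kernel[:, train_y == 1].mean(axis=1)
    return (score1 > score0).astype(int)


def ensemble_predict(subsets, test_x, sigma=0.5):
    """Majority vote over one PNN per subset; a tie falls back to class 0."""
    votes = np.stack([pnn_predict(sx, sy, test_x, sigma) for sx, sy in subsets])
    return (votes.sum(axis=0) * 2 > len(subsets)).astype(int)


# Synthetic imbalanced data (assumption): 300 majority points, 30 minority points.
rng = np.random.default_rng(42)
x = np.vstack([rng.normal(0.0, 1.0, (300, 2)), rng.normal(3.0, 0.5, (30, 2))])
y = np.concatenate([np.zeros(300, int), np.ones(30, int)])

subsets = build_subsets(x, y, n_groups=3)
print(ensemble_predict(subsets, np.array([[0.0, 0.0], [3.0, 3.0]])))
```

Because every subset contains all minority samples but only one slice of the majority class, each classifier trains on a far less skewed class ratio than the full dataset offers, and the vote recombines the information spread across the majority groups.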

Key words

signal peptide prediction / imbalanced data sets / clustering dividing / probabilistic neural networks / multiple classifier combination

Classification

Information technology and security science

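Cite this article

Wang Yi, Guo Gongde, Kong Xiangzeng. Method based on data dividing and integration for predicting signal peptides [J]. Computer Engineering and Applications, 2012, 48(36): 238-244, 7.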

Funding

National Natural Science Foundation of China (No. 61070062)

Major Science and Technology Project for University-Industry Cooperation of Fujian Province (No. 2010H6007)

Computer Engineering and Applications

OA / CSCD / CSTPCD

ISSN 1002-8331
