计算机工程2011,Vol.37Issue(11):184-186,189,4.DOI:10.3969/j.issn.1000.3842.2011.11.063
基于FSA的DNA重复体频率统计算法
Statistical Algorithm for DNA Repeats Frequency Based on Finite State Automaton
摘要
Abstract
The existing statistical algorithms for DNA repeats frequency have some defects on efficiency, flexibility and so on.Based on Finite State Automaton(FSA) for string multiple patterns matching, this paper constructs a DNA subsequence comparative automaton, and optimizes the automaton's state transition based on the idea of Knuth-Morris-Pratt(KMP).By on-line scanning DNA database, it can achieve all DNA subsequence's statistics in the whole database, including the frequency statistics of overlapped or nonoverlapped repeats and the longest common subsequence in a designated DNA sequence set.Experimental results show that the algorithm has advantages of efficiency, precise matching, flexible information acquisition, supporting on-line operation and so on.关键词
有限状态自动机/DNA子序列/重复体频率/频率统计算法/最长公共子序列Key words
Finite State Automaton(FSA)/ DNA subsequence/ repeats frequency/ frequency statistical algorithm/ longest common subsequence分类
信息技术与安全科学引用本文复制引用
陈聪,韩建民,贾泂,辛德东..基于FSA的DNA重复体频率统计算法[J].计算机工程,2011,37(11):184-186,189,4.基金项目
2010年浙江省新苗人才计划基金资助项目(2010R404017) (2010R404017)