High-Throughput DNA Sequence Data Compression Method Based on Codebook Index Transformation

Tan Li, Sun Jifeng

Acta Electronica Sinica, Issue (5): 1007-1013 (7 pages). DOI: 10.3969/j.issn.0372-2112.2015.05.026

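Tan Li 1, Sun Jifeng 1

Author information

  • 1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou, Guangdong 510641, China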

Abstract

A novel high-throughput DNA sequence compression method based on codebook index transformation (CITD) is proposed. CITD uses the codebook index transformation (CIT) model to replace the traditional representation of codebook indexes with quaternary values expressed by the four standard base characters, and adopts a simple encoding method to distinguish replaced substrings from non-replaced ones. It then decides whether to apply the Burrows-Wheeler transform (BWT) according to the value of the information entropy, and finally compresses the data with the move-to-front (MTF) transform and Huffman entropy coding. Experimental results on several sequencing data sets show that, in most cases, CITD outperforms the high-throughput DNA sequence compression algorithms cited in the paper.
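
The stages named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the codebook construction, the digit order of the bases, the entropy threshold, or the direction of the entropy test, so `BASES`, `index_to_bases`, `entropy_threshold`, and the rule "apply BWT when entropy is below the threshold" are all assumptions made for illustration. Huffman coding of the resulting ranks is left out.

```python
import math
from collections import Counter
from typing import List, Tuple

BASES = "ACGT"  # assumed digit order: A=0, C=1, G=2, T=3


def index_to_bases(index: int, width: int) -> str:
    """Spell a codebook index as a fixed-width base-4 number using A/C/G/T."""
    if index < 0 or index >= 4 ** width:
        raise ValueError("index does not fit in the given width")
    digits = []
    for _ in range(width):
        index, digit = divmod(index, 4)
        digits.append(BASES[digit])
    return "".join(reversed(digits))


def shannon_entropy(text: str) -> float:
    """Zero-order Shannon entropy in bits per symbol."""
    counts = Counter(text)
    total = len(text)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform via sorted rotations (short inputs only)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def move_to_front(text: str, alphabet: str) -> List[int]:
    """Emit each symbol's current rank, then move that symbol to rank 0."""
    table = list(alphabet)
    ranks = []
    for symbol in text:
        rank = table.index(symbol)
        ranks.append(rank)
        table.insert(0, table.pop(rank))
    return ranks


def transform(text: str, entropy_threshold: float = 1.9) -> Tuple[bool, List[int]]:
    """Optionally apply BWT (assumed rule: entropy below threshold), then MTF."""
    use_bwt = shannon_entropy(text) < entropy_threshold
    staged = bwt(text) if use_bwt else text
    return use_bwt, move_to_front(staged, "$" + BASES)
```

For example, `index_to_bases(27, 3)` spells index 27 (digits 1, 2, 3 in base 4) as `"CGT"`, and `transform("ACGT")` skips the BWT because the entropy of a uniform four-symbol string is 2 bits per symbol. A real encoder would also record the `use_bwt` flag so the decoder knows which path was taken.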

Key words

high-throughput DNA sequence / codebook index transformation (CIT) model / Burrows-Wheeler transform (BWT) / move-to-front (MTF) / information entropy / data compression algorithm

Classification

Information technology and security science

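Cite this article

Tan Li, Sun Jifeng. High-Throughput DNA Sequence Data Compression Method Based on Codebook Index Transformation [J]. Acta Electronica Sinica, 2015, (5): 1007-1013, 7.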

Funding

National Natural Science Foundation of China, Young Scientists Fund (No. 61202292); Natural Science Foundation of Guangdong Province ()

Acta Electronica Sinica

OA / Peking University Core / CSCD / CSTPCD

ISSN 0372-2112
