A lossless compression algorithm for DNA sequences

Soliman, Taysir H.A.; Gharib, Tarek F.; Abo-alian, Alshaimaa; El Sharkawy, M. A.

Abstract


The increasing amount of DNA sequence data requires efficient computational algorithms for performing sequence comparison and analysis. Standard compression algorithms are not able to compress DNA sequences because they do not consider the special characteristics of DNA sequences (i.e., DNA sequences contain several approximate repeats and complementary palindromes). Recently, new algorithms have been proposed to compress DNA sequences, often by detecting long approximate repeats. The current work proposes a Lossless Compression Algorithm (LCA), providing a new encoding method. LCA achieves a better compression ratio than existing DNA-oriented compression algorithms, namely GenCompress, DNACompress, and DNAPack. Copyright © 2009 Inderscience Enterprises Ltd.
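
As a brief illustration of the terms used in the abstract, the sketch below shows what a complementary palindrome is (a sequence equal to its own reverse complement) and the naive 2-bits-per-base packing that DNA-oriented compressors are commonly measured against. It is not the LCA encoding, which the abstract does not describe; the function names `reverse_complement`, `is_complementary_palindrome`, and `pack_2bit` are hypothetical and chosen only for this example, and the input is assumed to contain only the letters A, C, G, and T.

```python
# Illustrative sketch only: this is NOT the LCA encoding from the paper.
# Assumes sequences contain only the uppercase bases A, C, G, T.

# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complement of seq, read in the opposite direction."""
    return seq.translate(COMPLEMENT)[::-1]


def is_complementary_palindrome(seq: str) -> bool:
    """True if seq equals its own reverse complement (e.g. GAATTC)."""
    return seq == reverse_complement(seq)


def pack_2bit(seq: str) -> bytes:
    """Pack four bases per byte using a fixed 2-bit code per base.

    The last byte is zero-padded on the right when len(seq) is not a
    multiple of four; the original length must be stored separately
    for the packing to be reversible.
    """
    codes = {"A": 0, "C": 1, "G": 2, "T": 3}
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | codes[base]
        byte <<= 2 * (4 - len(chunk))  # pad a short final chunk
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    print(is_complementary_palindrome("GAATTC"))
    print(len(pack_2bit("ACGTACGTA")))
```

The fixed 2-bit code ignores repeats and palindromes entirely, which is why algorithms that detect such structure can, in principle, go below 2 bits per base.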


Other data

Title A lossless compression algorithm for DNA sequences
Authors Soliman, Taysir H.A.; Gharib, Tarek F.; Abo-alian, Alshaimaa; El Sharkawy, M. A.
Keywords Approximate repeats;Encoding;LCA;Lossless compression algorithm;Palindrome
Issue Date 1-Jan-2009
Journal International Journal of Bioinformatics Research and Applications 
ISSN 1744-5485
DOI 10.1504/IJBRA.2009.029040
PubMed ID 19887334
Scopus ID 2-s2.0-70350645184

