An Efficient Algorithm to Identify DNA Motifs

Abbass, Mostafa M.; Bahig, Hazem;

Abstract


We consider the problem of identifying motifs that abstracts the task of finding short conserved sites in genomic DNA. The planted (l, d)-motif problem, PMP, is the mathematical abstraction of this problem, which consists of finding a substring of length l that occurs in each s in a set of input sequences S = {s , s , . . ., s } with at most d substitutions. Our propose algorithm combines the voting algorithm and pattern matching algorithm to find exact motifs. The combined algorithm is achieved by running the voting algorithm on t′ sequences, t′ < t. After that we use the pattern matching on the output of the voting algorithm and the reminder sequences, t - t′. Two values of t′ are calculated. The first value of t′ makes the running time of our proposed algorithm less than the running time of voting algorithm. The second value of t′ makes the running time of our proposed algorithm is minimal. We show that our proposed algorithm is faster than the voting algorithm by testing both algorithms on simulated data from (9, d ≤ 2) to (19, d ≤ 7). Finally, we test the performance of the combined algorithm on realistic biological data. © 2013 Springer Basel. i 1 2 t


Other data

Title An Efficient Algorithm to Identify DNA Motifs
Authors Abbass, Mostafa M.; Bahig, Hazem 
Keywords DNA motifs;Exact algorithms;Pattern matching;Planted (l, d)-motif;Transcription factor binding sites
Issue Date 1-Dec-2013
Publisher SPRINGER BASEL AG
Journal Mathematics in Computer Science 
Volume 7
Start page 387
End page 399
ISSN 16618270
DOI 10.1007/s11786-013-0165-6
Scopus ID 2-s2.0-84895910996
Web of science ID WOS:000214008300002

Recommend this item

Similar Items from Core Recommender Database

Google ScholarTM

Check

Citations 5 in scopus


Items in Ain Shams Scholar are protected by copyright, with all rights reserved, unless otherwise indicated.