Deep Learning for Taxonomic Classification of Biological Bacterial Sequences

Helaly, Marwah A.; Rady, Sherine; Aref, Mostafa M.

Abstract


Biological sequence classification is a key task in bioinformatics. For research labs today, classifying unknown biological sequences is essential for identifying, grouping and studying organisms and their evolution. This work focuses on the taxonomic classification of bacterial species into their hierarchical taxonomic ranks. Barcode sequences of the 16S rRNA dataset, which are known for their relatively short sequence lengths and highly discriminative characteristics, are used for classification. Several combinations of sequence representation and CNN architecture are considered, each tested with the aim of finding the best approaches for efficient and effective taxonomic classification. The sequence representations include k-mer based representations, integer encoding, one-hot encoding and the use of embedding layers in the CNN. Experimental results and comparisons show that representations that retain some sequential information about a sequence perform much better than a raw representation. A maximum accuracy of 91.7% was achieved with a deeper CNN when the employed sequence representation was more representative of the sequence. With less representative representations, however, a wide and shallow network was able to extract information efficiently and reach a reasonable accuracy of 90.6%.
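To make the sequence representations named in the abstract more concrete, the following is a minimal Python sketch of integer encoding, one-hot encoding and a k-mer count vector. It is an illustration under stated assumptions, not the authors' implementation: the four-letter alphabet `ACGT`, the function names, the base-to-integer mapping and the choice of `k` are all hypothetical, and the abstract does not specify how ambiguous bases were handled.

```python
from collections import Counter
from itertools import product

# Assumption: sequences are stored as DNA letters; anything else is treated as unknown.
ALPHABET = "ACGT"


def integer_encode(seq):
    """Map each base to an integer (A=1, C=2, G=3, T=4); unknown bases map to 0."""
    lookup = {base: i + 1 for i, base in enumerate(ALPHABET)}
    return [lookup.get(base, 0) for base in seq.upper()]


def one_hot_encode(seq):
    """Return one row per base with a single 1 in that base's column.

    Unknown bases produce an all-zero row.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def kmer_counts(seq, k=3):
    """Count overlapping k-mers and return a fixed-length vector of size 4**k.

    The vector follows the lexicographic order of all possible k-mers over
    ALPHABET; k-mers containing unknown bases are ignored.
    """
    seq = seq.upper()
    vocab = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts.get(kmer, 0) for kmer in vocab]


if __name__ == "__main__":
    example = "ACGTAC"  # toy sequence, not taken from the dataset
    print(integer_encode(example))
    print(one_hot_encode(example))
    print(kmer_counts(example, k=2))
```

Integer and one-hot encodings keep one entry per position, so their length grows with the sequence, whereas the k-mer vector has a fixed length of 4**k regardless of sequence length. An embedding layer, the fourth option mentioned in the abstract, would typically take the integer-encoded sequence as its input.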


Other data

Title: Deep Learning for Taxonomic Classification of Biological Bacterial Sequences
Authors: Helaly, Marwah A.; Rady, Sherine; Aref, Mostafa M.
Keywords: Biological sequences; Classification; Convolutional neural networks; Deep learning; DNA; Feature representation; RNA
Issue Date: 1-Jan-2021
Journal: Studies in Big Data
Start page: 393
End page: 413
ISBN: 978-3-030-59337-7; 978-3-030-59338-4
ISSN: 2197-6503
DOI: 10.1007/978-3-030-59338-4_20
Scopus ID: 2-s2.0-85132933090
