Research Article

Analysis of Biological Sequence Search Performance in NoSQL Database

by Quezia N. Flach, Arthur F. Lorenzon, Marcelo C. Luizelli, Fabio D. Rossi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 176 - Number 42
Year of Publication: 2020

Quezia N. Flach, Arthur F. Lorenzon, Marcelo C. Luizelli, Fabio D. Rossi. Analysis of Biological Sequence Search Performance in NoSQL Database. International Journal of Computer Applications. 176, 42 (Jul 2020), 1-6. DOI=10.5120/ijca2020920416

Genomic research generates large volumes of data that demand high processing power. Computational tools have softened the impact of this volume, but processing and storing genomic data remains a challenge. In this work, we evaluate the performance of a distributed NoSQL database, and look for a more feasible solution in terms of performance, by analyzing the behavior of the NoSQL database DynamoDB when carrying out a genomic search. The results show that NoSQL databases offer better scalability and performance than relational databases, and that they come close to the performance of high-performance applications running in multiprocessing environments. NoSQL databases follow a newer model, shaped by the way data is organized in the big data and data science areas, that provides storage for large volumes of data and processing capacity far superior to that of relational databases. Because NoSQL approaches operate over distributed data, they are more flexible with respect to performance: they can keep growing across additional resources according to database demand. This makes them an ideal tool for metagenomic search.

Index Terms

Computer Science
Information Sciences


BLAST, NCBI, Sequence alignment, Metagenomics, Taxonomy assignment