Research Article

Analysis of Biological Sequence Search Performance in NoSQL Database

by Quezia N. Flach, Arthur F. Lorenzon, Marcelo C. Luizelli, Fabio D. Rossi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 176 - Number 42
Year of Publication: 2020
Authors: Quezia N. Flach, Arthur F. Lorenzon, Marcelo C. Luizelli, Fabio D. Rossi
10.5120/ijca2020920416

Quezia N. Flach, Arthur F. Lorenzon, Marcelo C. Luizelli, Fabio D. Rossi. Analysis of Biological Sequence Search Performance in NoSQL Database. International Journal of Computer Applications. 176, 42 (Jul 2020), 1-6. DOI=10.5120/ijca2020920416

@article{ 10.5120/ijca2020920416,
author = { Quezia N. Flach and Arthur F. Lorenzon and Marcelo C. Luizelli and Fabio D. Rossi },
title = { Analysis of Biological Sequence Search Performance in NoSQL Database },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2020 },
volume = { 176 },
number = { 42 },
month = { Jul },
year = { 2020 },
issn = { 0975-8887 },
pages = { 1-6 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume176/number42/31480-2020920416/ },
doi = { 10.5120/ijca2020920416 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:41:05.874593+05:30
%A Quezia N. Flach
%A Arthur F. Lorenzon
%A Marcelo C. Luizelli
%A Fabio D. Rossi
%T Analysis of Biological Sequence Search Performance in NoSQL Database
%J International Journal of Computer Applications
%@ 0975-8887
%V 176
%N 42
%P 1-6
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Genomic research generates large volumes of data and therefore demands high processing power. Computational tools have softened the impact of this data volume, but processing and storing genomic data remains a challenge. In this work, we evaluate the performance of a distributed NoSQL database, analyzing the behavior of DynamoDB when carrying out a genomic search, with the aim of presenting a more feasible solution in terms of performance. The results showed that NoSQL databases scale and perform better than relational databases, and that their performance comes very close to that of high-performance applications running in multiprocessing environments. NoSQL databases follow a newer model, originating in the big data and data science areas, whose data arrangement provides storage for large volumes of data and processing capacity far superior to that of relational databases. Applying NoSQL approaches to distributed data also makes performance more flexible, since resources can grow according to database demand. These properties make them well suited to metagenomic search.

Index Terms

Computer Science
Information Sciences

Keywords

BLAST, NCBI, Sequence alignment, Metagenomics, Taxonomy assignment