International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 145 - Number 15
Year of Publication: 2016
Authors: Siddu P. Algur, Leena I. Sakri |
10.5120/ijca2016910882 |
Siddu P. Algur, Leena I. Sakri. An Efficient Bulk Synchronous Parallelized Scheduler for Bioinformatics Application on Public Cloud. International Journal of Computer Applications. 145, 15 (Jul 2016), 22-30. DOI=10.5120/ijca2016910882
Genomic sequence alignment across species is one of the most sought-after applications in bioinformatics. Future bioinformatics technologies are expected to produce terabytes of genomic data. Sequence alignment at this scale typically requires a supercomputer, which involves substantial cost. Parallelization offers a way to compute sequence alignments at limited cost and in limited time. Cloud computing and the MapReduce framework play an important role in achieving parallelization for data-intensive bioinformatics applications, since they provide consistent performance over time and a good fault-tolerance mechanism. Existing gene sequencing methodologies are built on the Hadoop MapReduce framework, which adopts a serial execution strategy; this is an area of concern. This work introduces Smith-Waterman alignment on a Bulk Synchronous Parallel MapReduce (SW-BSPMR) cloud platform for bioinformatics gene sequence alignment. It adopts the widely accepted and accurate Smith-Waterman (SW) algorithm for sequence alignment and a parallel synchronous scheduling methodology for the map and reduce processes. A customized MapReduce framework is developed on the Microsoft Azure cloud platform to overcome this issue in the Hadoop MapReduce framework. The experimental study presented in this work shows that SW-BSPMR can accurately and effectively align genomic sequences of various read lengths.