Research Article

Dynamic Smith-Waterman Algorithm: A High-Performance Grid-Enabled Application Integrated with Globus, GridSphere Portal Framework and CoG Workflow for Performing Biological Local Sequence Alignment

by Md. Khairul Bashar Chowdhury, Tanzeem Bin Noor, Md. Rubaiyet Sadi, Md. Mahbub-ul-alam
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 58 - Number 19
Year of Publication: 2012
Authors: Md. Khairul Bashar Chowdhury, Tanzeem Bin Noor, Md. Rubaiyet Sadi, Md. Mahbub-ul-alam
DOI: 10.5120/9393-3831

Md. Khairul Bashar Chowdhury, Tanzeem Bin Noor, Md. Rubaiyet Sadi, Md. Mahbub-ul-alam. Dynamic Smith-Waterman Algorithm: A High-Performance Grid-Enabled Application Integrated with Globus, GridSphere Portal Framework and CoG Workflow for Performing Biological Local Sequence Alignment. International Journal of Computer Applications. 58, 19 (November 2012), 38-45. DOI=10.5120/9393-3831

@article{ 10.5120/9393-3831,
author = { Md. Khairul Bashar Chowdhury and Tanzeem Bin Noor and Md. Rubaiyet Sadi and Md. Mahbub-ul-alam },
title = { Dynamic Smith-Waterman Algorithm: A High-Performance Grid-Enabled Application Integrated with Globus, GridSphere Portal Framework and CoG Workflow for Performing Biological Local Sequence Alignment },
journal = { International Journal of Computer Applications },
issue_date = { November 2012 },
volume = { 58 },
number = { 19 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 38-45 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume58/number19/9393-3831/ },
doi = { 10.5120/9393-3831 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:02:58.319487+05:30
%A Md. Khairul Bashar Chowdhury
%A Tanzeem Bin Noor
%A Md. Rubaiyet Sadi
%A Md. Mahbub-ul-alam
%T Dynamic Smith-Waterman Algorithm: A High-Performance Grid-Enabled Application Integrated with Globus, GridSphere Portal Framework and CoG Workflow for Performing Biological Local Sequence Alignment
%J International Journal of Computer Applications
%@ 0975-8887
%V 58
%N 19
%P 38-45
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The Smith-Waterman algorithm is one of the most widely used algorithms in biological research and a well-known approach for gaining information about unknown genes and proteins. Because execution time and accuracy are critical when handling large-scale datasets, a grid environment can provide more reliable, high-throughput, and efficient parallelism. Adapting the Smith-Waterman algorithm to a grid environment, however, raises several concerns, including fault tolerance, variability in resource performance, workload distribution, and application availability. The work presented here develops a Dynamic Smith-Waterman algorithm metascheduler that handles all the details of job submission on the grid for the end user performing a local alignment search. In addition, a web-based portal built on the GridSphere portal framework, integrated with Globus 4 and the Java Commodity Grid Kit, reduces the complexity end users face in accessing, managing, and manipulating grid resources and applications. The main contribution of the Dynamic Smith-Waterman algorithm is a reduction in total job execution time of up to 52%, with accuracy of up to 99.99% and resource utilization improved by 40%. This work can also serve as a template for developing similar applications in the future.
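
For readers unfamiliar with the underlying method, the following is a minimal single-machine sketch of classic Smith-Waterman local alignment with a linear gap penalty. It is not the grid-enabled "Dynamic" variant or the metascheduler described in the abstract. The function name `smith_waterman` and the scoring values (match = 2, mismatch = -1, gap = -1) are assumptions chosen for illustration only.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Scores are floored at 0 so an alignment can restart anywhere.
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,      # gap in b
                          H[i][j - 1] + gap)      # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTTT", "GGACGGG"))
```

The score matrix requires time and memory proportional to the product of the two sequence lengths, which is one reason large-scale database searches are commonly distributed across many machines.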

Index Terms

Computer Science
Information Sciences

Keywords

Grid, Globus, GridSphere, Java Commodity Grid Kit, Scheduling, Smith-Waterman algorithm