CFP last date
20 December 2024
Reseach Article

PTM-MatAlign: A Fast GPU-based Algorithm for Pairwise Protein Structure Alignment

by Nada M. A. Mohammed, Hala M. Ebeid, Mostafa G. M. Mostafa, Mahmoud E. A. Gadallah
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 176 - Number 19
Year of Publication: 2020
Authors: Nada M. A. Mohammed, Hala M. Ebeid, Mostafa G. M. Mostafa, Mahmoud E. A. Gadallah
10.5120/ijca2020920147

Nada M. A. Mohammed, Hala M. Ebeid, Mostafa G. M. Mostafa, Mahmoud E. A. Gadallah . PTM-MatAlign: A Fast GPU-based Algorithm for Pairwise Protein Structure Alignment. International Journal of Computer Applications. 176, 19 ( May 2020), 31-40. DOI=10.5120/ijca2020920147

@article{ 10.5120/ijca2020920147,
author = { Nada M. A. Mohammed, Hala M. Ebeid, Mostafa G. M. Mostafa, Mahmoud E. A. Gadallah },
title = { PTM-MatAlign: A Fast GPU-based Algorithm for Pairwise Protein Structure Alignment },
journal = { International Journal of Computer Applications },
issue_date = { May 2020 },
volume = { 176 },
number = { 19 },
month = { May },
year = { 2020 },
issn = { 0975-8887 },
pages = { 31-40 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume176/number19/31309-2020920147/ },
doi = { 10.5120/ijca2020920147 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:42:58.299569+05:30
%A Nada M. A. Mohammed
%A Hala M. Ebeid
%A Mostafa G. M. Mostafa
%A Mahmoud E. A. Gadallah
%T PTM-MatAlign: A Fast GPU-based Algorithm for Pairwise Protein Structure Alignment
%J International Journal of Computer Applications
%@ 0975-8887
%V 176
%N 19
%P 31-40
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Although the pairwise protein three-dimensional (3D) structure alignment is vital in structural bioinformatics, its complexity is categorized as non-deterministic polynomial-time hard (NP-hard). Hence, researchers strive to develop algorithms to overcome the heavy computation complexity. Most of their attempts tend to achieve more accurate alignment results regardless of the computational execution time. Therefore, finding a fast alignment algorithm with accurate results is still an outstanding task. Recently, General Purpose Graphical Processing Units (GPGPUs) can execute the many time-consuming algorithms faster than the CPUs can. This paper proposes the GPU-based implementation of the MatAlign algorithm which is based on the two-level alignment of protein. This GPU implementation yields about 11 increase in speed over its CPU-based, single-core implementation on GPU GeForce GTX 860M (640 cores, 2GB RAM) and Intel Core i7-4710HQ (2.50GHz, 8GB RAM, 8 cores) CPU. In order to achieve more accurate results, PTM-MatAlign is implemented to use the Template Modeling Score (TM-score) instead of the MatAlign regular score function.

References
  1. Morange, M. 1999. A History of Molecular Biology. BioScience, 49(11), 929-931.
  2. Burkowski, F.J. 2009. Structural bioinformatics: an algorithmic approach (Vol. 20): Chapman & Hall/CRC.
  3. Liisa , H., and S. Chris. 1993. Protein Structure Comparison by Alignment of Distance Matrices. Journal of Molecular Biology, 233(11), 123–138.
  4. Orengo, C.A., and W.R. Taylor. 1996. SSAP: sequential structure alignment program for protein structure comparison. Methods Enzymol, 266, 617-635.
  5. Zhang, Y., and J. Skolnick. 2005. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Research, 33(7), 2302-2309.
  6. Jianhua, Z., and W. Zhiping. 2004. FAST: A novel protein structure alignment algorithm. Proteins: Structure, Function, and Bioinformatics, 58(3), 618-627.
  7. Zeyar, A., and T. Kian-Lee. 2006. MatAlign: PRECISE PROTEIN STRUCTURE COMPARISON BY MATRIX ALIGNMENT. Journal of Bioinformatics and Computational Biology, 4(6), 1197-1216.
  8. Nicolas, B., and M. Pierre-Francois. 2012. LNA: Fast Protein Structural Comparison Using a Laplacian Characterization of Tertiary Structure. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(5), 1451-1458.
  9. Inken , W., M.-D. Noel , and A. Rumen 2012. CSA: comprehensive comparison of pairwise protein structure alignments. Nucleic Acids Research, 303-309.
  10. Genki, T., and T.-S. Mayuko. 2015. CAB-Align: A Flexible Protein Structure Alignment Method Based on the Residue-Residue Contact Area. PLoS ONE, 10(10).
  11. Jean-Francois, G., M. Thomas, and H.B. Stephen. 1996. Surprising similarities in structure comparison. Current Opinion in Structural Biology, 6(3), 377-385.
  12. Shindyalov, I.N., and P.E. Bourne. 1998. Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng, 11(9), 739-747.
  13. Yuzhen, Y., and G. Adam. 2003. Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics, 19(2), 246-255.
  14. Maxim, S., N. Ruth, and J.W. Haim. 2004. FlexProt: Alignment of Flexible Protein Structures Without a Predefinition of Hinge Regions. Journal of Computational Biology, 11(1), 83-106.
  15. Mathilde, C., B. Sophie, and P. Joël. 2005. YAKUSA: A fast structural database scanning method. Proteins: Structure, Function, and Bioinformatics, 61, 137-151.
  16. Matthias, H., and F. Dmitrij. 2004. STRIDE: a web server for secondary structure assignment from known atomic coordinates of proteins. Nucleic Acids Research, 32, 500-502.
  17. Zheng, W. 2008. CLePAPS: Fast Pair Alignment of Protein Structures Based on Conformational Letters. Journal of Bioinformatics and Computational Biology, 6(2), 347-366.
  18. Potestio, R., et al. 2010. ALADYN: a web server for aligning proteins by matching their large-scale motion. Nucleic Acids Research, 38(2), 41-45.
  19. Paweł, D., and L. Bogdan. 2011. A novel method to compare protein structures using local descriptors. BMC Bioinformatics, 12(344).
  20. Shintaro, M., S. Kengo, and C. George. 2013. MICAN: a protein structure alignment algorithm that can handle multiple-chains, inverse alignments, Ca only models, alternative alignments, and non-sequential alignments. BMC Bioinformatics, 14(24).
  21. Silvia, C., et al. 2013. ASSIST: a fast versatile local structural comparison tool. Bioinformatics, 30(7).
  22. Noel, M.-D., and P. Natasa. 2014. GR-Align: fast and flexible alignment of protein 3D structures using graphlet degree similarity. Bioinformatics, 30(9), 1259-1265.
  23. Shanshan, C., Z. Yang, and B. Charles. 2015. PCalign: a method to quantify physicochemical similarity of protein-protein interfaces. BMC Bioinformatics, 16(33).
  24. Wohlers, I., F.S. Domingues, and W.K. Gunnar 2010. Towards optimal alignment of protein structure distance matrices. Bioinformatics, 26, 2273-2280.
  25. Linial, K. 20004. Approximate Protein Structural Alignment in Polynomial Time. Proceedings of the National Academy of Sciences of the United States of America(101), 12201–12206.
  26. Peter, A.J., P.R. Andrew, and G.M. Helen. 1994. A Graph-theoretic Approach to the Identification of Three-dimensional Patterns of Amino Acid Side-chains in Protein Structures. Journal of Molecular Biology, 243(2), 327–344.
  27. Andrew , W.C., B. Neera , and T. Janet 1997. TESS: a geometric hashing algorithm for deriving 3D coordinate templates for searching structural databases. Application to enzyme active sites. Protein Science, 6(11), 2308-2323.
  28. Nathaniel, L., N. Ruth , and W.J. Haim 2001. MUSTA - A General, Efficient, Automated Method for Multiple Structure Alignment and Detection of Common Motifs: Application to Proteins. Journal of Computational Biology, 8(2), 93-121.
  29. Jonathan, B.A., and T.M. Janet 2003. An algorithm for constraint-based structural template matching: application to 3D templates with statistical analysis. Bioinformatics, 19(13), 1644-1649.
  30. Pramod , W.P., et al. 2003. Functional Sites in Protein Families Uncovered via an Objective and Automated Graph Theoretic Approach. Journal of Molecular Biology, 326(3), 955–978.
  31. Alexander , S., and R.B. Robert. 2003. Annotation in three dimensions. PINTS: Patterns in Non-homologous Tertiary Structures. Nucleic Acids Research, 31(13), 3341-3344.
  32. Nurcan, T., et al. 2011. Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM. Nat. Protocols, 6(9), 1341-1354.
  33. Kundrotas, P.J., Z. Zhengwei, and J. Joël. 2012. Templates are available to model nearly all complexes of structurally characterized proteins. Proceedings of the National Academy of Sciences of the United States of America, 109(24), 9438-9441.
  34. Kundrotas, P.J., and I.A. Vakser. 2013. Global and local structural similarity in protein–protein complexes: Implications for template-based docking. Proteins, 81(12), 2137-2142.
  35. Bagley, S.C., and R.B. Altman. 1995. Characterizing the microenvironment surrounding protein sites. Protein Science, 4(4), 622-635.
  36. Binkowski, A.T., L. Adamian, and J. Liang. 2003. Inferring Functional Relationships of Proteins from Local Sequence and Spatial Surface Patterns. Journal of Molecular Biology, 332(2), 505–526.
  37. Glaser, F., et al. 2006. A method for localizing ligand binding pockets in protein structures. Proteins: Structure, Function, and Bioinformatics, 62(2), 479–488.
  38. Xie, L., and P.E. Bourne. 2008. Detecting evolutionary relationships across existing fold space, using sequence order-independent profile–profile alignments. Proceedings of the National Academy of Sciences of the United States of America, 105(14), 5441-5446.
  39. La, D., et al. 2009. 3D-SURFER: software for high-throughput protein surface comparison and analysis. Bioinformatics, 25(21), 2843-2844.
  40. Mernberger, M., G. Klebe, and E. Hullermeier. 2011. SEGA: Semiglobal Graph Alignment for Structure-Based Protein Comparison. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8(5), 1330-1343.
  41. Hu, B., et al. 2014. PL-PatchSurfer: A Novel Molecular Local Surface-Based Method for Exploring Protein-Ligand Interactions. International Journal of Molecular Sciences, 15(9), 15122-15145.
  42. Karampudi, N.B.R., and R.P. Bahadur. 2015. Layers: A molecular surface peeling algorithm and its applications to analyze protein structures. Scientific Reports, 5.
  43. Alex , S.D., P.J. Stuckey, and A.I. Wirth. 2010. Fast and accurate protein substructure searching with simulated annealing and GPUs. BMC Bioinformatics, 11, 446.
  44. Bin, P., et al. 2012. Accelerating large-scale protein structure alignments with graphics processing units. BMC Research Notes, 5(1), 116.
  45. Mrozek, D., M. Brożek, and B. Małysiak-Mrozek. 2014. Parallel implementation of 3D protein structure similarity searches using a GPU and the CUDA. Journal of Molecular Biology, 20(2), 2067.
  46. Saul, N.B., and W.D. Christian. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48(3), 443-453.
  47. Rachel, K., K. Patrice, and L. Michael. 2005. Comprehensive Evaluation of Protein Structure Alignment Methods: Scoring by Geometric Measures. Journal of Molecular Biology, 346, 1173-1188.
  48. Alexandrov, N.N., and D. Fischer. 1996. Analysis of topological and nontopological structural similarities in the PDB: new examples with old structures. Proteins, 25(3), 354-365.
  49. Kabsch, W. 1978. A discussion of the solution for the best rotation to relate two sets of vectors. Acta Crystallographica, 34, 827-828.
  50. Skolnick, J. 2004. Scoring function for automated assessment of protein structure template quality. Proteins, 57, 702-710.
  51. Michael, L., and G. Mark. 1998. A unified statistical framework for sequence comparison and structure comparison. Proceedings of the National Academy of Sciences of the United States of America, 95(11), 5913-5920.
  52. Dhraief, A., R. Issaoui, and A. Belghith. 2011. Parallel Computing the Longest Common Subsequence (LCS) on GPUs: Efficiency and Language Suitability. INFOCOMP.
  53. Fischer, D., et al. 1996. Assessing the performance of fold recognition methods by means of a comprehensive benchmark. Pacific Symposium on Biocomputing, 300-318.
  54. Cheng, H., B.H. Kim, and N.V. Grishin. 2008. MALISAM: A Database of Structurally Analogous Motifs in Proteins. Nucleic Acids Research, 36, 211-217.
  55. Murzin, A.G., S.E. Brenner, and T. Hubbard. 1995. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol(247), 536–540.
Index Terms

Computer Science
Information Sciences

Keywords

Structure Alignment CUDA GPU Parallel Computing TM-Score MatAlign.