Research Article

Mining Dense Patterns from Off Diagonal Protein Contact Maps

by M. Om Swaroopa, K. Suvarna Vani
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 49 - Number 12
Year of Publication: 2012
10.5120/7682-0987

M. Om Swaroopa, K. Suvarna Vani. Mining Dense Patterns from Off Diagonal Protein Contact Maps. International Journal of Computer Applications. 49, 12 (July 2012), 36-41. DOI=10.5120/7682-0987

@article{ 10.5120/7682-0987,
author = { M. Om Swaroopa and K. Suvarna Vani },
title = { Mining Dense Patterns from Off Diagonal Protein Contact Maps },
journal = { International Journal of Computer Applications },
issue_date = { July 2012 },
volume = { 49 },
number = { 12 },
month = { July },
year = { 2012 },
issn = { 0975-8887 },
pages = { 36-41 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume49/number12/7682-0987/ },
doi = { 10.5120/7682-0987 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:46:08.154392+05:30
%A M. Om Swaroopa
%A K. Suvarna Vani
%T Mining Dense Patterns from Off Diagonal Protein Contact Maps
%J International Journal of Computer Applications
%@ 0975-8887
%V 49
%N 12
%P 36-41
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The three-dimensional structure of a protein enables it to carry out its biophysical and biochemical functions in the cell. Proteins are biochemical compounds consisting of one or more polypeptides that facilitate a biological function. Protein contact maps are 2D representations of the contacts among amino acid residues in the folded protein structure. Many researchers have noted that secondary structures are clearly visible in contact maps: helices appear as thick bands and sheets as bands orthogonal to the diagonal. In this paper, we explore several machine learning algorithms for the data-driven construction of classifiers that assign off-diagonal protein contact maps to folds. A simple and computationally inexpensive algorithm based on the triangle subdivision method is implemented to extract twenty features from off-diagonal contact maps. This method successfully characterizes the off-diagonal interactions in the contact map for predicting specific folds. The NaiveBayes, J48 and REPTree classifiers achieve recall of 76.38%, 91.66% and 80.32%, respectively.
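
To make the notion of an off-diagonal contact map concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes one 3D coordinate per residue, a distance cutoff of 8 angstroms, and a minimum sequence separation of 4 to define "off diagonal"; neither value is stated in the abstract. The function names are hypothetical, and the triangle subdivision method itself is not reproduced here because the abstract does not specify it.

```python
import math

def contact_map(coords, threshold=8.0):
    """Binary contact map: 1 if two residues lie within `threshold` angstroms.

    The 8.0 cutoff is an assumed value, not taken from the paper.
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap

def off_diagonal_contacts(cmap, min_separation=4):
    """Contacts (i, j) with j - i >= min_separation, i.e. away from the diagonal.

    The separation of 4 residues is an assumed value.
    """
    n = len(cmap)
    return [(i, j)
            for i in range(n)
            for j in range(i + min_separation, n)
            if cmap[i][j]]

def off_diagonal_density(cmap, min_separation=4):
    """Fraction of off-diagonal cells (upper triangle) that are contacts."""
    n = len(cmap)
    cells = sum(max(0, n - i - min_separation) for i in range(n))
    if cells == 0:
        return 0.0
    return len(off_diagonal_contacts(cmap, min_separation)) / cells

# Toy hairpin-like chain: five residues out along x, five back at y = 5.
toy_coords = [(3.8 * k, 0.0, 0.0) for k in range(5)] + \
             [(3.8 * k, 5.0, 0.0) for k in range(4, -1, -1)]
toy_map = contact_map(toy_coords)
print(off_diagonal_contacts(toy_map))
print(off_diagonal_density(toy_map))
```

In this toy chain the two strands face each other, so the long-range contacts fall on a line orthogonal to the main diagonal, which is the sheet pattern the abstract describes. A density of this kind, computed per sub-region of the map, is one plausible type of feature a classifier could consume.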

Index Terms

Computer Science
Information Sciences

Keywords

Protein Contact Maps, Classification, Protein Data Bank, SCOP, J48, REPTree, Naive Bayes