Research Article

A Novel Genetic Algorithm based Approach for Optimization of Distance Matrix for Phylogenetic Tree Construction

by Mridu Gupta, Shailendra Singh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 52 - Number 9
Year of Publication: 2012
Authors: Mridu Gupta, Shailendra Singh
10.5120/8229-1303

Mridu Gupta, Shailendra Singh. A Novel Genetic Algorithm based Approach for Optimization of Distance Matrix for Phylogenetic Tree Construction. International Journal of Computer Applications. 52, 9 (August 2012), 14-18. DOI=10.5120/8229-1303

@article{ 10.5120/8229-1303,
author = { Mridu Gupta and Shailendra Singh },
title = { A Novel Genetic Algorithm based Approach for Optimization of Distance Matrix for Phylogenetic Tree Construction },
journal = { International Journal of Computer Applications },
issue_date = { August 2012 },
volume = { 52 },
number = { 9 },
month = { August },
year = { 2012 },
issn = { 0975-8887 },
pages = { 14-18 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume52/number9/8229-1303/ },
doi = { 10.5120/8229-1303 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:51:49.394719+05:30
%A Mridu Gupta
%A Shailendra Singh
%T A Novel Genetic Algorithm based Approach for Optimization of Distance Matrix for Phylogenetic Tree Construction
%J International Journal of Computer Applications
%@ 0975-8887
%V 52
%N 9
%P 14-18
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Phylogenies are useful for organizing knowledge of biological diversity, for structuring classifications, and for providing knowledge of events that occurred during evolution. Several phylogenetic reconstruction techniques are available; this paper uses a distance-based technique. The choice of distance measure is an important issue in phylogenetic analysis. Traditional approaches are time-consuming because they require multiple sequence alignment, whereas the K-tuple distance is easy to compute and has been used in phylogenetic tree reconstruction. Building on the K-tuple distance, a genetic algorithm is proposed to find a new F-tuple distance measure that takes into account the positions at which tuples occur and considers similarities between sequences rather than differences. The K-tuple distance is not effective for sets of almost identical sequences, whereas the F-tuple distance is useful for constructing phylogenetic trees for such sets. This novel approach is capable of building phylogenetic trees efficiently and is less computationally intensive.
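
As a rough illustration of why an alignment-free K-tuple distance is cheap to compute, the sketch below builds a pairwise distance matrix from overlapping k-tuple frequencies. It assumes one common formulation (the squared Euclidean difference between k-tuple frequency vectors); the abstract does not state the paper's exact formula, and the proposed F-tuple measure and genetic algorithm are not reproduced here. The function names `ktuple_counts`, `ktuple_distance`, and `distance_matrix` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def ktuple_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-tuple (substring of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def ktuple_distance(a: str, b: str, k: int = 3) -> float:
    """Squared Euclidean distance between k-tuple frequency vectors.

    Assumption: frequencies are counts divided by the number of k-tuples
    in each sequence; other normalisations are possible.
    """
    ca, cb = ktuple_counts(a, k), ktuple_counts(b, k)
    na, nb = sum(ca.values()), sum(cb.values())
    if na == 0 or nb == 0:
        raise ValueError("each sequence must contain at least k symbols")
    # A Counter returns 0 for tuples that are absent from a sequence.
    return sum((ca[t] / na - cb[t] / nb) ** 2 for t in ca.keys() | cb.keys())


def distance_matrix(seqs: list[str], k: int = 3) -> list[list[float]]:
    """Symmetric pairwise distance matrix with a zero diagonal."""
    n = len(seqs)
    m = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        m[i][j] = m[j][i] = ktuple_distance(seqs[i], seqs[j], k)
    return m
```

No alignment step is involved: each sequence is scanned once to count its tuples, and each pair is compared over the tuples that actually occur. A matrix of this kind could then be passed to a distance-based tree builder such as neighbor joining. Because only tuple frequencies are compared, nearly identical sequences give distances close to zero, which is the situation the abstract says the position-aware F-tuple measure is meant to handle.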

Index Terms

Computer Science
Information Sciences

Keywords

Distance Method, Distance Matrix, Phylogenetics, Phylogenetic Tree, K-tuple distance, Genetic Algorithm, F-tuple distance