Research Article

Cluster Analysis Method for Multiple Sequence Alignment

by Sita Rani, Simarjeet Kaur
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 43 - Number 14
Year of Publication: 2012
Authors: Sita Rani, Simarjeet Kaur
10.5120/6171-8595

Sita Rani, Simarjeet Kaur. Cluster Analysis Method for Multiple Sequence Alignment. International Journal of Computer Applications. 43, 14 (April 2012), 19-25. DOI=10.5120/6171-8595

@article{ 10.5120/6171-8595,
author = { Sita Rani and Simarjeet Kaur },
title = { Cluster Analysis Method for Multiple Sequence Alignment },
journal = { International Journal of Computer Applications },
issue_date = { April 2012 },
volume = { 43 },
number = { 14 },
month = { April },
year = { 2012 },
issn = { 0975-8887 },
pages = { 19-25 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume43/number14/6171-8595/ },
doi = { 10.5120/6171-8595 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:33:24.347435+05:30
%A Sita Rani
%A Simarjeet Kaur
%T Cluster Analysis Method for Multiple Sequence Alignment
%J International Journal of Computer Applications
%@ 0975-8887
%V 43
%N 14
%P 19-25
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

As proteomics data continue to grow, computational methods need to become more efficient. The parts of a molecular sequence that are most resistant to change tend to be the most functionally important to the molecule. Comparative approaches are used to ensure the reliability of sequence alignment. A multiple sequence alignment (MSA) is a hypothesis about evolutionary history: each column of the alignment asserts an explicit homologous correspondence among the sequence positions it contains. In the present work, different pairwise sequence alignment methods are discussed. Their limitation is that they can align only a limited number of short sequences. A new method for sequence alignment is proposed, based on local alignment with a consensus sequence. Sequences of Triticum (wheat) varieties, downloaded from the NCBI databank, are used. The dataset is divided into two parts, and a phylogenetic tree is constructed for each part. Using advanced pruning techniques, a single tree is constructed from the two trees generated. Then, by applying a threshold condition, the closely related sequences are extracted, and an optimal MSA is obtained using shift operations in both directions.
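For readers unfamiliar with the pairwise local alignment that the abstract builds on, the following is a minimal sketch of the standard Smith-Waterman algorithm. It is not the authors' method: the function name `smith_waterman` and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are assumptions chosen for illustration, and the consensus, pruning, threshold, and shift steps of the proposed approach are not shown.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, not taken from the paper.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best local alignment score ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere (local, not global).
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")  # gap in b
            i -= 1
        else:
            out_a.append("-")  # gap in a
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Two toy sequences that share the conserved core "ACGT".
    print(smith_waterman("GGACGTCC", "TTACGTAA"))
```

The table has one cell per pair of positions, so time and memory grow with the product of the two sequence lengths. This is one reason pairwise methods become costly as the number and length of sequences increase.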

Index Terms

Computer Science
Information Sciences

Keywords

Multiple Sequence Alignment, Local Alignment, NCBI Data Bank, Phylogenetic Tree