International Journal of Computer Applications |
Foundation of Computer Science (FCS), NY, USA |
Volume 1 - Number 22 |
Year of Publication: 2010 |
Authors: Geetha.T, Michael Arock |
10.5120/436-665 |
Geetha.T, Michael Arock . Enhanced Hierarchical Clustering for Gene Expression data. International Journal of Computer Applications. 1, 22 ( February 2010), 92-98. DOI=10.5120/436-665
Micro arrays are used to assess the transcriptome of many biological systems that has generated an enormous amount of data. Cluster analysis is a technique used to group and analyze micro array data. Identification of groups of genes that manifest similar expression patterns is a key step in the analysis of gene expression data. Hierarchical clustering is the one of the clustering techniques used for this purpose. In this paper, we design an enhanced hierarchical clustering algorithm which scans the dataset and calculates distance matrix only once unlike other papers, (up to authors' knowledge). Our main contribution is to reduce time, even when a large database is analyzed. Also, the results of hierarchical clustering are represented as a binary tree which gives clarity in grouping and further helps to find clustered objects easily. Our algorithm is able to retrieve number of clusters with the help of cut distance and measures the quality with validation index in order to obtain the best one; does not require initial parameter like number of clusters.