Research Article

Two-Way Clustering Analysis using Parallel Fuzzy Approach for Microarray Gene Expression Data

by Dwitiya Tyagi-Tiwari, Sujoy Das, Namita Srivastava
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 124 - Number 9
Year of Publication: 2015
Authors: Dwitiya Tyagi-Tiwari, Sujoy Das, Namita Srivastava
10.5120/ijca2015905483

Dwitiya Tyagi-Tiwari, Sujoy Das, Namita Srivastava. Two-Way Clustering Analysis using Parallel Fuzzy Approach for Microarray Gene Expression Data. International Journal of Computer Applications. 124, 9 (August 2015), 39-45. DOI=10.5120/ijca2015905483

@article{ 10.5120/ijca2015905483,
author = { Dwitiya Tyagi-Tiwari and Sujoy Das and Namita Srivastava },
title = { Two-Way Clustering Analysis using Parallel Fuzzy Approach for Microarray Gene Expression Data },
journal = { International Journal of Computer Applications },
issue_date = { August 2015 },
volume = { 124 },
number = { 9 },
month = { August },
year = { 2015 },
issn = { 0975-8887 },
pages = { 39-45 },
numpages = {7},
url = { https://ijcaonline.org/archives/volume124/number9/22134-2015905483/ },
doi = { 10.5120/ijca2015905483 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:13:58.026771+05:30
%A Dwitiya Tyagi-Tiwari
%A Sujoy Das
%A Namita Srivastava
%T Two-Way Clustering Analysis using Parallel Fuzzy Approach for Microarray Gene Expression Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 124
%N 9
%P 39-45
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

A microarray measures the expression levels of thousands of genes simultaneously, and clustering helps to analyze the resulting gene expression data. A characteristic of gene expression data is its coherent structure with respect to both genes and samples. In this paper we implement a biclustering algorithm to identify subgroups of data that show correlated behavior under specific experimental conditions. To find biclusters, Fuzzy C-means clustering is used to cluster the genes and the samples, assigning each to the cluster in which it has maximum membership. Dimensionality reduction is performed using principal component analysis, and gene shaving is performed using a gene filtering function. The method obtains highly correlated submatrices of the gene expression dataset. It is also observed that the method identifies important co-regulated genes and samples at the same time. Principal component analysis is also used to verify the concatenation of small biclusters into larger ones. Because biclustering is an NP-hard problem [10], we implemented it in a multi-core parallel environment to reduce the computational time of the algorithm. Data-level and task-level parallelism are used to develop the algorithm with the MATLAB Parallel Computing Toolbox on a multicore platform. We compare the results with other parallel and sequential algorithms to show the effectiveness of the algorithm.
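
The two-way use of Fuzzy C-means with maximum-membership assignment can be illustrated with a minimal sketch. This is not the authors' MATLAB implementation: it is a plain-Python version of the standard Fuzzy C-means update rules, run once on the rows (genes) and once on the columns (samples) of a toy matrix. The function names `fuzzy_c_means`, `hard_labels`, and `two_way_labels`, the fuzzifier `m = 2.0`, and the stopping tolerance are assumptions made for illustration, and the PCA, gene filtering, and parallel steps are omitted.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-means on a list of equal-length vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Centers are means weighted by membership raised to the power m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / wsum
                 for d in range(dim)]
            )
        # Memberships are recomputed from Euclidean distances to the centers.
        shift = 0.0
        for i in range(n):
            dist = [
                max(sum((points[i][d] - centers[j][d]) ** 2
                        for d in range(dim)) ** 0.5, 1e-12)
                for j in range(c)
            ]
            new_row = [
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                          for k in range(c))
                for j in range(c)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # stop once memberships no longer change
            break
    return centers, u


def hard_labels(memberships):
    """Assign each item to the cluster in which its membership is largest."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


def two_way_labels(matrix, gene_clusters, sample_clusters):
    """Cluster rows (genes) and columns (samples) of an expression matrix."""
    _, gene_u = fuzzy_c_means(matrix, gene_clusters)
    columns = [list(col) for col in zip(*matrix)]  # transpose: one vector per sample
    _, sample_u = fuzzy_c_means(columns, sample_clusters)
    return hard_labels(gene_u), hard_labels(sample_u)


# Toy 4-gene x 4-sample matrix with two obvious blocks (illustrative data only).
toy = [
    [5.0, 5.1, 0.1, 0.0],
    [4.9, 5.0, 0.0, 0.2],
    [0.1, 0.0, 5.0, 5.1],
    [0.0, 0.2, 4.9, 5.0],
]
gene_labels, sample_labels = two_way_labels(toy, 2, 2)
print(gene_labels, sample_labels)
```

Pairing a gene label with a sample label picks out one submatrix of the data, which is the sense in which clustering both dimensions yields candidate biclusters. The two calls to `fuzzy_c_means` are independent of each other, so in a parallel setting they could be run as separate tasks.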

References
  1. Daxin Jiang; Chun Tang; Aidong Zhang, "Cluster analysis for gene expression data: a survey," Knowledge and Data Engineering, IEEE Transactions on, vol.16, no.11, 1370-1386, Nov. 2004.
  2. Daxin Jiang, Jian Pei and Aidong Zhang, "GPX: Interactive Mining of Gene Expression Data", 30th VLDB Conference, Toronto, Canada, 2004.
  3. G. Kerr, H.J. Ruskin, M. Crane and P. Doolan, “Techniques for Clustering Gene Expression Data”, Computers in Biology and Medicine 38, 2008, 283-293.
  4. Erfaneh Naghieh and Yonghong Peng, "Microarray Gene Expression Data Mining: Clustering Analysis Review", Aug 20, 2009.
  5. Chun Tang and Aidong Zhang, “Interrelated Two-Way Clustering and Its Application on Gene Expression Data”, International Journal on Artificial Intelligence Tools, Vol. 14, No. 4, 2005, 577-598.
  6. B. Chandra, S. Shanker, Saroj Mishra, "A new approach: Interrelated two-way clustering of gene expression data", Statistical Methodology 3, 2006, 93–102.
  7. Y. Cheng and G.M. Church, “Biclustering of Expression Data”, in Proc. of American Association for Artificial Intelligence, 2000.
  8. Y. Kluger, R. Basri, J.T. Chang, M. Gerstein, “Spectral biclustering of microarray data: coclustering genes and conditions”, Genome Res. 13 (4), 2003, 703–716.
  9. Liu Wei and Chen Ling, "A Parallel Algorithm For Gene Expressing Data Biclustering", Journal of Computers, Vol. 3, No. 10, October 2008, 71-77.
  10. Sara C. Madeira and Arlindo L. Oliveira, "Biclustering Algorithms for Biological Data Analysis: A Survey", IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 1, No. 1, January-March 2004, 24-45.
  11. A.H. Tewfik, A.B. Tchagang and I. Vertatschitsch, “Parallel Identification of Gene Biclusters with Coherent Evolutions”, IEEE Transactions on Signal Processing, Vol. 54, No. 6, June 2006.
  12. Tavazoie, S., Hughes, J.D., Campbell, M.J., Cho, R.J. and Church, G.M. (1999) Systematic determination of genetic network architecture. Nat. Genet., 22, 281–285.
  13. Zhang Yanjie, Veronique Prinet and Wu Shuanhu, “A Principal Component Analysis Based Microarray Data Bi-Clustering Method”, 2nd International Conference on Biomedical Engineering and Informatics, October 2009.
  14. K. Y. Yeung and W. L. Ruzzo, “Principal Component Analysis for Clustering Gene Expression Data”, Bioinformatics Vol. 17 no. 9 2001, 763-774.
  15. Genevera I. Allen and Mirjana Maletic-Savatic, “Sparse non-negative generalized PCA with applications to metabolomics”, Bioinformatics, Vol. 27 no. 21 2011, 3029-3035.
  16. Asa Ben-Hur and Isabelle Guyon, “Detecting Stable Clusters Using Principal Component Analysis”, in Functional Genomics: Methods and Protocols, M.J. Brownstein and A. Khodursky (eds.), Humana Press, 2003, 159-182.
  17. Christoph Bartenhagen, Hans-Ulrich Klein, Christian Rückert, Xiaoyi Jiang and Martin Dugas, “Comparative study of unsupervised dimension reduction techniques for the visualization of microarray gene expression data”, BMC Bioinformatics 2010.
  18. Dwitiya Tyagi, Sujoy Das, and Namita Srivastava, “Parallel Two-way Clustering for Microarray Gene expression data”, International Journal of Computer Science Trends and Technology, Vol. 3 Issue 3, May-June 2015.
  19. Wei Shen, Guixia Liu, Ming Zheng, Zhangxu Li, Yi Zhong, Jianan Wu, Chunguang Zhou, “A Novel Biclustering Algorithm and Its Application in Gene Expression Profiles”, Journal of Information & Computational Science 9: 11 (2012), 3113–3122.
  20. Bezdek, J.C., Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
  21. Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg TG, Gabrielian AE, Landsman D, Lockhart DJ, Davis RW, ‘A genome-wide transcriptional analysis of the mitotic cell cycle’, Molecular Cell, Vol. 2, July, 1998, 65–73.
  22. Hartigan J.: “Direct Clustering of a Data Matrix”, J Am Stat Assoc, 67(337), pp. 123-129, 1972.
Index Terms

Computer Science
Information Sciences

Keywords

Microarray gene expression, Multicore platform, Biclustering, MATLAB parallel computing, PCA, gene entropy