Research Article

Analysis of Gene Expression Microarray Dataset for Feature Selection

Published on November 2012 by G. Baskar, P. Ponmuthuramalingam
National Conference on Communication Technologies & its impact on Next Generation Computing 2012
Foundation of Computer Science USA
CTNGC - Number 3
November 2012
Authors: G. Baskar, P. Ponmuthuramalingam

G. Baskar, P. Ponmuthuramalingam. Analysis of Gene Expression Microarray Dataset for Feature Selection. National Conference on Communication Technologies & its impact on Next Generation Computing 2012. CTNGC, 3 (November 2012), 33-35.

@article{
author = { G. Baskar, P. Ponmuthuramalingam },
title = { Analysis of Gene Expression Microarray Dataset for Feature Selection },
journal = { National Conference on Communication Technologies & its impact on Next Generation Computing 2012 },
issue_date = { November 2012 },
volume = { CTNGC },
number = { 3 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 33-35 },
numpages = 3,
url = { /proceedings/ctngc/number3/9068-1032/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference on Communication Technologies & its impact on Next Generation Computing 2012
%A G. Baskar
%A P. Ponmuthuramalingam
%T Analysis of Gene Expression Microarray Dataset for Feature Selection
%J National Conference on Communication Technologies & its impact on Next Generation Computing 2012
%@ 0975-8887
%V CTNGC
%N 3
%P 33-35
%D 2012
%I International Journal of Computer Applications
Abstract

Microarray technology is a powerful tool for biological exploration that makes it possible to measure the activity levels of thousands of genes simultaneously in various cancer studies. Clustering is an important data mining technique for extracting useful information from high-dimensional datasets. A wide range of clustering algorithms is available, and clustering remains an open area of research. The k-means algorithm, introduced by MacQueen in 1967, is one of the most basic and simple partitioning clustering techniques. This paper presents sample weighting and an efficient margin-based sample weighting algorithm to improve the stability of feature selection. We propose a weighted k-means to improve cluster stability and present an experimental evaluation of the proposed method. Experiments on microarray datasets show that feature selection algorithms such as SVM-RFE are more stable in gene selection.
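
The abstract does not spell out how the weighted k-means is defined, so the following is only a minimal sketch of one common reading: a Lloyd-style k-means in which each sample contributes to its cluster centroid in proportion to a non-negative weight. The function name `weighted_kmeans`, its parameters, and the hand-picked weights are illustrative assumptions, not the authors' implementation; in particular, the margin-based scheme that would produce the weights is not shown.

```python
import random


def weighted_kmeans(points, weights, k, iters=100, seed=0):
    """Lloyd-style k-means with per-sample weights (illustrative sketch).

    points  : list of equal-length numeric tuples (e.g. expression profiles)
    weights : list of non-negative sample weights (assumed given)
    k       : number of clusters
    Returns (centroids, labels).
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Initialise centroids from k distinct samples.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = []
        for p in points:
            dists = [sum((x - m) ** 2 for x, m in zip(p, c)) for c in centroids]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break  # assignments stable, centroids already match them
        labels = new_labels
        # Update step: each centroid becomes the weighted mean of its members.
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            total = sum(weights[i] for i in members)
            if total > 0:  # keep the old centroid for empty or zero-weight clusters
                centroids[c] = [
                    sum(weights[i] * points[i][d] for i in members) / total
                    for d in range(dim)
                ]
    return centroids, labels


# Toy data: two well-separated groups; the third sample carries double weight.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
sample_weights = [1.0, 1.0, 2.0, 1.0, 1.0, 1.0]
centroids, labels = weighted_kmeans(samples, sample_weights, k=2)
print(centroids, labels)
```

With uniform weights this reduces to ordinary k-means; raising a sample's weight pulls its cluster centroid toward that sample, which is the mechanism a sample weighting scheme would rely on.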

Index Terms

Computer Science
Information Sciences

Keywords

Feature Selection, Classification, Clustering, Gene Expression, Microarray