Research Article

A Survey on Improved Filtering Techniques for Multiclass Gene Selection

by G. V. Manoharan, R. Shanmugalakshmi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 86 - Number 18
Year of Publication: 2014
Authors: G. V. Manoharan, R. Shanmugalakshmi
10.5120/15086-3340

G. V. Manoharan, R. Shanmugalakshmi. A Survey on Improved Filtering Techniques for Multiclass Gene Selection. International Journal of Computer Applications. 86, 18 (January 2014), 20-23. DOI=10.5120/15086-3340

@article{ 10.5120/15086-3340,
author = { G. V. Manoharan and R. Shanmugalakshmi },
title = { A Survey on Improved Filtering Techniques for Multiclass Gene Selection },
journal = { International Journal of Computer Applications },
issue_date = { January 2014 },
volume = { 86 },
number = { 18 },
month = { January },
year = { 2014 },
issn = { 0975-8887 },
pages = { 20-23 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume86/number18/15086-3340/ },
doi = { 10.5120/15086-3340 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:04:33.898259+05:30
%A G. V. Manoharan
%A R. Shanmugalakshmi
%T A Survey on Improved Filtering Techniques for Multiclass Gene Selection
%J International Journal of Computer Applications
%@ 0975-8887
%V 86
%N 18
%P 20-23
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In bioinformatics, genes for multiclass sample classification are commonly selected from microarray data with filter methods. Such approaches are usually biased towards a few classes that are easily distinguished from the others, because strong features and sample sizes are imbalanced across the classes of a microarray data set. Filter methods are widely used for ranking genes from microarray data in multiclass problems, and many variants have been proposed. In this survey, we discuss methods that decompose multiclass ranking statistics into class-specific statistics, and the need for Pareto-front analysis when selecting genes. This approach mitigates the bias induced by the intrinsic characteristics of dominating classes. Pareto-front analysis is illustrated on two filter criteria commonly used for gene selection: the F-score and the KW-score. In experiments with both synthetic and real benchmark data sets, a significant improvement in classification performance and a reduction in redundancy among top-ranked genes were achieved. This work analyses the traditional and improved filter methods for multiclass gene selection that are available in the literature.
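The two ideas in the abstract, class-specific statistics and Pareto-front selection, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one plausible decomposition, in which the one-way ANOVA F-statistic of a gene is split into one term per class (the terms sum to the usual F-score), and it treats a gene as Pareto-optimal when no other gene scores at least as well in every class and strictly better in one. The function names and the toy expression values are hypothetical, and the KW-score is not covered.

```python
from statistics import mean


def class_specific_f_scores(values, labels):
    """Split the one-way ANOVA F-statistic of one gene into per-class terms.

    Returns {class_label: F_k} in sorted class order; the F_k sum to the
    ordinary F-score. This decomposition is an assumption of the sketch.
    """
    classes = sorted(set(labels))
    n, k = len(values), len(classes)
    grand = mean(values)
    groups = {c: [v for v, l in zip(values, labels) if l == c] for c in classes}
    # Pooled within-class variance: the denominator of the F-statistic.
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    # Each class contributes its own between-class term to the numerator.
    return {
        c: len(g) * (mean(g) - grand) ** 2 / ((k - 1) * within)
        for c, g in groups.items()
    }


def dominates(a, b):
    """True if score vector a is >= b in every class and > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(score_vectors):
    """Return the genes whose score vectors are not dominated by any other gene."""
    return [
        gene
        for gene, s in score_vectors.items()
        if not any(dominates(t, s) for other, t in score_vectors.items() if other != gene)
    ]


# Toy data (hypothetical): six samples, three classes, three genes.
labels = ["A", "A", "B", "B", "C", "C"]
expression = {
    "g1": [5.0, 5.2, 1.0, 1.2, 1.1, 0.9],  # strongly separates class A
    "g2": [1.0, 1.2, 1.1, 0.9, 5.0, 5.2],  # strongly separates class C
    "g3": [2.0, 2.2, 1.0, 1.2, 1.1, 0.9],  # weakly separates class A
}

# One score vector per gene, with one entry per class (sorted class order).
scores = {
    gene: tuple(class_specific_f_scores(vals, labels).values())
    for gene, vals in expression.items()
}
front = pareto_front(scores)
print(front)
```

Ranking by the aggregate F-score alone would favour whichever classes are easiest to separate. Keeping one score per class and selecting non-dominated genes lets a gene that is informative for only one class stay in the selection, which is the motivation the abstract gives for Pareto-front analysis.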

Index Terms

Computer Science
Information Sciences

Keywords

Aggregation statistics, filter methods, gene selection, multiobjective evolutionary optimization, Pareto-front analysis