Research Article

Two Step Feature Extraction Method for Microarray Cancer Data using Support Vector Machines

by C. Arunkumar, S. Ramakrishnan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 85 - Number 8
Year of Publication: 2014
Authors: C. Arunkumar, S. Ramakrishnan
10.5120/14864-3239

C. Arunkumar, S. Ramakrishnan. Two Step Feature Extraction Method for Microarray Cancer Data using Support Vector Machines. International Journal of Computer Applications. 85, 8 (January 2014), 34-42. DOI=10.5120/14864-3239

@article{ 10.5120/14864-3239,
author = { C. Arunkumar and S. Ramakrishnan },
title = { Two Step Feature Extraction Method for Microarray Cancer Data using Support Vector Machines },
journal = { International Journal of Computer Applications },
issue_date = { January 2014 },
volume = { 85 },
number = { 8 },
month = { January },
year = { 2014 },
issn = { 0975-8887 },
pages = { 34-42 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume85/number8/14864-3239/ },
doi = { 10.5120/14864-3239 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:01:58.356929+05:30
%A C. Arunkumar
%A S. Ramakrishnan
%T Two Step Feature Extraction Method for Microarray Cancer Data using Support Vector Machines
%J International Journal of Computer Applications
%@ 0975-8887
%V 85
%N 8
%P 34-42
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Cancer diagnosis is one of the emerging clinical applications of microarray gene expression data. However, cancer classification on such data remains a difficult problem, mainly because the number of genes is very large relative to the number of available training samples. This paper proposes a novel approach to feature extraction that combines the statistical t-test with absolute scoring to achieve a better classification rate. Suitable classification approaches using linear Support Vector Machines, Proximal Support Vector Machines and Newton Support Vector Machines are also discussed, and a comparative analysis of the different feature extraction techniques is presented. The study uses microarray cancer data for Adenoma and Carcinoma, with 7086 and 7457 genes from 4 and 18 patients respectively. The results demonstrate an increase in the classification rate for the proposed method.
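
The abstract does not spell out how the t-test and absolute scoring are combined, so the following is only a minimal sketch of one plausible reading: compute a two-sample t-statistic per gene, then rank genes by the absolute value of that score and keep the top few before training a classifier. The function names (`welch_t`, `rank_genes`), the choice of Welch's unequal-variance form, the `top_k` parameter and the toy expression values are all assumptions made for illustration; none of them come from the paper.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic for one gene (assumed variant)."""
    # Standard error under unequal variances; each class needs >= 2 samples.
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes(class_a, class_b, top_k):
    """Return the indices of the top_k genes ranked by |t|.

    class_a and class_b are lists of samples; each sample is a list
    holding one expression value per gene.
    """
    n_genes = len(class_a[0])
    scores = []
    for g in range(n_genes):
        a = [sample[g] for sample in class_a]
        b = [sample[g] for sample in class_b]
        scores.append((abs(welch_t(a, b)), g))
    # Largest absolute score first; ties broken by gene index.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [g for _, g in scores[:top_k]]


# Hypothetical toy data: 3 samples per class, 4 genes per sample.
adenoma = [[1.0, 5.0, 2.0, 9.0], [1.2, 5.1, 2.1, 3.0], [0.9, 4.9, 1.9, 6.0]]
carcinoma = [[3.0, 5.0, 2.0, 5.0], [3.1, 5.2, 2.2, 8.0], [2.9, 4.8, 1.8, 4.0]]

selected = rank_genes(adenoma, carcinoma, top_k=1)
print(selected)
```

In this toy data only the first gene separates the two classes cleanly, so it receives the largest absolute score. In a full pipeline the selected gene columns would then be passed to a classifier such as a linear SVM.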

Index Terms

Computer Science
Information Sciences

Keywords

Normalization, Linear SVM, Proximal SVM, Newton SVM, absolute scoring