Research Article

A Novel SVM based CSSFFS Feature Selection Algorithm for Detecting Breast Cancer

by S. Aruna, Dr S.P. Rajagopalan
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 31 - Number 8
Year of Publication: 2011
Authors: S. Aruna, Dr S.P. Rajagopalan
DOI: 10.5120/3844-5346

S. Aruna and Dr S.P. Rajagopalan. A Novel SVM based CSSFFS Feature Selection Algorithm for Detecting Breast Cancer. International Journal of Computer Applications 31, 8 (October 2011), 14-20. DOI=10.5120/3844-5346

@article{ 10.5120/3844-5346,
author = { S. Aruna and S.P. Rajagopalan },
title = { A Novel SVM based CSSFFS Feature Selection Algorithm for Detecting Breast Cancer },
journal = { International Journal of Computer Applications },
issue_date = { October 2011 },
volume = { 31 },
number = { 8 },
month = { October },
year = { 2011 },
issn = { 0975-8887 },
pages = { 14-20 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume31/number8/3844-5346/ },
doi = { 10.5120/3844-5346 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:17:35.366361+05:30
%A S. Aruna
%A Dr S.P. Rajagopalan
%T A Novel SVM based CSSFFS Feature Selection Algorithm for Detecting Breast Cancer
%J International Journal of Computer Applications
%@ 0975-8887
%V 31
%N 8
%P 14-20
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper proposes CSSFFS (constrained search sequential floating forward search), an SVM-based feature selection algorithm for detecting breast cancer. CSSFFS is a greedy algorithm that uses a constrained search strategy, and its aim is to find a feature subset with minimal balanced error rate (BER). It is a hybrid of the filter and wrapper approaches: feature ranking with an SVM acts as the filter, removing irrelevant features, and SFFS then acts as the wrapper, removing redundant features to yield the optimal feature subset. The experiments use the WDBC dataset from the UCI Machine Learning Repository and are conducted in WEKA. After feature selection, the accuracy and BER on the WDBC dataset are 98.2425% and 0.0226, respectively, with 15 features.
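The filter-then-wrapper idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' CSSFFS implementation: the abstract does not give the exact procedure, so this is a generic sequential floating forward search restricted to the top-ranked features. The names `balanced_error_rate`, `top_ranked` and `sffs`, the `scores` dictionary (standing in for an SVM-based feature ranking) and the `error` callable (standing in for the BER of an SVM trained on a candidate subset) are all assumptions introduced for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def balanced_error_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """BER for binary labels (0/1): the mean of the two per-class error rates."""
    pos = sum(1 for t in y_true if t == 1)
    neg = sum(1 for t in y_true if t == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p != 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p != 0)
    return 0.5 * (fn / pos + fp / neg)


def top_ranked(scores: Dict[str, float], keep: int) -> List[str]:
    """Filter stage (assumed form): keep the `keep` highest-scoring features."""
    return sorted(scores, key=scores.get, reverse=True)[:keep]


def sffs(
    candidates: Sequence[str],
    error: Callable[[List[str]], float],
    max_size: int,
) -> Tuple[float, List[str]]:
    """Wrapper stage: generic SFFS that minimizes error(subset).

    Returns (lowest error found, subset), preferring smaller subsets on ties.
    """
    selected: List[str] = []
    best: Dict[int, Tuple[float, List[str]]] = {}  # subset size -> best seen
    while len(selected) < max_size:
        remaining = [f for f in candidates if f not in selected]
        if not remaining:
            break
        # Forward step: add the single feature that gives the lowest error.
        add = min(remaining, key=lambda f: error(selected + [f]))
        selected = selected + [add]
        e = error(selected)
        size = len(selected)
        if size not in best or e < best[size][0]:
            best[size] = (e, list(selected))
        # Floating step: drop features while that strictly improves on the
        # best subset already recorded for the smaller size.
        while len(selected) > 2:
            drop = min(
                selected,
                key=lambda f: error([g for g in selected if g != f]),
            )
            reduced = [g for g in selected if g != drop]
            e_reduced = error(reduced)
            if e_reduced < best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (e_reduced, list(reduced))
            else:
                break
    return min(best.values(), key=lambda item: (item[0], len(item[1])))
```

In use, `top_ranked` would first restrict the search to the best-ranked features, and `sffs` would then be called with an `error` function that trains and evaluates a classifier on each candidate subset, for example by computing `balanced_error_rate` on held-out predictions. Restricting the candidates before the wrapper runs keeps the number of classifier evaluations manageable, which is the usual motivation for combining a filter with a wrapper.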

Index Terms

Computer Science
Information Sciences

Keywords

SVM, breast cancer diagnosis, feature selection, SFFS, BER