Research Article

Tumor Clustering and Gene Selection Techniques - A Survey

by S. Praba, A. K. Santra
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 57 - Number 2
Year of Publication: 2012
Authors: S. Praba, A. K. Santra
10.5120/9083-2570

S. Praba, A. K. Santra. Tumor Clustering and Gene Selection Techniques - A Survey. International Journal of Computer Applications. 57, 2 (November 2012), 1-8. DOI=10.5120/9083-2570

@article{ 10.5120/9083-2570,
author = { S. Praba and A. K. Santra },
title = { Tumor Clustering and Gene Selection Techniques - A Survey },
journal = { International Journal of Computer Applications },
issue_date = { November 2012 },
volume = { 57 },
number = { 2 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 1-8 },
numpages = {8},
url = { https://ijcaonline.org/archives/volume57/number2/9083-2570/ },
doi = { 10.5120/9083-2570 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:59:22.665427+05:30
%A S. Praba
%A A. K. Santra
%T Tumor Clustering and Gene Selection Techniques - A Survey
%J International Journal of Computer Applications
%@ 0975-8887
%V 57
%N 2
%P 1-8
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Cancer classification has become one of the active areas of research in the medical sciences. Various gene selection and tumor classification techniques are available in the literature. Gene selection is the search for gene subsets that are capable of discriminating tumor tissue from normal tissue, and it is a primary issue in gene expression-based tumor classification. Recently, tissue microarrays have become an extensively used technique to screen for protein expression patterns in large numbers of tumors. There is increasing interest in shifting the basis of tumor classification from morphologic to molecular. Gene expression profiles provide additional data when compared with morphology and offer an alternative to morphology-based tumor classification systems. Researchers are therefore keen to develop novel approaches for gene selection and tumor classification. This paper provides a detailed survey of various gene selection techniques and tumor classification approaches.

Index Terms

Computer Science
Information Sciences

Keywords

Gene selection, Tumor Classification, DNA, Genetic Algorithm, Particle Swarm Intelligence