Research Article

A New Unsupervised Clustering-based Feature Extraction Method

by Sabra El Ferchichi, Salah Zidi, Kaouther Laabidi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 57 - Number 6
Year of Publication: 2012
10.5120/9122-3255

Sabra El Ferchichi, Salah Zidi, Kaouther Laabidi. A New Unsupervised Clustering-based Feature Extraction Method. International Journal of Computer Applications. 57, 6 (November 2012), 43-49. DOI=10.5120/9122-3255

@article{ 10.5120/9122-3255,
author = { Sabra El Ferchichi and Salah Zidi and Kaouther Laabidi },
title = { A New Unsupervised Clustering-based Feature Extraction Method },
journal = { International Journal of Computer Applications },
issue_date = { November 2012 },
volume = { 57 },
number = { 6 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 43-49 },
numpages = {7},
url = { https://ijcaonline.org/archives/volume57/number6/9122-3255/ },
doi = { 10.5120/9122-3255 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:59:46.653883+05:30
%A Sabra El Ferchichi
%A Salah Zidi
%A Kaouther Laabidi
%T A New Unsupervised Clustering-based Feature Extraction Method
%J International Journal of Computer Applications
%@ 0975-8887
%V 57
%N 6
%P 43-49
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Learning tasks, whether supervised or unsupervised, often require extracting new features from the original ones in order to reduce the dimension of the feature space and improve performance. In this paper, we investigate a novel scheme for unsupervised feature extraction for classification problems. The method relies on clustering to achieve feature extraction. A new similarity measure based on trend analysis is devised to identify redundant information in the data. Clustering is then performed on the feature space. Once groups of similar features are formed, a linear transformation is applied to extract a new set of features. Simulation results on classification problems, using experimental data sets from the UCI machine learning repository and a face recognition problem, show that the proposed method is effective in almost all cases when compared with conventional unsupervised methods such as PCA and ICA.
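
The pipeline outlined in the abstract (a trend-based similarity between features, clustering on the feature space, then a linear transformation per cluster) can be illustrated with a minimal sketch. The abstract does not define the similarity measure, the clustering algorithm, or the transformation, so everything concrete here is an assumption made for illustration and not the authors' method: similarity is taken as the fraction of sample-to-sample steps on which two features move in the same direction, clustering is a simple greedy leader scheme with a hypothetical `threshold` parameter, and each new feature is the mean of the features in one cluster.

```python
import numpy as np

def trend_similarity(X):
    """Pairwise feature similarity (assumed definition): fraction of
    consecutive-sample steps where two features share the same trend sign."""
    trends = np.sign(np.diff(X, axis=0))        # shape (n_samples - 1, n_features)
    n_features = trends.shape[1]
    sim = np.empty((n_features, n_features))
    for i in range(n_features):
        # Compare the trend of feature i against every feature at once.
        sim[i] = (trends == trends[:, [i]]).mean(axis=0)
    return sim

def cluster_features(sim, threshold=0.8):
    """Greedy leader clustering (assumed): a feature joins the first cluster
    whose leader it resembles, otherwise it starts a new cluster."""
    labels = -np.ones(sim.shape[0], dtype=int)
    leaders = []
    for j in range(sim.shape[0]):
        for k, leader in enumerate(leaders):
            if sim[j, leader] >= threshold:
                labels[j] = k
                break
        else:
            leaders.append(j)
            labels[j] = len(leaders) - 1
    return labels

def extract_features(X, labels):
    """Linear transformation Z = X @ W, where column k of W averages
    the features assigned to cluster k (assumed choice of transformation)."""
    n_clusters = labels.max() + 1
    W = np.zeros((X.shape[1], n_clusters))
    for k in range(n_clusters):
        members = labels == k
        W[members, k] = 1.0 / members.sum()
    return X @ W, W

# Synthetic data: features 0 and 1 share one trend, features 2 and 3 another.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([base[:, 0], 2 * base[:, 0] + 1,
                     base[:, 1], 0.5 * base[:, 1] - 3])

labels = cluster_features(trend_similarity(X))
Z, W = extract_features(X, labels)
print(labels, Z.shape)
```

Because the transformation is a plain matrix product, the same `W` can be applied to unseen samples, and the reduced matrix `Z` can be passed to any classifier in place of `X`.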

Index Terms

Computer Science
Information Sciences

Keywords

Unsupervised feature extraction, similarity measure, clustering, face recognition and classification problems