Research Article

Dimension Reduction and Clustering of High Dimensional Data using Auto-Associative Neural Networks

by Zalhan Mohd Zin, Rubiyah Yusof, Ehsan Mesbahi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 72 - Number 11
Year of Publication: 2013
DOI: 10.5120/12540-9090

Zalhan Mohd Zin, Rubiyah Yusof, Ehsan Mesbahi. Dimension Reduction and Clustering of High Dimensional Data using Auto-Associative Neural Networks. International Journal of Computer Applications. 72, 11 (June 2013), 31-37. DOI=10.5120/12540-9090

@article{ 10.5120/12540-9090,
author = { Zalhan Mohd Zin and Rubiyah Yusof and Ehsan Mesbahi },
title = { Dimension Reduction and Clustering of High Dimensional Data using Auto-Associative Neural Networks },
journal = { International Journal of Computer Applications },
issue_date = { June 2013 },
volume = { 72 },
number = { 11 },
month = { June },
year = { 2013 },
issn = { 0975-8887 },
pages = { 31-37 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume72/number11/12540-9090/ },
doi = { 10.5120/12540-9090 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:37:40.418780+05:30
%A Zalhan Mohd Zin
%A Rubiyah Yusof
%A Ehsan Mesbahi
%T Dimension Reduction and Clustering of High Dimensional Data using Auto-Associative Neural Networks
%J International Journal of Computer Applications
%@ 0975-8887
%V 72
%N 11
%P 31-37
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Capturing and interpreting the information hidden in high-dimensional data is complicated and challenging. Dimension reduction is usually the first step in data analysis and exploration. This paper focuses on reducing the dimension of high-dimensional data using a supervised artificial neural network technique known as the Auto-Associative Neural Network (AANN). The AANN is a powerful tool for data analysis and clustering because it can handle both linear and nonlinear correlations among variables. Owing to its distinctive structure, the technique is also referred to as nonlinear principal component analysis (NLPCA), an encoding-decoding network, or a bottleneck neural network (BNN). It reduces high-dimensional data to a low-dimensional representation at its bottleneck layer, which can later be used for data transmission, clustering, and visualization. In this paper, a structurally flexible AANN is developed in a high-level programming language and applied to two case studies: the Iris flower and Italian olive oil datasets. The purpose of the work was to investigate the ability of the AANN to reduce the dimension of high-dimensional data on a small (Iris) and a large (Olive) dataset. The results show that the AANN was able to compress the high-dimensional data into only one or two nonlinear principal components at its bottleneck layer, with highest accuracies of 98.9% and 82.1% for the Iris and olive oil datasets, respectively. The AANN also performed accurately in both dimension reduction and clustering when using only a small portion of the training dataset.
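
The bottleneck structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a five-layer layout (input, tanh mapping layer, linear bottleneck, tanh demapping layer, linear output), plain batch gradient descent, and a synthetic four-variable dataset standing in for the Iris and olive oil data. All function names, layer sizes, and hyperparameters below are hypothetical choices made for illustration.

```python
import numpy as np

# Assumed layout: layers 0 and 2 (mapping, demapping) use tanh;
# the bottleneck and output layers are linear.
TANH_LAYERS = (0, 2)

def init_aann(n_in, n_hidden, n_bottleneck, rng):
    """Random weights for in -> mapping -> bottleneck -> demapping -> out."""
    sizes = [n_in, n_hidden, n_bottleneck, n_hidden, n_in]
    return [(rng.normal(0.0, 0.5, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(params, X):
    """Return the activations of every layer, starting with the input."""
    acts = [X]
    for i, (W, b) in enumerate(params):
        z = acts[-1] @ W + b
        acts.append(np.tanh(z) if i in TANH_LAYERS else z)
    return acts

def reconstruction_error(params, X):
    """Mean squared difference between the network output and its input."""
    return float(np.mean((forward(params, X)[-1] - X) ** 2))

def train(params, X, lr=0.02, epochs=3000):
    """Batch gradient descent; the training target is the input itself."""
    n = X.shape[0]
    for _ in range(epochs):
        acts = forward(params, X)
        # Gradient of the per-sample squared error, averaged over samples.
        delta = 2.0 * (acts[-1] - X) / n
        for i in reversed(range(len(params))):
            W, b = params[i]
            if i in TANH_LAYERS:
                delta = delta * (1.0 - acts[i + 1] ** 2)  # tanh derivative
            grad_W, grad_b = acts[i].T @ delta, delta.sum(axis=0)
            delta = delta @ W.T  # propagate using the pre-update weights
            params[i] = (W - lr * grad_W, b - lr * grad_b)
    return params

# Synthetic stand-in data: four variables driven by one hidden factor t.
rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, 150)
X = np.column_stack([t, t ** 2, np.sin(2 * t), t ** 3])
X += rng.normal(0.0, 0.02, X.shape)
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each variable

params = init_aann(n_in=4, n_hidden=6, n_bottleneck=1, rng=rng)
err_before = reconstruction_error(params, X)
params = train(params, X)
err_after = reconstruction_error(params, X)

# Bottleneck activations: one nonlinear component per sample.
codes = forward(params, X)[2]
print(f"reconstruction MSE: {err_before:.3f} -> {err_after:.3f}")
```

The `codes` array holds the low-dimensional representation that could then be plotted or passed to a clustering step. Setting `n_bottleneck=2` would give the two-component case mentioned in the abstract.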

Index Terms

Computer Science
Information Sciences

Keywords

Dimension Reduction
Auto-Associative Neural Networks