International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 95 - Number 15
Year of Publication: 2014
Authors: Sesham Anand, P. Padmanabham, A. Govardhan
DOI: 10.5120/16673-6677
Sesham Anand, P. Padmanabham, A. Govardhan. Application of Factor Analysis to k-means Clustering Algorithm on Transportation Data. International Journal of Computer Applications. 95, 15 (June 2014), 40-46. DOI=10.5120/16673-6677
Factor Analysis is a useful linear algebra technique for dimensionality reduction. It is also used for data compression and for visualizing high-dimensional datasets. The technique tries to identify, from a large set of variables, a reduced set of components that summarizes the original data. It does so by identifying groups of variables that are strongly intercorrelated. The original variables are thereby transformed into a smaller set of components, each representing a group of strongly linearly correlated variables. Applying several data analysis techniques, such as Principal Components Analysis (PCA), Factor Analysis, and cluster analysis, may give insight into the patterns present in the data, but the techniques may also give different results. The aim of this work is to study the use of Factor Analysis (FA) in capturing the cluster structures in transportation (HIS) data. It is proposed to compare the clustering obtained from the original data with that obtained from the factor scores. The steps involved in preprocessing the transportation data are also illustrated.
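The comparison proposed in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn and NumPy, uses synthetic data as a stand-in for the transportation survey data (which is not reproduced here), and the helper name `compare_clusterings`, the number of factors, the number of clusters, and the use of the adjusted Rand index as the agreement measure are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import FactorAnalysis
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler


def compare_clusterings(X, n_factors, n_clusters, seed=0):
    """Run k-means on standardized data and on its factor scores.

    Returns both label vectors and the adjusted Rand index between them.
    (Hypothetical helper; not taken from the paper.)
    """
    # Standardize so that every variable contributes on the same scale.
    X_std = StandardScaler().fit_transform(X)

    # Factor scores: one row per observation, one column per factor.
    fa = FactorAnalysis(n_components=n_factors, random_state=seed)
    scores = fa.fit_transform(X_std)

    # Cluster the original (standardized) variables and the factor scores.
    labels_raw = KMeans(n_clusters=n_clusters, n_init=10,
                        random_state=seed).fit_predict(X_std)
    labels_fa = KMeans(n_clusters=n_clusters, n_init=10,
                       random_state=seed).fit_predict(scores)

    # Agreement between the two partitions (1.0 means identical).
    ari = adjusted_rand_score(labels_raw, labels_fa)
    return labels_raw, labels_fa, ari


# Synthetic stand-in data: 2 latent factors drive 6 observed variables,
# with 3 groups of 100 observations each in the latent space.
rng = np.random.default_rng(0)
centers = np.array([[-3.0, -3.0], [3.0, 3.0], [-3.0, 3.0]])
latent = np.vstack([c + rng.normal(size=(100, 2)) for c in centers])
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + 0.3 * rng.normal(size=(300, 6))

labels_raw, labels_fa, ari = compare_clusterings(X, n_factors=2, n_clusters=3)
print(f"Adjusted Rand index between the two clusterings: {ari:.3f}")
```

An adjusted Rand index close to 1 would suggest that the factor scores preserve the cluster structure found in the original variables; a low value would indicate that the two analyses partition the data differently, which is the kind of discrepancy the abstract anticipates.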