International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 56 - Number 18
Year of Publication: 2012
Authors: T. Vijaya Kumar, H. S. Guruprasad
DOI: 10.5120/9003-2842
T. Vijaya Kumar, H. S. Guruprasad. Clustering Web Usage Data using Concept Hierarchy and Self Organizing Map. International Journal of Computer Applications. 56, 18 (October 2012), 38-44. DOI=10.5120/9003-2842
Clustering Web usage data is one of the important tasks of Web Usage Mining; it helps to find Web user clusters and Web page clusters. Web user clusters identify groups of users exhibiting similar browsing patterns, and Web page clusters provide useful knowledge for personalized Web services. Different types of clustering algorithms, such as partition-based, distance-based, density-based, grid-based, hierarchical and fuzzy clustering algorithms, are used to find clusters in Web usage data. These clustering algorithms require more space and time for larger Web server log files. The K-Means algorithm has been one of the most widely used algorithms for clustering Web usage data because of its computational performance. Although K-Means is relatively fast and efficient compared to other clustering algorithms, it has some major drawbacks. The number of clusters must be specified in advance. The initial cluster centroids are selected randomly, so the clustering result depends on that selection, and different runs on the same input data might produce different results. K-Means is also sensitive to noisy data and outliers. Recent studies have supported the use of neural networks such as Adaptive Resonance Theory (ART) and Self Organizing Maps (SOM) for real-world data mining problems. Among the architectures and algorithms suggested for neural networks, the SOM has the special property of effectively creating spatially organized internal representations of various features of the input data and their abstractions. In this paper we propose a framework that uses SOM to find useful information in Web usage data. First, we construct the sessions using concept hierarchy and link information. Then SOM is used to cluster the sessions. We provide experimental results to show the benefits of using concept hierarchy for synaptic weights and of clustering Web usage data using SOM.
In this paper, we consider the server log files of the website www.enggresources.com for the overall study and analysis.
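To make the SOM clustering step described in the abstract more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each session has already been reduced to a binary vector over top-level concept categories; the category names, the toy sessions, the 2 x 2 map size, and the linearly decaying learning rate and neighborhood radius are all illustrative assumptions that do not come from the paper.

```python
import math
import random

# Hypothetical top-level concept categories (not taken from the paper).
CATEGORIES = ["ebooks", "tutorials", "projects", "placement", "forum"]

def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to x (squared Euclidean)."""
    return min(weights,
               key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)))

def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)          # fixed seed keeps runs reproducible
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2.0     # initial neighborhood radius (assumed)
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = data[:]                # copy so the caller's list is not shuffled
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                   # learning rate decays to 0
            sigma = max(sigma0 * (1.0 - frac), 0.01)  # radius shrinks over time
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])    # pull the node toward the sample
            step += 1
    return weights

# Toy sessions: 1 means the session visited a page under that category.
sessions = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
]

weights = train_som(sessions, rows=2, cols=2)

# Group session indices by the map node they fall on.
clusters = {}
for idx, s in enumerate(sessions):
    clusters.setdefault(best_matching_unit(weights, s), []).append(idx)

for node in sorted(clusters):
    print(node, clusters[node])
```

Unlike K-Means, the map nodes here are tied together by the neighborhood function, so sessions with similar category profiles tend to land on the same or adjacent nodes. The map size still has to be chosen up front, and a real log would need session construction and a much larger map than this toy example.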