Research Article

Clustering Web Usage Data using Concept Hierarchy and Self Organizing Map

by T. Vijaya Kumar, H. S. Guruprasad
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 56 - Number 18
Year of Publication: 2012
DOI: 10.5120/9003-2842

T. Vijaya Kumar, H. S. Guruprasad. Clustering Web Usage Data using Concept Hierarchy and Self Organizing Map. International Journal of Computer Applications. 56, 18 (October 2012), 38-44. DOI=10.5120/9003-2842

@article{ 10.5120/9003-2842,
author = { T. Vijaya Kumar and H. S. Guruprasad },
title = { Clustering Web Usage Data using Concept Hierarchy and Self Organizing Map },
journal = { International Journal of Computer Applications },
issue_date = { October 2012 },
volume = { 56 },
number = { 18 },
month = { October },
year = { 2012 },
issn = { 0975-8887 },
pages = { 38-44 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume56/number18/9003-2842/ },
doi = { 10.5120/9003-2842 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:59:13.452721+05:30
%A T. Vijaya Kumar
%A H. S. Guruprasad
%T Clustering Web Usage Data using Concept Hierarchy and Self Organizing Map
%J International Journal of Computer Applications
%@ 0975-8887
%V 56
%N 18
%P 38-44
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Clustering Web usage data is one of the important tasks of Web Usage Mining; it helps to find Web user clusters and Web page clusters. Web user clusters group users who exhibit similar browsing patterns, and Web page clusters provide useful knowledge for personalized Web services. Different types of clustering algorithms, such as partition-based, distance-based, density-based, grid-based, hierarchical, and fuzzy clustering algorithms, are used to find clusters in Web usage data. These algorithms require more space and time for larger Web server log files. The K-Means algorithm has been one of the most widely used algorithms for clustering Web usage data because of its computational performance. Although K-Means is relatively fast and efficient compared to other clustering algorithms, it has some major drawbacks. The number of clusters must be specified in advance. The initial cluster centroids are selected randomly, so the clustering result depends on that selection, and different runs on the same input data might produce different results. K-Means is also sensitive to noisy data and outliers. Recent studies have supported the use of neural networks such as Adaptive Resonance Theory (ART) and Self Organizing Maps (SOM) for real-world data mining problems. Among the architectures and algorithms suggested for neural networks, the SOM has the special property of effectively creating spatially organized internal representations of various features of input data and their abstractions. In this paper, we propose a framework that uses SOM to find useful information in Web usage data. First, we construct the sessions using concept hierarchy and link information. Then SOM is used to cluster the sessions. We provide experimental results to show the benefits of using concept hierarchy for synaptic weights and of clustering Web usage data using SOM.
In this paper, we have considered the server log files of the Website www.enggresources.com for overall study and analysis.
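
The session-clustering step described in the abstract can be illustrated with a minimal SOM sketch in Python. This is not the authors' implementation: the function names `train_som` and `best_matching_unit`, the toy session vectors (one entry per hypothetical concept category), the map size, and the decay schedules are all assumptions made for illustration. In particular, the weights here start from random values, whereas the paper reports using the concept hierarchy for the synaptic weights.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the unit whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM and return {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Assumption: random initial weights (the paper derives them differently).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
        order = list(vectors)
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighbourhood measured on the map grid, not in input space.
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Hypothetical sessions: 1 means the session visited pages in that category.
sessions = [
    [1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0],
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1],
]
som = train_som(sessions, rows=2, cols=2)

# Group session indices by the map unit each session is assigned to.
clusters = {}
for idx, session in enumerate(sessions):
    clusters.setdefault(best_matching_unit(som, session), []).append(idx)
```

Unlike K-Means, the number of clusters is not fixed directly here; it is bounded by the number of map units, and neighbouring units tend to hold similar sessions. The result still depends on the random seed in this sketch.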

Index Terms

Computer Science
Information Sciences

Keywords

Concept hierarchy, Web usage mining, Concept based Website graph, Self-Organizing Maps, Synaptic weight vector