Research Article

A Top-Down Algorithm for Mining Maximal Traversal Paths in Web Log Sessions

Published in 2011 by M. Thilagu, R. Nadarajan, R. Jeevitha
Computational Science - New Dimensions & Perspectives
Foundation of Computer Science USA
NCCSE - Number 2
2011
Authors: M. Thilagu, R. Nadarajan, R. Jeevitha

M. Thilagu, R. Nadarajan, R. Jeevitha. A Top-Down Algorithm for Mining Maximal Traversal Paths in Web Log Sessions. Computational Science - New Dimensions & Perspectives. NCCSE, 2 (2011), 66-70.

@article{
author = { M. Thilagu and R. Nadarajan and R. Jeevitha },
title = { A Top-Down Algorithm for Mining Maximal Traversal Paths in Web Log Sessions },
journal = { Computational Science - New Dimensions & Perspectives },
issue_date = { 2011 },
volume = { NCCSE },
number = { 2 },
year = { 2011 },
issn = {0975-8887},
pages = { 66-70 },
numpages = 5,
url = { /specialissues/nccse/number2/1862-164/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Special Issue Article
%1 Computational Science - New Dimensions & Perspectives
%A M. Thilagu
%A R. Nadarajan
%A R. Jeevitha
%T A Top-Down Algorithm for Mining Maximal Traversal Paths in Web Log Sessions
%J Computational Science - New Dimensions & Perspectives
%@ 0975-8887
%V NCCSE
%N 2
%P 66-70
%D 2011
%I International Journal of Computer Applications
Abstract

Mining frequent traversal paths in web logs is an application of sequence mining and is useful in applications such as web recommendation, caching, and pre-fetching. Most existing algorithms follow a bottom-up approach to mine sequential patterns in a database. This paper presents a fast top-down algorithm for discovering maximal traversal paths, which are contiguous sequences in web log session sequences. The algorithm avoids candidate sequence generation and searches only maximal potential patterns in a minimized search space during the mining process. Experimental results show that the proposed algorithm can perform better than an existing approach.
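
To make the notion of a maximal traversal path concrete, the following is a minimal Python sketch that mines maximal frequent contiguous paths from a small set of sessions by scanning lengths from longest to shortest. It is a naive illustration only and is not the algorithm proposed in the paper: it counts every contiguous subpath, whereas the paper's method avoids candidate generation and prunes the search space. The function names, the `min_support` parameter (an absolute session count), and the sample sessions are all assumptions made for this example.

```python
from collections import Counter


def contiguous_subpaths(session, length):
    """Yield each distinct contiguous subpath of `length`, once per session."""
    seen = set()
    for i in range(len(session) - length + 1):
        path = tuple(session[i:i + length])
        if path not in seen:
            seen.add(path)
            yield path


def is_contained(path, longer):
    """Return True if `path` occurs contiguously inside `longer`."""
    n = len(path)
    return any(longer[i:i + n] == path for i in range(len(longer) - n + 1))


def maximal_traversal_paths(sessions, min_support):
    """Naive top-down scan (illustrative, not the paper's method).

    Lengths are tried from the longest session downward. A path is kept
    when at least `min_support` sessions contain it and no already
    accepted (longer) path contains it.
    """
    maximal = []
    longest = max((len(s) for s in sessions), default=0)
    for length in range(longest, 0, -1):
        counts = Counter()
        for session in sessions:
            counts.update(contiguous_subpaths(session, length))
        for path, support in sorted(counts.items()):
            if support >= min_support and not any(
                is_contained(path, m) for m in maximal
            ):
                maximal.append(path)
    return maximal


# Hypothetical sessions: each list is the ordered pages of one visit.
sessions = [
    ["A", "B", "C", "D"],
    ["A", "B", "C"],
    ["B", "C", "D"],
    ["A", "B", "E"],
]
print(maximal_traversal_paths(sessions, min_support=2))
# prints [('A', 'B', 'C'), ('B', 'C', 'D')]
```

Because lengths are processed in descending order, every frequent path that has a frequent super-path is already covered by an accepted longer path, so only maximal paths are reported. Shorter frequent paths such as `('A', 'B')` are omitted here because they lie inside `('A', 'B', 'C')`.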

Index Terms

Computer Science
Information Sciences

Keywords

Sequence Database, Contiguous Sequence, Maximal Potential Pattern, Maximal Traversal Path