Research Article

A Hand to Hand Taxonomical Survey on Web Mining

by Neha Sharma, Sanjay Kumar Dubey
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 60 - Number 3
Year of Publication: 2012
Authors: Neha Sharma, Sanjay Kumar Dubey
10.5120/9670-4091

Neha Sharma, Sanjay Kumar Dubey. A Hand to Hand Taxonomical Survey on Web Mining. International Journal of Computer Applications. 60, 3 (December 2012), 7-14. DOI=10.5120/9670-4091

@article{ 10.5120/9670-4091,
author = { Neha Sharma, Sanjay Kumar Dubey },
title = { A Hand to Hand Taxonomical Survey on Web Mining },
journal = { International Journal of Computer Applications },
issue_date = { December 2012 },
volume = { 60 },
number = { 3 },
month = { December },
year = { 2012 },
issn = { 0975-8887 },
pages = { 7-14 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume60/number3/9670-4091/ },
doi = { 10.5120/9670-4091 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:05:37.510677+05:30
%A Neha Sharma
%A Sanjay Kumar Dubey
%T A Hand to Hand Taxonomical Survey on Web Mining
%J International Journal of Computer Applications
%@ 0975-8887
%V 60
%N 3
%P 7-14
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The application of data mining techniques to the web is referred to as web mining. Websites hold an enormous amount of data, which needs to be handled with the help of different data mining techniques. Searching, pulling data together and analyzing the data are the main focus of web mining. Web mining is applied in e-commerce, e-learning, web search, databases, AI, information retrieval, system improvement, etc. Information extraction from web documents is a typical task and can be done efficiently after a thorough study of mining. This paper aims to facilitate understanding of the concept of web mining by analyzing facts retrieved from various sources. The paper presents a literature survey on web mining. It also gives a detailed view of three kinds of web mining techniques, viz. web content mining, web structure mining and web usage mining. For the survey, different papers are analyzed and then presented as a study of web mining and its subtasks.

Index Terms

Computer Science
Information Sciences

Keywords

Web mining, web content mining, web structure mining, web usage mining, information retrieval, information extraction