Research Article

An Efficient Algorithm for Data Cleaning of Log File using File Extensions

by Surbhi Anand, Rinkle Rani Aggarwal
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 48 - Number 8
Year of Publication: 2012
Authors: Surbhi Anand, Rinkle Rani Aggarwal
10.5120/7367-0097

Surbhi Anand, Rinkle Rani Aggarwal. An Efficient Algorithm for Data Cleaning of Log File using File Extensions. International Journal of Computer Applications. 48, 8 (June 2012), 13-18. DOI=10.5120/7367-0097

@article{ 10.5120/7367-0097,
author = { Surbhi Anand and Rinkle Rani Aggarwal },
title = { An Efficient Algorithm for Data Cleaning of Log File using File Extensions },
journal = { International Journal of Computer Applications },
issue_date = { June 2012 },
volume = { 48 },
number = { 8 },
month = { June },
year = { 2012 },
issn = { 0975-8887 },
pages = { 13-18 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume48/number8/7367-0097/ },
doi = { 10.5120/7367-0097 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:43:32.775455+05:30
%A Surbhi Anand
%A Rinkle Rani Aggarwal
%T An Efficient Algorithm for Data Cleaning of Log File using File Extensions
%J International Journal of Computer Applications
%@ 0975-8887
%V 48
%N 8
%P 13-18
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The World Wide Web is a huge repository of web pages that provides Internet users with large amounts of information. As websites have grown in number and complexity, the Web has become massive. Web Usage Mining is a branch of web mining that applies mining techniques to web server logs in order to extract the behavior of users. A Web Usage Mining process comprises three phases: data preprocessing, pattern discovery, and pattern analysis. Data preprocessing tasks are carried out prior to the application of mining algorithms. Preprocessing translates the raw data collected from server log files into a useful data abstraction. Proper analysis of a web server log helps manage websites efficiently from both the administrative and the users' perspectives. Preprocessing results also strongly influence the later phases of Web Usage Mining, which makes the preprocessing of server log files a significant step. This paper focuses on the Web Usage Mining process and explores the field of data cleaning.
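To make the idea of extension-based data cleaning concrete, the following is a minimal Python sketch and not the authors' algorithm, which the abstract does not spell out. It assumes log lines in the Common Log Format and an illustrative set of extensions to discard; the names `IRRELEVANT_EXTENSIONS`, `requested_extension`, and `clean_log` are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Assumed list of extensions for embedded resources (images, styles, scripts).
# The extensions actually used in the paper are not given in the abstract.
IRRELEVANT_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico"}

# Matches the quoted request field of a Common Log Format line,
# e.g. "GET /index.html HTTP/1.1".
REQUEST_PATTERN = re.compile(r'"(?P<method>[A-Z]+) (?P<url>\S+) [^"]*"')


def requested_extension(log_line):
    """Return the lowercased file extension of the requested resource.

    Returns "" when the resource has no extension and None when the
    line has no recognizable request field.
    """
    match = REQUEST_PATTERN.search(log_line)
    if match is None:
        return None
    path = urlsplit(match.group("url")).path  # drops any query string
    filename = path.rsplit("/", 1)[-1]
    dot = filename.rfind(".")
    return filename[dot:].lower() if dot != -1 else ""


def clean_log(lines):
    """Keep only entries whose requested resource is not an irrelevant file."""
    kept = []
    for line in lines:
        extension = requested_extension(line)
        if extension is None or extension in IRRELEVANT_EXTENSIONS:
            continue  # skip malformed lines and embedded resources
        kept.append(line)
    return kept
```

Calling `clean_log` on the lines of a server log returns the entries that would be passed on to later preprocessing steps. A real cleaning stage would typically also filter on status codes, request methods, and robot traffic, which this sketch leaves out.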

Index Terms

Computer Science
Information Sciences

Keywords

World Wide Web, Preprocessing, Web Usage Mining, Data Cleaning, Web Server Logs