Web mining of log files using Hadoop MapReduce

Research Article

Web mining of log files using Hadoop MapReduce

Published in April 2012 by Janu Oswal, Poorvi Jain, Rupali Phanase, Shweta Parjane

Emerging Trends in Computer Science and Information Technology (ETCSIT2012)

Foundation of Computer Science USA

ETCSIT - Number 4

April 2012

Authors: Janu Oswal, Poorvi Jain, Rupali Phanase, Shweta Parjane

Janu Oswal, Poorvi Jain, Rupali Phanase, Shweta Parjane. Web mining of log files using Hadoop MapReduce. Emerging Trends in Computer Science and Information Technology (ETCSIT2012). ETCSIT, 4 (April 2012), 39-45.

@article{

author = { Janu Oswal and Poorvi Jain and Rupali Phanase and Shweta Parjane },

title = { Web mining of log files using Hadoop MapReduce },

journal = { Emerging Trends in Computer Science and Information Technology (ETCSIT2012) },

issue_date = { April 2012 },

volume = { ETCSIT },

number = { 4 },

month = { April },

year = { 2012 },

issn = { 0975-8887 },

pages = { 39-45 },

numpages = 7,

url = { /proceedings/etcsit/number4/5990-1042/ },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Proceeding Article

%1 Emerging Trends in Computer Science and Information Technology (ETCSIT2012)

%A Janu Oswal

%A Poorvi Jain

%A Rupali Phanase

%A Shweta Parjane

%T Web mining of log files using Hadoop MapReduce

%J Emerging Trends in Computer Science and Information Technology (ETCSIT2012)

%@ 0975-8887

%V ETCSIT

%N 4

%P 39-45

%D 2012

%I International Journal of Computer Applications

Abstract

Virtual Database Technology (VDB) is one of the effective solutions for integrating data from heterogeneous sources. This integration becomes complex when the database is very large. MapReduce is a framework specifically designed for processing huge datasets on distributed sources. Apache's Hadoop is an implementation of MapReduce. This paper proposes to utilize Hadoop's parallel and distributed processing capability so that the virtual servers respond to region-wise queries. The output shows a graph of the Oracle space required and the Hadoop space required for the project, with the reduced data displayed in a text box.
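To make the map and reduce steps mentioned in the abstract concrete, the following is a minimal single-process sketch in Python of a region-wise count over web log lines. It is not the authors' implementation and does not use Hadoop itself. The log format (`<client-ip> <region> <requested-url>`), the sample data, and the names `map_phase`, `shuffle`, `reduce_phase`, and `run_job` are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical log format (an assumption, not taken from the paper):
#   <client-ip> <region> <requested-url>
SAMPLE_LOG = [
    "10.0.0.1 asia /index.html",
    "10.0.0.2 europe /about.html",
    "10.0.0.3 asia /index.html",
    "malformed-line",
    "10.0.0.4 asia /contact.html",
]


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit (region, 1) for each well-formed log line; skip anything else."""
    parts = line.split()
    if len(parts) != 3:
        return  # malformed line: emit nothing
    _ip, region, _url = parts
    yield region, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group mapper output by key, as the framework does between the phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all counts for one region into a single total."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory log."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    print(run_job(SAMPLE_LOG))
```

On a real cluster, the mapper and reducer would run in parallel on separate nodes over splits of the log file. The sketch only shows how the raw log is reduced to one small record per region, which is the kind of reduced data the abstract refers to.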

Index Terms

Computer Science

Information Sciences

Keywords

Virtual Server, Web Mining, Query Optimization, MapReduce, Hadoop