International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 115 - Number 8
Year of Publication: 2015
Authors: Jalpa Mehta, Amir Ansari, Aseem Girkar, Ayesha Khanna, Ankit Nagda
10.5120/20175-2376 |
Jalpa Mehta, Amir Ansari, Aseem Girkar, Ayesha Khanna, Ankit Nagda. Trend Analysis based on Access Pattern over Web Logs using Hadoop. International Journal of Computer Applications. 115, 8 (April 2015), 34-37. DOI=10.5120/20175-2376
The continuous growth and extension of the World Wide Web has resulted in log files containing an enormous volume of data. Log files capture traits of user behavior, so it is essential to analyze log data and extract knowledge from it. Web mining techniques primarily focus on deciphering and scrutinizing the navigational behavior of users from various aspects and uncovering the hidden knowledge in these web logs. Because web log files are so large, storage becomes a constraint, and otherwise effective techniques such as virtual databases prove ineffective for this purpose. In contrast, Hadoop offers a large-scale distributed batch processing infrastructure that provides adequate data storage, distributed and parallel processing, process isolation, and fault tolerance in the event of data loss. This paper describes an approach for managing large volumes of web log data using Hadoop MapReduce, which reduces the response time for generating output, loads the log data efficiently, and ensures reliability. The primary focus of the paper is to construct a log analysis system that depicts trends based on users' browsing patterns using Hadoop MapReduce, which facilitates the execution of heterogeneous queries on log files.
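To make the MapReduce idea concrete, the following is a minimal sketch of a page-hit count over web log lines, written as plain Python that mimics the map, shuffle, and reduce phases in a single process. It is not the system built in the paper and does not use Hadoop itself. The log layout (Common Log Format), the sample lines, and all function names (`map_line`, `shuffle`, `reduce_counts`, `page_hit_trend`) are assumptions introduced for illustration only.

```python
import re
from collections import defaultdict

# Assumed log layout (Common Log Format); the abstract does not specify one.
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)'
)


def map_line(line):
    """Map phase: emit (requested_path, 1) for each parseable log line."""
    match = LOG_PATTERN.match(line)
    if match:
        yield match.group(4), 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: sum the per-key counts."""
    return key, sum(values)


def page_hit_trend(lines):
    """Run map, shuffle, and reduce in-process; return pages by hit count."""
    pairs = (pair for line in lines for pair in map_line(line))
    counts = [reduce_counts(k, v) for k, v in shuffle(pairs).items()]
    # Most-requested pages first; ties broken alphabetically by path.
    return sorted(counts, key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample data, including one malformed line that is skipped.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Apr/2015:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [10/Apr/2015:10:00:05 +0000] "GET /products HTTP/1.1" 200 2048',
    '10.0.0.1 - - [10/Apr/2015:10:00:09 +0000] "GET /index.html HTTP/1.1" 200 1024',
    'malformed line',
]

if __name__ == "__main__":
    for path, hits in page_hit_trend(SAMPLE_LOG):
        print(path, hits)
```

On a real cluster the map and reduce functions would run as separate distributed tasks over log blocks stored in HDFS, with the framework performing the shuffle. Other queries (for example hits per client address or per status code) would follow the same shape with a different key emitted by the map step.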