Research Article

An Exploratory Survey of Hadoop Log Analysis Tools

by Madhury Mohandas, Dhanya P M
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 75 - Number 18
Year of Publication: 2013
Authors: Madhury Mohandas, Dhanya P M
10.5120/13350-0750

Madhury Mohandas, Dhanya P M. An Exploratory Survey of Hadoop Log Analysis Tools. International Journal of Computer Applications. 75, 18 (August 2013), 33-36. DOI=10.5120/13350-0750

@article{ 10.5120/13350-0750,
author = { Madhury Mohandas and Dhanya P M },
title = { An Exploratory Survey of Hadoop Log Analysis Tools },
journal = { International Journal of Computer Applications },
issue_date = { August 2013 },
volume = { 75 },
number = { 18 },
month = { August },
year = { 2013 },
issn = { 0975-8887 },
pages = { 33-36 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume75/number18/13350-0750/ },
doi = { 10.5120/13350-0750 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:44:36.553756+05:30
%A Madhury Mohandas
%A Dhanya P M
%T An Exploratory Survey of Hadoop Log Analysis Tools
%J International Journal of Computer Applications
%@ 0975-8887
%V 75
%N 18
%P 33-36
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

As clusters become increasingly common in large-scale computing, keeping them healthy is of paramount importance, which makes supervising and monitoring the cluster essential. Many tools have been developed to monitor Hadoop clusters efficiently. Most of them gather the necessary information from each node in the cluster and pass it on for processing, and most are post-execution diagnosis tools. This paper presents an exploratory assessment of the different log analyzers used for failure detection and monitoring in Hadoop.
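
As a rough illustration of the collect-then-analyze pattern the abstract describes, the following Python sketch summarizes log levels per node after a job has run. It is not taken from any of the surveyed tools. The line layout (timestamp, level, logger, message) is an assumption modeled on a common Log4j pattern, and the names `summarize`, `suspect_nodes`, and `SAMPLE_LOGS`, as well as the sample log lines, are hypothetical.

```python
import re
from collections import Counter

# Assumed Log4j-style layout: "<date> <time>,<ms> <LEVEL> <logger>: <message>".
# Real clusters may be configured with a different pattern.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>[\w.$]+): (?P<msg>.*)$"
)


def summarize(node_logs):
    """Count log levels per node from a mapping {node_name: [log lines]}."""
    summary = {}
    for node, lines in node_logs.items():
        counts = Counter()
        for line in lines:
            match = LOG_LINE.match(line)
            if match:  # skip stack-trace continuation lines and other noise
                counts[match.group("level")] += 1
        summary[node] = dict(counts)
    return summary


def suspect_nodes(summary, threshold=1):
    """Return nodes whose ERROR + FATAL count reaches the threshold."""
    return sorted(
        node
        for node, counts in summary.items()
        if counts.get("ERROR", 0) + counts.get("FATAL", 0) >= threshold
    )


# Hypothetical sample data standing in for logs gathered from two nodes.
SAMPLE_LOGS = {
    "node1": [
        "2013-08-01 10:15:02,113 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_1",
        "2013-08-01 10:15:03,480 INFO org.apache.hadoop.mapred.TaskTracker: Task attempt_1 is done",
    ],
    "node2": [
        "2013-08-01 10:15:02,907 WARN org.apache.hadoop.mapred.TaskTracker: Slow heartbeat",
        "2013-08-01 10:15:04,001 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: IOException while writing block",
        "java.io.IOException: Connection reset by peer",
    ],
}

if __name__ == "__main__":
    report = summarize(SAMPLE_LOGS)
    print(report)
    print(suspect_nodes(report))
```

The sketch keeps collection and analysis separate: `summarize` only needs the lines already gathered from each node, so it could run after execution on whatever a collector delivers, which matches the post-execution character of most of the tools discussed.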

Index Terms

Computer Science
Information Sciences

Keywords

Cloud computing, HDFS, Failure monitoring, Hadoop, Log analyzer