Research Article

Big Data: Does it Call for Distributed File System

by Komal Verma, Rajiv Pandey, Arpit Gupta
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 131 - Number 13
Year of Publication: 2015
Authors: Komal Verma, Rajiv Pandey, Arpit Gupta
10.5120/ijca2015907505

Komal Verma, Rajiv Pandey, Arpit Gupta. Big Data: Does it Call for Distributed File System. International Journal of Computer Applications. 131, 13 (December 2015), 12-17. DOI=10.5120/ijca2015907505

@article{ 10.5120/ijca2015907505,
author = { Komal Verma and Rajiv Pandey and Arpit Gupta },
title = { Big Data: Does it Call for Distributed File System },
journal = { International Journal of Computer Applications },
issue_date = { December 2015 },
volume = { 131 },
number = { 13 },
month = { December },
year = { 2015 },
issn = { 0975-8887 },
pages = { 12-17 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume131/number13/23508-2015907505/ },
doi = { 10.5120/ijca2015907505 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:27:13.961164+05:30
%A Komal Verma
%A Rajiv Pandey
%A Arpit Gupta
%T Big Data: Does it Call for Distributed File System
%J International Journal of Computer Applications
%@ 0975-8887
%V 131
%N 13
%P 12-17
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Companies have come to recognize the importance of large data sets in supporting decisions that yield strategic advantage. Case studies indicate that “Large data usually demands faster processing”. As a result, companies now invest more in processing larger data sets than in expensive algorithms. A larger amount of data gives a better basis for inference in decision making, but working with it is challenging because of processing limitations. Big Data, HDFS and other file systems were introduced to manage and use this volume of content systematically. The term Big Data is used for ‘larger data sets having more varied and complex structure, having problems in analyzing, visualizing and storing for further processing’. The process of examining such large amounts of data in order to reveal hidden patterns and secret correlations is called Big Data Analytics. The information it yields helps companies and organizations gain richer and deeper insights and an advantage over the competition. A Big Data implementation therefore needs to be analyzed and executed as accurately as possible. This term paper gives an overview of what Big Data is, its classification, the challenges it faces, the need for a Distributed File System, Hadoop and its components (the Hadoop Distributed File System and MapReduce), and the application of HDFS in Cloud Computing.

Index Terms

Computer Science
Information Sciences

Keywords

Big Data, Hadoop, Cloud Computing