Research Article

Parallel Data Processing in the Cloud using Nephele

by Mayura D. Tapkire, B. M. Patil, V. M. Chandode
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 69 - Number 17
Year of Publication: 2013
Authors: Mayura D. Tapkire, B. M. Patil, V. M. Chandode
10.5120/12060-7527

Mayura D. Tapkire, B. M. Patil, V. M. Chandode. Parallel Data Processing in the Cloud using Nephele. International Journal of Computer Applications. 69, 17 (May 2013), 1-8. DOI=10.5120/12060-7527

@article{ 10.5120/12060-7527,
author = { Mayura D. Tapkire and B. M. Patil and V. M. Chandode },
title = { Parallel Data Processing in the Cloud using Nephele },
journal = { International Journal of Computer Applications },
issue_date = { May 2013 },
volume = { 69 },
number = { 17 },
month = { May },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-8 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume69/number17/12060-7527/ },
doi = { 10.5120/12060-7527 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:30:29.558103+05:30
%A Mayura D. Tapkire
%A B. M. Patil
%A V. M. Chandode
%T Parallel Data Processing in the Cloud using Nephele
%J International Journal of Computer Applications
%@ 0975-8887
%V 69
%N 17
%P 1-8
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In recent years, Infrastructure-as-a-Service (IaaS) clouds have become increasingly popular as a flexible and inexpensive platform for ad-hoc parallel data processing. Major cloud computing providers have started to integrate frameworks for parallel data processing into their product portfolios, making it easy for customers to access these services and deploy their programs. However, the processing frameworks currently in use were designed for static, homogeneous cluster systems and do not support the new features that distinguish the cloud platform. This paper discusses the research project Nephele. Nephele is the first data processing framework to explicitly exploit the dynamic resource allocation offered by today's IaaS clouds for both task scheduling and execution. First performance results of Nephele are presented, and its efficiency is compared with that of the well-known MapReduce framework. MapReduce is chosen for comparison because it is available as open source software and currently enjoys high popularity in the data processing community.

Index Terms

Computer Science
Information Sciences

Keywords

Cloud computing, parallel data processing, Nephele, MapReduce