Research Article

Parallel Data Processing for Effective Dynamic Resource Allocation in the Cloud

by K. Krishna Jyothi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 70 - Number 22
Year of Publication: 2013
Authors: K. Krishna Jyothi
10.5120/12196-7913

K. Krishna Jyothi. Parallel Data Processing for Effective Dynamic Resource Allocation in the Cloud. International Journal of Computer Applications. 70, 22 (May 2013), 1-4. DOI=10.5120/12196-7913

@article{ 10.5120/12196-7913,
author = { K. Krishna Jyothi },
title = { Parallel Data Processing for Effective Dynamic Resource Allocation in the Cloud },
journal = { International Journal of Computer Applications },
issue_date = { May 2013 },
volume = { 70 },
number = { 22 },
month = { May },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-4 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume70/number22/12196-7913/ },
doi = { 10.5120/12196-7913 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:33:30.563815+05:30
%A K. Krishna Jyothi
%T Parallel Data Processing for Effective Dynamic Resource Allocation in the Cloud
%J International Journal of Computer Applications
%@ 0975-8887
%V 70
%N 22
%P 1-4
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Parallel data processing has become increasingly reliable with the advent of cloud computing, especially on IaaS (Infrastructure as a Service) clouds. Cloud service providers such as IBM, Google, Microsoft and Oracle have made provisions for parallel data processing in their cloud services. However, the frameworks currently in use are static and homogeneous in nature and assume a cluster environment. As a result, they allocate resources inefficiently when large jobs are submitted, which increases both processing time and cost. In this paper we discuss the possibilities and challenges of parallel processing in the cloud. We present one IaaS product designed for parallel processing, in which VMs are allocated to tasks dynamically for job execution. With the proposed framework we performed parallel job processing using MapReduce, a recent programming model, and we compare the results with Hadoop.
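
For readers unfamiliar with the MapReduce model named in the abstract, the following is a minimal illustrative sketch in Python. It is not the paper's framework and not Hadoop: all function names are hypothetical, and a local thread pool merely stands in for the dynamically allocated VMs the abstract describes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


def word_count(chunks, workers=4):
    # Each chunk is mapped independently, so the map calls can run in parallel.
    # Assumption: threads on one machine stand in for separate worker VMs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    groups = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in groups.items())


counts = word_count(["cloud map reduce", "map reduce cloud cloud"])
```

Because the map calls share no state, the number of workers can in principle be changed per job without altering the result, which is the property a dynamic allocation scheme relies on.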

Index Terms

Computer Science
Information Sciences

Keywords

Parallel processing, cloud computing, MapReduce, many-task computing