Research Article

FAR: Dynamic Replication Strategy for Data Grid

Published in 2011 by Kavita Jain, Varsha Wangikar, Seema Shah
International Conference and Workshop on Emerging Trends in Technology
Foundation of Computer Science USA
ICWET - Number 4
2011
Authors: Kavita Jain, Varsha Wangikar, Seema Shah

Kavita Jain, Varsha Wangikar, Seema Shah. FAR: Dynamic Replication Strategy for Data Grid. International Conference and Workshop on Emerging Trends in Technology. ICWET, 4 (2011), 21-26.

@article{
author = { Kavita Jain, Varsha Wangikar, Seema Shah },
title = { FAR: Dynamic Replication Strategy for Data Grid },
journal = { International Conference and Workshop on Emerging Trends in Technology },
issue_date = { 2011 },
volume = { ICWET },
number = { 4 },
year = { 2011 },
issn = 0975-8887,
pages = { 21-26 },
numpages = 6,
url = { /proceedings/icwet/number4/2083-algo71/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference and Workshop on Emerging Trends in Technology
%A Kavita Jain
%A Varsha Wangikar
%A Seema Shah
%T FAR: Dynamic Replication Strategy for Data Grid
%J International Conference and Workshop on Emerging Trends in Technology
%@ 0975-8887
%V ICWET
%N 4
%P 21-26
%D 2011
%I International Journal of Computer Applications
Abstract

Grid computing is gradually emerging as a new paradigm for next-generation computing. It enables sharing, selection, and aggregation of geographically distributed homogeneous and heterogeneous resources for solving large-scale problems in science, engineering, and commerce. Most organizations have large amounts of underutilized computing power and storage: the CPU and storage utilization of most desktop machines is very low. On the other hand, many high-performance applications require large amounts of computational power and storage space. Grid computing provides a framework for exploiting these underutilized resources and thus increases the efficiency of resource usage. Nowadays, many commercial, business, and research institutions produce huge amounts of data and need to store it on the secondary storage of machines. The users of this data are geographically distributed and want to collaborate on the same problem. Data grids focus on providing secure access to distributed, heterogeneous pools of data. They harness data, storage, and network resources located in distinct administrative domains and provide high-speed, reliable access to data. Data access can be optimized through data replication, whereby identical copies of data are generated and stored at various sites. A good replication strategy should ideally minimize the mean execution time of all jobs on the Grid and reduce access time while optimizing resource usage. Hence, this paper focuses on improving data grid performance. We first present a detailed analysis of various replication strategies, such as No replication, Always replication, and the Economic model, as simulated by OptorSim. We then propose a dynamic replication strategy that switches between No replication and Always replication based on file size and type of access.
We argue that the proposed replication algorithm, based on file size and type of access, will minimize the access latencies and execution time of jobs on the Grid. In the next phase, we shall implement the algorithm on a simulator and observe its performance.
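
The switching idea described above can be illustrated with a minimal Python sketch. The abstract does not state the decision rule or any threshold, so everything specific here is an assumption made for illustration: the names `FileRequest`, `choose_strategy`, and `SIZE_THRESHOLD_MB`, the 1000 MB cut-off, and the rule that only read accesses to files at or below the cut-off are replicated. It is not the authors' FAR algorithm.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    READ = "read"
    WRITE = "write"


class Strategy(Enum):
    NO_REPLICATION = "no_replication"
    ALWAYS_REPLICATION = "always_replication"


@dataclass(frozen=True)
class FileRequest:
    """A job's request for one file (hypothetical structure)."""
    name: str
    size_mb: float
    access: Access


# Hypothetical cut-off: the abstract gives no value for the file size limit.
SIZE_THRESHOLD_MB = 1000.0


def choose_strategy(request: FileRequest,
                    threshold_mb: float = SIZE_THRESHOLD_MB) -> Strategy:
    """Pick a strategy per request from file size and type of access.

    Assumed rule (not taken from the paper): replicate files that are
    read and are no larger than the threshold; otherwise access the
    file remotely without creating a replica.
    """
    if request.access is Access.READ and request.size_mb <= threshold_mb:
        return Strategy.ALWAYS_REPLICATION
    return Strategy.NO_REPLICATION


if __name__ == "__main__":
    requests = [
        FileRequest("small_read.dat", 200.0, Access.READ),
        FileRequest("large_read.dat", 5000.0, Access.READ),
        FileRequest("small_write.dat", 200.0, Access.WRITE),
    ]
    for req in requests:
        print(req.name, choose_strategy(req).value)
```

Deciding per request keeps the two base strategies unchanged and isolates the switching policy in one function, so a different rule or threshold can be tried without touching the rest of a simulation.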

Index Terms

Computer Science
Information Sciences

Keywords

Grid computing, Data grid, Data replication, Simulator, FAR algorithm