Research Article

A Damage Assessment Model in a Distributed System

by Parimal Kumar Giri, Satya Ranjan Mohapatra
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 68 - Number 23
Year of Publication: 2013
Authors: Parimal Kumar Giri, Satya Ranjan Mohapatra
10.5120/11717-7277

Parimal Kumar Giri, Satya Ranjan Mohapatra. A Damage Assessment Model in a Distributed System. International Journal of Computer Applications. 68, 23 (April 2013), 6-12. DOI=10.5120/11717-7277

@article{ 10.5120/11717-7277,
author = { Parimal Kumar Giri and Satya Ranjan Mohapatra },
title = { A Damage Assessment Model in a Distributed System },
journal = { International Journal of Computer Applications },
issue_date = { April 2013 },
volume = { 68 },
number = { 23 },
month = { April },
year = { 2013 },
issn = { 0975-8887 },
pages = { 6-12 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume68/number23/11717-7277/ },
doi = { 10.5120/11717-7277 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:28:57.431841+05:30
%A Parimal Kumar Giri
%A Satya Ranjan Mohapatra
%T A Damage Assessment Model in a Distributed System
%J International Journal of Computer Applications
%@ 0975-8887
%V 68
%N 23
%P 6-12
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Error propagation is unavoidable because detection mechanisms are imperfect and inter-process communication is random; it can give rise to contaminated checkpoints, which in turn result in unsuccessful rollbacks. To counter the problem of error propagation, a damage assessment model is discussed that optimizes the correctness of saved checkpoints under various circumstances. The algorithm is based on equivalence classes between pairs of successive checkpoints of a process, which allow us, in some cases, to advance the recovery line of the computation without forcing checkpoints in other processes. This makes it well suited to autonomous and heterogeneous environments, where each process knows no private information about other processes and private information of the same type held by distinct processes is unrelated.
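
To make the notion of a contaminated checkpoint concrete, the following is a minimal sketch of how an undetected error might spread through messages and taint later checkpoints. It is not the authors' damage assessment model. The event format, the function name `assess_damage`, and the assumption that every message from a corrupted process corrupts its receiver are all hypothetical simplifications chosen for illustration.

```python
def assess_damage(events):
    """Classify checkpoints as clean or contaminated.

    `events` is a list of tuples in global time order (an assumed,
    simplified trace format, not taken from the paper):
      ("error", proc)         an undetected error corrupts proc's state
      ("msg", src, dst)       src sends a message that dst receives
      ("ckpt", proc, label)   proc saves a checkpoint named label
    """
    contaminated = set()   # processes whose current state is corrupted
    latest_clean = {}      # proc -> label of its most recent clean checkpoint
    dirty = []             # labels of contaminated checkpoints, in order

    for event in events:
        kind = event[0]
        if kind == "error":
            contaminated.add(event[1])
        elif kind == "msg":
            _, src, dst = event
            # Pessimistic assumption: any message from a corrupted
            # sender corrupts the receiver.
            if src in contaminated:
                contaminated.add(dst)
        elif kind == "ckpt":
            _, proc, label = event
            if proc in contaminated:
                dirty.append(label)
            else:
                latest_clean[proc] = label
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return latest_clean, dirty


events = [
    ("ckpt", "P1", "C1,1"),
    ("ckpt", "P2", "C2,1"),
    ("error", "P1"),          # error occurs but is not detected yet
    ("msg", "P1", "P2"),      # the error propagates to P2
    ("ckpt", "P1", "C1,2"),   # saved from a corrupted state
    ("ckpt", "P2", "C2,2"),   # saved from a corrupted state
    ("msg", "P3", "P1"),      # clean sender, no new contamination
    ("ckpt", "P3", "C3,1"),
]
latest_clean, dirty = assess_damage(events)
print(latest_clean)
print(dirty)
```

In this trace, rolling P1 or P2 back to their second checkpoints would restore a corrupted state, which is the unsuccessful rollback the abstract describes. The sketch only labels checkpoints; it does not check that the latest clean checkpoints form a consistent recovery line, which a real scheme would also have to establish.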

Index Terms

Computer Science
Information Sciences

Keywords

Contaminated Check-points
Error Propagation
Nonlinear Integer Programming
Damage Assessment Model
Global Checkpoint
Rollback Recovery
Index-based Check-pointing