Research Article

A Comparison between Different Checkpoint Schemes with Advantages and Disadvantages

Published on April 2014 by Manoj Kumar, Abhishek Choudhary, Vikas Kumar
National Seminar on Recent Advances in Wireless Networks and Communications
Foundation of Computer Science USA
NWNC - Number 3
April 2014
Authors: Manoj Kumar, Abhishek Choudhary, Vikas Kumar

Manoj Kumar, Abhishek Choudhary, Vikas Kumar. A Comparison between Different Checkpoint Schemes with Advantages and Disadvantages. National Seminar on Recent Advances in Wireless Networks and Communications. NWNC, 3 (April 2014), 36-39.

@article{
author = { Manoj Kumar and Abhishek Choudhary and Vikas Kumar },
title = { A Comparison between Different Checkpoint Schemes with Advantages and Disadvantages },
journal = { National Seminar on Recent Advances in Wireless Networks and Communications },
issue_date = { April 2014 },
volume = { NWNC },
number = { 3 },
month = { April },
year = { 2014 },
issn = { 0975-8887 },
pages = { 36-39 },
numpages = 4,
url = { /proceedings/nwnc/number3/16129-1440/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Seminar on Recent Advances in Wireless Networks and Communications
%A Manoj Kumar
%A Abhishek Choudhary
%A Vikas Kumar
%T A Comparison between Different Checkpoint Schemes with Advantages and Disadvantages
%J National Seminar on Recent Advances in Wireless Networks and Communications
%@ 0975-8887
%V NWNC
%N 3
%P 36-39
%D 2014
%I International Journal of Computer Applications
Abstract

Checkpointing and rollback recovery are widely used techniques that allow a distributed computation to progress in spite of a failure. There are two fundamental approaches to checkpointing and recovery. In the asynchronous approach, processes take their checkpoints independently. Taking checkpoints is therefore very simple, but the absence of a recent consistent global checkpoint may cause a rollback of the computation. The synchronous checkpointing approach assumes that a single process, other than the application process, invokes the checkpointing algorithm periodically to determine a consistent global checkpoint. Various flavors of these two techniques, their mechanisms, advantages and drawbacks are discussed in detail. An exhaustive study of the implementation issues is also included. Lastly, some open issues are addressed and certain solutions are proposed by the authors.
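
The rollback problem described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `Message`, `orphans` and `recovery_line` are hypothetical, and the model is an assumption in which each process numbers its own events with a local counter and a checkpoint at time t records every local event up to and including t. A set of checkpoints (one per process) is treated as consistent when no message is recorded as received without also being recorded as sent.

```python
from typing import Dict, List, NamedTuple


class Message(NamedTuple):
    sender: str
    send_time: int   # sender's local event counter at the send
    receiver: str
    recv_time: int   # receiver's local event counter at the receive


def orphans(cut: Dict[str, int], messages: List[Message]) -> List[Message]:
    """Messages whose receive is recorded in the cut but whose send is not."""
    return [m for m in messages
            if m.recv_time <= cut[m.receiver] and m.send_time > cut[m.sender]]


def recovery_line(checkpoints: Dict[str, List[int]],
                  messages: List[Message]) -> Dict[str, int]:
    """Roll back from the latest checkpoints until no orphan message remains.

    Assumes every process has an initial checkpoint at time 0 and that all
    events have local times >= 1, so the loop always terminates.
    """
    cut = {p: max(times) for p, times in checkpoints.items()}
    while True:
        bad = orphans(cut, messages)
        if not bad:
            return cut
        m = bad[0]
        # Undo the receive: move the receiver to its latest earlier checkpoint.
        cut[m.receiver] = max(t for t in checkpoints[m.receiver]
                              if t < m.recv_time)


# Two processes exchanging three messages (illustrative data only).
messages = [
    Message("Q", 1, "P", 1),
    Message("P", 3, "Q", 2),
    Message("Q", 4, "P", 4),
]

# Independent (asynchronous) checkpoints: every rollback exposes a new orphan.
independent = {"P": [0, 2, 5], "Q": [0, 3]}
print(recovery_line(independent, messages))   # {'P': 0, 'Q': 0}

# Same run, but Q also checkpoints at time 4 as part of a coordinated round.
coordinated = {"P": [0, 2, 5], "Q": [0, 3, 4]}
print(recovery_line(coordinated, messages))   # {'P': 5, 'Q': 4}
```

In the first case the independently taken checkpoints force both processes back to their initial state, whereas the coordinated checkpoint in the second case gives a consistent recovery line at the most recent checkpoints.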

References
  1. R. D. Schlichting and F. B. Schneider, "Fail-stop processors: an approach to designing fault-tolerant distributed computing systems", ACM Transactions on Computer Systems, 1 (1985).
  2. H. F. Li, Z. Wei and D. Goswami, "Quasi-atomic recovery for distributed agents", Parallel Computing, 32 (2006).
  3. Y. Luo and D. Manivannan, "FINE: A Fully Informed aNd Efficient communication-induced checkpointing protocol for distributed systems", J. Parallel Distrib. Comput., 69 (2009).
  4. J. T. Rough and A. M. Goscinski, "The development of an efficient checkpointing facility exploiting operating systems services of the GENESIS cluster operating system", Future Generation Computer Systems, 20, 4 (2004).
  5. Bhargava, B. and Shu-Renn, L., "Independent Checkpointing and Concurrent rollback for recovery in distributed Systems-an optimistic approach", in Proceedings of The 17th Symposium on Reliable Distributed Systems, pp. 3-12, Columbus, USA, October 1988.
  6. Partha Sarathi Mandel, Krishnendu Mukhopadhaya, "Performance analysis of different checkpointing and recovery schemes using stochastic model", Journal of Parallel and Distributed Computing, 66(1), pp. 99-107, January 2006.
  7. Y. Manable, "A Distributed Consistent Global Checkpoint Algorithm with minimum number of Checkpoints", Technical Report of IEICE, COMP97-6, April 1997.
  8. S. Monnet, C. Morin, R. Badrinath, "Hybrid checkpointing for Parallel Applications in Cluster Federations", in 4th IEEE/ACM International Symposium on Cluster Computing and the Grid, Chicago, IL, USA, pp. 773-782, April 2004.
  9. P. A. Lee and T. Anderson, Fault Tolerance: Principles and Practice. Springer-Verlag/Wien, 1990.
  10. A. Duda, "The effects of checkpointing on program execution time", Information Processing Letters, 16, pp. 221-229 (1983).
Index Terms

Computer Science
Information Sciences

Keywords

Recovery, Rollback Recovery, Synchronous, Asynchronous, Checkpoint, Cp