A Hybrid Fault Tolerance System for Distributed Environment using Check Point Mechanism and Replication

S. Veerapandi; S. Gavaskar; A. Sumithra

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 157 - Number 1

Year of Publication: 2017

Authors: S. Veerapandi, S. Gavaskar, A. Sumithra

10.5120/ijca2017912614

S. Veerapandi, S. Gavaskar, A. Sumithra. A Hybrid Fault Tolerance System for Distributed Environment using Check Point Mechanism and Replication. International Journal of Computer Applications. 157, 1 (Jan 2017), 43-48. DOI=10.5120/ijca2017912614

@article{10.5120/ijca2017912614,
  author = {S. Veerapandi and S. Gavaskar and A. Sumithra},
  title = {A Hybrid Fault Tolerance System for Distributed Environment using Check Point Mechanism and Replication},
  journal = {International Journal of Computer Applications},
  issue_date = {Jan 2017},
  volume = {157},
  number = {1},
  month = {Jan},
  year = {2017},
  issn = {0975-8887},
  pages = {43-48},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume157/number1/26799-2017912614/},
  doi = {10.5120/ijca2017912614},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T00:02:48.653717+05:30
%A S. Veerapandi
%A S. Gavaskar
%A A. Sumithra
%T A Hybrid Fault Tolerance System for Distributed Environment using Check Point Mechanism and Replication
%J International Journal of Computer Applications
%@ 0975-8887
%V 157
%N 1
%P 43-48
%D 2017
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Protecting a distributed environment against failures has become increasingly important. Many techniques have been developed so far, and each has its own merits and demerits. The efficiency of such an algorithm depends on how much replication is performed and on the extent to which fault tolerance is achieved. We propose a new method that uses both checkpointing and replication to ensure consistency in a distributed environment. Our method is also easy to implement.
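
The abstract does not describe the algorithm itself, so the following is only a minimal sketch of how checkpointing and replication can be combined in general, not the authors' method. All names (`Replica`, `Primary`, `recover`) and the fixed checkpoint interval are assumptions made for illustration: a primary snapshots its state every few updates, copies the snapshot to each replica, and after a failure a new primary is rebuilt from the newest snapshot any replica holds.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical backup node that keeps the latest checkpoint it received."""
    name: str
    checkpoint: dict = field(default_factory=dict)
    version: int = 0

    def store(self, version: int, state: dict) -> None:
        # Keep only the newest checkpoint; deep-copy so later writes on the
        # primary cannot mutate the saved snapshot.
        if version > self.version:
            self.version = version
            self.checkpoint = copy.deepcopy(state)


@dataclass
class Primary:
    """Hypothetical worker that checkpoints after every `interval` updates."""
    replicas: list
    interval: int = 3  # assumed value, not taken from the paper
    state: dict = field(default_factory=dict)
    updates: int = 0
    version: int = 0

    def apply(self, key: str, value: int) -> None:
        self.state[key] = value
        self.updates += 1
        if self.updates % self.interval == 0:
            self.take_checkpoint()

    def take_checkpoint(self) -> None:
        # Checkpoint step: bump the version, then replicate the snapshot.
        self.version += 1
        for replica in self.replicas:
            replica.store(self.version, self.state)


def recover(replicas: list, interval: int = 3) -> Primary:
    """Rebuild a primary from the newest checkpoint held by any replica."""
    newest = max(replicas, key=lambda r: r.version)
    return Primary(
        replicas=replicas,
        interval=interval,
        state=copy.deepcopy(newest.checkpoint),
        version=newest.version,
    )
```

In this sketch, updates applied after the last checkpoint are lost on failure. A shorter interval loses less work but replicates more often, which is the trade-off between replication cost and achieved fault tolerance that the abstract refers to.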

Index Terms

Computer Science

Information Sciences

Keywords

FTPA, PLR, GiFT