MapReduce is an efficient framework for the parallel processing of distributed big data in cluster environments. In such clusters, task failures can degrade application performance. Although MapReduce automatically reschedules failed tasks, re-execution starts from scratch and therefore prolongs job completion time. Checkpointing is a valuable technique for avoiding the re-execution of finished work in MapReduce; however, an incorrectly chosen checkpoint interval can still degrade application performance and increase job completion time. In this paper, a checkpoint interval is proposed to avoid re-executing entire tasks after task failures and to reduce job completion time. The proposed interval is based on five parameters: the expected job completion time without checkpointing, checkpoint overhead time, rework time, downtime, and restart time. Experiments show that the proposed checkpoint interval incurs less checkpointing overhead and reduces completion time when failures occur.
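The paper's exact interval formula is not reproduced in this abstract. As a point of reference, a widely used baseline that balances checkpoint overhead against expected rework is Young's first-order approximation; the sketch below is that standard formula, not the paper's proposed method, and the parameter names are illustrative.

```python
import math

def young_checkpoint_interval(checkpoint_overhead_s: float, mtbf_s: float) -> float:
    """Young's first-order approximation of the optimal checkpoint interval.

    NOTE: this is a classic baseline (Young, 1974), not the interval
    derivation proposed in the paper; it trades the per-checkpoint
    overhead against the expected rework lost at a failure.
    """
    return math.sqrt(2.0 * checkpoint_overhead_s * mtbf_s)

# Example: 30 s checkpoint overhead, mean time between failures of 4 hours.
interval = young_checkpoint_interval(30.0, 4 * 3600)
print(round(interval))  # → 930 (seconds between checkpoints)
```

Intuition for the trade-off: checkpointing more often than this wastes time writing state that is rarely needed, while checkpointing less often increases the rework (and hence downtime and restart cost) paid after each failure.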
2018 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC)
Keywords: Checkpointing, Fault tolerance, Fault-tolerant systems, Task analysis, Computational modeling, Big data, Rework, Downtime, Parallel processing, Real-time computing, Distributed computing, Computer science, Google