Title
Evaluating Checkpoint Interval for Fault-Tolerance in MapReduce
Abstract
MapReduce is an efficient framework for parallel processing of distributed big data in cluster environments. In such a cluster, task failures can degrade application performance. Although MapReduce automatically reschedules failed tasks, re-execution starts from scratch and therefore prolongs job completion time. Checkpointing is a valuable technique for avoiding re-execution of already finished work in MapReduce. However, an incorrectly chosen checkpoint interval can still degrade the performance of MapReduce applications and increase job completion time. In this paper, a checkpoint interval is proposed to avoid re-execution of whole tasks in case of task failures and to save job completion time. The proposed checkpoint interval is based on five parameters: expected job completion time without checkpointing, checkpoint overhead time, rework time, down time, and restart time. Experiments show that the proposed checkpoint interval incurs less checkpointing overhead and reduces completion time when failures occur.
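The abstract names five parameters but does not state the authors' interval formula. The following is a minimal illustrative sketch, not the paper's method: it assumes a simple additive cost model in which total time is the failure-free time plus checkpointing overhead plus per-failure down, restart, and rework costs, and it picks the candidate interval with the smallest estimate. The function names, the half-interval rework assumption, the candidate intervals, and the sample numbers are all hypothetical.

import math

def estimated_completion_time(T, C, tau, expected_failures, D, R):
    """Estimate job completion time when checkpointing every tau seconds.

    T   -- expected job completion time without checkpointing
    C   -- overhead of writing one checkpoint
    tau -- checkpoint interval under evaluation
    expected_failures -- expected number of task failures during the job
    D   -- down time after a failure
    R   -- restart time after a failure
    """
    num_checkpoints = math.floor(T / tau)     # checkpoints written during the job
    checkpoint_cost = num_checkpoints * C     # total checkpointing overhead
    rework_per_failure = tau / 2.0            # assume half an interval of work is lost on average
    failure_cost = expected_failures * (D + R + rework_per_failure)
    return T + checkpoint_cost + failure_cost

def best_interval(T, C, expected_failures, D, R, candidates):
    """Pick the candidate interval with the smallest estimated completion time."""
    return min(candidates,
               key=lambda tau: estimated_completion_time(T, C, tau, expected_failures, D, R))

if __name__ == "__main__":
    # Hypothetical numbers purely for illustration (seconds).
    T, C, D, R = 3600.0, 10.0, 30.0, 20.0
    failures = 2
    candidates = [60.0, 120.0, 300.0, 600.0, 900.0]
    tau = best_interval(T, C, failures, D, R, candidates)
    print(f"Selected checkpoint interval: {tau:.0f} s")
    print(f"Estimated completion time: "
          f"{estimated_completion_time(T, C, tau, failures, D, R):.0f} s")

Under this toy model, a very short interval inflates checkpointing overhead while a very long one inflates rework after a failure, which is the trade-off the paper's interval selection is addressing.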
Year
2018
DOI
10.1109/CyberC.2018.00046
Venue
2018 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC)
Keywords
Task analysis, Checkpointing, Fault tolerance, Fault tolerant systems, Computational modeling, Big Data, Google
Field
Rework, Scratch, Task analysis, Computer science, Parallel processing, Real-time computing, Fault tolerance, Downtime, Big data, Distributed computing
DocType
Conference
ISSN
2475-7020
ISBN
978-1-7281-0974-9
Citations
0
PageRank
0.34
References
0
Authors
3
Name | Order | Citations | PageRank
Naychi Nway Nway | 1 | 0 | 0.34
Julia Myint | 2 | 0 | 0.34
Ei Chaw Htoon | 3 | 0 | 0.34