Title
Visual, Log-Based Causal Tracing for Performance Debugging of MapReduce Systems
Abstract
The distributed nature and large scale of MapReduce programs and systems pose two challenges in using existing profiling and debugging tools to understand MapReduce programs. Existing tools produce too much information because of the large scale of MapReduce programs, and they do not expose program behaviors in terms of Maps and Reduces. We have developed a novel non-intrusive log-analysis technique that extracts state-machine views of the control- and data-flows in MapReduce behavior from the native logs of Hadoop MapReduce systems and synthesizes these views to create a unified, causal view of MapReduce program behavior. This technique enables us to visualize MapReduce programs in terms of MapReduce-specific behaviors, aiding operators in reasoning about and debugging performance problems in MapReduce systems. We validate our technique and visualizations using a real-world workload, showing how to understand the structure and performance behavior of MapReduce jobs and diagnose injected performance problems reproduced from real-world problems.
Year
2010
DOI
10.1109/ICDCS.2010.63
Venue
ICDCS
Keywords
finite state machines,state-machine extraction,failure diagnosis,distributed nature,mapreduce program,large scale,distributed systems,mapreduce systems,program debugging,visualization,novel non-intrusive log-analysis technique,performance problem,data visualisation,program behaviors,log based causal tracing,nonintrusive log analysis technique,mapreduce job,hadoop mapreduce system,performance debugging,log-based causal tracing,cloud computing,performance behavior,debugging tools,debugging performance problem,mapreduce program behavior,mapreduce system,process control,java,statistical distributions,distributed computing,distributed system,state machine,debugging,data mining,data visualization,data flow
Field
Data visualization,Yarn,Profiling (computer programming),Visualization,Computer science,Finite-state machine,Tracing,Debugging,Cloud computing,Distributed computing
DocType
Conference
ISSN
1063-6927
ISBN
978-1-4244-7261-1
Citations
30
PageRank
1.36
References
22
Authors
4
Name              Order  Citations  PageRank
Jiaqi Tan         1      412        25.57
Soila Kavulya     2      301        16.27
Rajeev Gandhi     3      655        48.08
Priya Narasimhan  4      1326       111.22