UC Berkeley Monitoring Hadoop through Tracing Andy Konwinski and Matei Zaharia.


1 UC Berkeley Monitoring Hadoop through Tracing Andy Konwinski and Matei Zaharia

2 Objectives
Debug and profile data center applications
  – Hadoop file system and map-reduce
  – Apache Nutch web indexing engine
Automatically detect problems from traces

3 State-of-the-Art
Unpublished proprietary log management systems at Google, Yahoo, etc.
Per-machine logs
Sawzall for mining log data
Node monitoring daemon (System Health Infrastructure)

4 Our Idea
Capture causality directly by tracing computations across nodes using X-Trace
Use machine learning to detect problems
  – Detect unusual runs using unsupervised learning
  – Classify problems using supervised learning
Also want to study Hadoop performance
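The "detect unusual runs" step can be illustrated with a small sketch. Everything here is an assumption made for illustration: the report tuple `(op_id, parent_op_id, host)` is a hypothetical simplification and not X-Trace's actual report format, the three features (operation count, causal depth, host count) are guesses at what could be extracted from a trace graph, and a per-feature z-score stands in for whatever unsupervised technique the project ends up using.

```python
from statistics import mean, pstdev


def trace_features(reports):
    """Turn one traced run into a feature vector.

    reports: list of (op_id, parent_op_id or None, host) tuples.
    This tuple layout is hypothetical, not X-Trace's real format.
    Returns [operation count, maximum causal depth, distinct hosts].
    """
    parent_of = {op: parent for op, parent, _host in reports}
    hosts = {host for _op, _parent, host in reports}

    def depth(op):
        # Walk parent links up to the root (assumes a tree, no cycles).
        d = 1
        while parent_of.get(op) is not None:
            op = parent_of[op]
            d += 1
        return d

    max_depth = max(depth(op) for op in parent_of)
    return [len(parent_of), max_depth, len(hosts)]


def unusual_runs(feature_rows, threshold=2.0):
    """Return indices of runs with any feature more than `threshold`
    standard deviations from that feature's mean across all runs."""
    columns = list(zip(*feature_rows))
    stats = [(mean(c), pstdev(c)) for c in columns]
    flagged = []
    for i, row in enumerate(feature_rows):
        for value, (mu, sigma) in zip(row, stats):
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                flagged.append(i)
                break
    return flagged


# Nine similar runs: a root operation fanning out to two children on two hosts.
normal = [("a", None, "h1"), ("b", "a", "h1"), ("c", "a", "h2")]
# One odd run: a ten-operation chain confined to a single host.
odd = [(str(i), str(i - 1) if i > 0 else None, "h1") for i in range(10)]

rows = [trace_features(normal) for _ in range(9)] + [trace_features(odd)]
print(unusual_runs(rows))  # prints [9]
```

A z-score needs no labeled data, which matches the unsupervised setting, but it treats features independently; a clustering or density-based method would be a natural replacement once real trace features are chosen.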

5 Risks
Scaling X-Trace data collection
Analyzing X-Trace reports in real time
Identifying features of X-Trace graphs to run machine learning on
Our manually induced errors may not capture all failures that happen in a production cluster

6 The Plan
Week     Milestone
Oct 15   Deploy Hadoop on I3 cluster
Oct 22   Collect traces for simple failure modes
Oct 29   Finalize tracing granularity / features
Nov 5    Deploy and trace large application
Nov 12   Evaluate machine learning techniques
Nov 19   More machine learning, measure overhead of tracing, wrap-up
Nov 26   Poster session
Dec 3    Final report

