Edinburgh Napier University

An Optimized Speculative Execution Strategy Based on Local Data Prediction in a Heterogeneous Hadoop Environment
Edinburgh Napier University
Xiaodong Liu and Qi Liu

Contents: Background Introduction, Related Work, Model and Algorithm, Results and Evaluation, Conclusion

Background Hadoop, a top-level Apache project and one of the most popular cloud computing frameworks, has been widely adopted for its distributed data storage, computing and searching features. Job scheduling is the core component of Hadoop: it divides a job into multiple tasks and then invokes a JobTracker service to assign the tasks to the corresponding TaskTracker nodes.

Background Distributing tasks as fast as possible cannot guarantee that the subsequent execution on each TaskTracker remains efficient [3], and may lead to so-called slow tasks, or stragglers. Speculative Execution (SE) is the mechanism currently used to recognize and correct inefficient allocations made by a JobTracker service, which improves the fault tolerance of Hadoop.

Related Work Because the Hadoop-naïve speculative execution strategy performs poorly in heterogeneous environments, many optimized SE algorithms have been proposed.
LATE: uses the remaining time as the speculative execution priority.
MCP: optimizes the SE strategy by maximizing the benefit of launching backup tasks.
ERUL: calculates the remaining time from the real-time system load, which improves the accuracy of the prediction.
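The remaining-time priority used by LATE can be sketched as follows. This is a minimal illustration of the commonly cited estimate, time left = (1 - progress) / progress rate, and not code from Hadoop or from this presentation; the function name and the sample task values are hypothetical.

```python
def late_time_left(progress: float, elapsed_s: float) -> float:
    """Estimate remaining time as (1 - progress) / progress_rate.

    progress  -- fraction of the task completed, in (0, 1]
    elapsed_s -- seconds the task has been running
    """
    if not 0.0 < progress <= 1.0:
        raise ValueError("progress must be in (0, 1]")
    if elapsed_s <= 0.0:
        raise ValueError("elapsed_s must be positive")
    progress_rate = progress / elapsed_s  # progress per second so far
    return (1.0 - progress) / progress_rate


# Hypothetical running tasks: name -> (progress, elapsed seconds)
tasks = {
    "task_a": (0.9, 90.0),
    "task_b": (0.2, 80.0),
    "task_c": (0.5, 60.0),
}

# The task with the longest estimated remaining time is the first
# candidate for a backup copy.
ranked = sorted(tasks, key=lambda t: late_time_left(*tasks[t]), reverse=True)
print(ranked)  # ['task_b', 'task_c', 'task_a']
```

The estimate assumes a constant progress rate, which is the assumption that a non-linear predictor is meant to relax.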

Model and Algorithm

Model and Algorithm (1) The Recognition of Straggler Candidates
The LWR method was implemented to calculate the remaining time of tasks. In the LWR model, X is the input matrix, Y is the output vector, and W is a diagonal weight-function matrix.
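The role of X, Y and W can be illustrated with a weighted least-squares fit. The sketch below assumes the standard closed form θ = (XᵀWX)⁻¹XᵀWY, which the transcript does not show; the function name and the sample data (task progress as the single feature, remaining seconds as the output) are hypothetical.

```python
import numpy as np


def weighted_least_squares(X: np.ndarray, Y: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Return theta minimising sum_i w_i * (Y_i - X_i . theta)^2.

    X -- (n, k) input matrix, one row per observation
    Y -- (n,) output vector
    w -- (n,) non-negative weights, placed on the diagonal of W
    """
    W = np.diag(w)
    # Solve the normal equations (X^T W X) theta = X^T W Y without
    # forming an explicit matrix inverse.
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ Y)


# Hypothetical observations: remaining time falls linearly with progress.
progress = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
X = np.column_stack([np.ones_like(progress), progress])  # bias column + feature
Y = 100.0 - 100.0 * progress
w = np.array([0.1, 0.2, 0.5, 1.0, 1.0])  # later samples count more

theta = weighted_least_squares(X, Y, w)
print(theta)
```

Because the sample data are exactly linear, any positive weights recover the same coefficients; the weights only change the fit when the relationship is non-linear, which is the case LWR targets.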

Model and Algorithm A Gaussian kernel function is used to calculate the weight function ω(d), where γ is the wavelength parameter, which is set to 0.08 in this paper.
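A hedged sketch of how a Gaussian kernel can drive a local prediction is given below. The kernel form ω(d) = exp(-d² / (2γ²)) is an assumption, since the transcript does not show the formula; only γ = 0.08 comes from the slide. The function names and sample data are hypothetical.

```python
import numpy as np

GAMMA = 0.08  # wavelength parameter quoted on the slide


def gaussian_weight(d, gamma: float = GAMMA):
    """Assumed kernel form: weight decays with squared distance d."""
    return np.exp(-np.square(d) / (2.0 * gamma ** 2))


def lwr_predict(x_query: float, xs: np.ndarray, ys: np.ndarray,
                gamma: float = GAMMA) -> float:
    """Fit a line around x_query, weighting nearby samples more heavily."""
    X = np.column_stack([np.ones_like(xs), xs])   # bias column + feature
    W = np.diag(gaussian_weight(xs - x_query, gamma))
    theta = np.linalg.solve(X.T @ W @ X, X.T @ W @ ys)
    return float(theta[0] + theta[1] * x_query)


# Hypothetical history of one task: progress vs. remaining seconds.
xs = np.linspace(0.1, 0.9, 9)
ys = 100.0 - 100.0 * xs

print(lwr_predict(0.95, xs, ys))
```

A small γ makes the fit depend almost entirely on samples close to the query point, so the model can follow a curve that a single global line would miss.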

Model and Algorithm (2) The Benefit Calculation of Replicating Stragglers

                      SE Enabled                      SE Disabled
Cluster Consumption   Two slots for t_backup          One slot for t_rem
Cluster Benefits      One slot for t_rem - t_backup

t_rem is the remaining time predicted by the LWR model, and t_avg is the average execution time of completed tasks. μ is introduced to avoid the influence of data skew in the input data.
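The slot accounting above can be sketched in a few lines. Several parts are assumptions rather than the authors' method: the backup time is estimated as μ · t_avg, the benefit with SE disabled is taken as zero, and the decision rule (net gain with SE must exceed net gain without it) is a hypothetical equal-weight comparison. All names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class SlotAccounting:
    consumption_se: float     # two slots held for t_backup
    consumption_no_se: float  # one slot held for t_rem
    benefit_se: float         # one slot freed for t_rem - t_backup


def account(t_rem: float, t_backup: float) -> SlotAccounting:
    """Translate the consumption/benefit table into slot-seconds."""
    return SlotAccounting(
        consumption_se=2.0 * t_backup,
        consumption_no_se=t_rem,
        benefit_se=t_rem - t_backup,
    )


def should_replicate(t_rem: float, t_avg: float, mu: float = 1.0) -> bool:
    """Hypothetical rule: replicate only if SE gives the better net gain."""
    t_backup = mu * t_avg  # assumption: skew-adjusted backup time
    a = account(t_rem, t_backup)
    net_with_se = a.benefit_se - a.consumption_se
    net_without_se = 0.0 - a.consumption_no_se  # assumption: zero benefit
    return net_with_se > net_without_se


print(should_replicate(t_rem=100.0, t_avg=40.0))          # True
print(should_replicate(t_rem=50.0, t_avg=40.0))           # False
print(should_replicate(t_rem=100.0, t_avg=40.0, mu=2.0))  # False
```

Under these assumptions the rule reduces to 2 · t_rem > 3 · t_backup, so a larger μ (heavier input skew) makes replication harder to justify.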

Model and Algorithm (3) The Selection of Backup Nodes
To enhance the performance of SE, we propose a new method to measure and assess potential backup nodes by dividing the nodes into two groups according to what they are good at, i.e. “Map-Fast” nodes and “Reduce-Fast” nodes. PR represents the processing rate of node candidates.
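One way to picture the grouping is sketched below. The transcript does not say how nodes are assigned to a group, so the rule used here (compare each node's map and reduce processing rates relative to the cluster mean) is purely an assumption, as are the function names and the sample PR values.

```python
from typing import Dict, FrozenSet, Optional


def group_nodes(map_pr: Dict[str, float],
                reduce_pr: Dict[str, float]) -> Dict[str, str]:
    """Label each node by the phase in which it is relatively faster."""
    map_mean = sum(map_pr.values()) / len(map_pr)
    reduce_mean = sum(reduce_pr.values()) / len(reduce_pr)
    groups = {}
    for node in map_pr:
        rel_map = map_pr[node] / map_mean
        rel_reduce = reduce_pr[node] / reduce_mean
        groups[node] = "Map-Fast" if rel_map >= rel_reduce else "Reduce-Fast"
    return groups


def pick_backup_node(task_type: str,
                     map_pr: Dict[str, float],
                     reduce_pr: Dict[str, float],
                     busy: FrozenSet[str] = frozenset()) -> Optional[str]:
    """Pick the free node with the highest PR from the matching group."""
    wanted = "Map-Fast" if task_type == "map" else "Reduce-Fast"
    pr = map_pr if task_type == "map" else reduce_pr
    groups = group_nodes(map_pr, reduce_pr)
    candidates = [n for n, g in groups.items() if g == wanted and n not in busy]
    return max(candidates, key=pr.get) if candidates else None


# Hypothetical processing rates per node.
map_pr = {"n1": 8.0, "n2": 4.0, "n3": 6.0}
reduce_pr = {"n1": 3.0, "n2": 5.0, "n3": 4.0}

print(group_nodes(map_pr, reduce_pr))
print(pick_backup_node("map", map_pr, reduce_pr))
print(pick_backup_node("reduce", map_pr, reduce_pr))
```

Returning None when the matching group has no free node leaves the fallback policy to the caller, which keeps the sketch neutral about what the scheduler does in that case.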

Results & Evaluation Details of the experimental environment: NodeID Memory(GB) Core Processors Node 1 10 8 Node 2 4 Node 3 1 Node 4 Node 5 Node 6 Node 7 18 Node 8 12

Results & Evaluation Job execution time and cluster throughput of different SE strategies on Wordcount jobs in a normal load scenario

Results & Evaluation Job execution time and cluster throughput of different SE strategies on Wordcount jobs in a busy load scenario with data skew

Conclusion LWR-SE was proposed, inspired by the non-linear relationship between job execution time and progress. The experimental results show that LWR-SE outperforms MCP, LATE and Hadoop-None in three different heterogeneous scenarios designed with either normal or busy workloads.

Thank you!
