R-Storm: Resource Aware Scheduling in Storm


1 R-Storm: Resource Aware Scheduling in Storm
Boyang Peng Mohammad Hosseini Zhihao Hong Reza Farivar Roy Campbell

2 Introduction
Storm is an open-source distributed real-time data stream processing system
Real-time analytics
Online machine learning
Continuous computation

3 R-Storm: Resource Aware Scheduling in Storm
A scheduler that takes into account resource availability and resource requirements
Satisfies resource requirements
Improves throughput by maximizing resource utilization and minimizing network latency

4 Storm Overview
Reference: storm.apache.org

5 Storm Topology
Reference: storm.apache.org

6 Definitions of Storm Terms
Tuples - The basic unit of data that is processed.
Stream - An unbounded sequence of tuples.
Component - A processing operator in a Storm topology, either a Bolt or a Spout.
Tasks - An instantiation of a Spout or Bolt.
Executors - A thread spawned in a worker process that may execute one or more tasks.
Worker Process - A process spawned by Storm that may run one or more executors.
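As a rough illustration of how these terms map onto the topology API, here is a sketch (not code from the talk; package and class names follow recent Apache Storm releases and may differ by version):

```java
import org.apache.storm.Config;
import org.apache.storm.testing.TestWordCounter;
import org.apache.storm.testing.TestWordSpout;
import org.apache.storm.topology.TopologyBuilder;

// Sketch relating the terms above to the Storm topology API:
// the parallelism hint sets the number of executors (threads),
// setNumTasks sets the number of tasks (spout/bolt instances),
// and setNumWorkers sets the number of worker processes.
public class TermsExample {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();

        // A spout component run by 2 executors (2 tasks by default).
        builder.setSpout("words", new TestWordSpout(), 2);

        // A bolt component run by 4 executors sharing 8 tasks,
        // consuming the spout's stream of tuples.
        builder.setBolt("count", new TestWordCounter(), 4)
               .setNumTasks(8)
               .shuffleGrouping("words");

        Config conf = new Config();
        conf.setNumWorkers(2);  // two worker processes host the executors
        // builder.createTopology() would then be submitted with this conf.
    }
}
```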

7 Logical connections in a Storm Topology

8 Storm topology physical connections

9 Motivation
Naïve round-robin scheduler
Naïve load limiter (worker slots)
Resources that don't have graceful performance degradation: memory

10 R-Storm: Resource Aware Scheduling in Storm
Micro-benchmarks: 30-47% higher throughput
Yahoo! Storm applications: R-Storm outperforms default Storm by around 50% in overall throughput

11 Problem Formulation
Targeting 3 types of resources: CPU, memory, and network bandwidth
Limited resource budget for each node
Specific resource needs for each task
Goal: improve throughput by maximizing utilization and minimizing network latency

12 Problem Formulation
Set of all tasks Ƭ = {τ1, τ2, τ3, …}; each task τi has resource demands:
CPU requirement cτi
Network bandwidth requirement bτi
Memory requirement mτi
Set of all nodes N = {θ1, θ2, θ3, …}, each with:
Total available CPU budget W1
Total available bandwidth budget W2
Total available memory budget W3

13 Problem Formulation
Qi : throughput contribution of node θi
Assign tasks to a subset of nodes N′ ⊆ N that minimizes the total resource waste:
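The equation for the waste term did not survive extraction. As a hedged reconstruction, if the waste on a node is taken to be its unused CPU, bandwidth, and memory budget, the objective can be sketched as follows, where A(θj) denotes the set of tasks assigned to node θj (this notation is mine, not the slide's):

```latex
\min_{N' \subseteq N} \; \sum_{\theta_j \in N'}
  \Bigl[ \bigl(W_1 - \sum_{\tau_i \in A(\theta_j)} c_{\tau_i}\bigr)
       + \bigl(W_2 - \sum_{\tau_i \in A(\theta_j)} b_{\tau_i}\bigr)
       + \bigl(W_3 - \sum_{\tau_i \in A(\theta_j)} m_{\tau_i}\bigr) \Bigr]
```

subject to none of the per-node budgets being exceeded. The throughput contributions Qi also enter the full formulation in the paper, which this sketch omits.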

14 Problem Formulation
⇒ A Quadratic Multiple 3-Dimensional Knapsack Problem; we call it QM3DKP!
NP-hard: computing optimal or even approximate solutions can be hard and time consuming
Real-time systems need fast scheduling
Scheduling must be re-computed when failures occur

15 Soft Constraints vs. Hard Constraints
Soft constraints: CPU and network resources; graceful performance degradation under oversubscription
Hard constraints: memory; oversubscribe and it's game over

16 Observations on Network Latency
1. Inter-rack communication is the slowest
2. Inter-node communication is slow
3. Inter-process communication is faster
4. Intra-process communication is the fastest

17 Heuristic Algorithm
Greedy approach with trivial overhead
Design a 3D resource space: each resource maps to an axis (generalizes to an nD resource space)
For each task, described by a vector of its resource requirements, find the node whose resource-availability vector is closest
Based on min(Euclidean distance) while not violating hard constraints; see the sketch below
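A minimal sketch of that greedy step, assuming a 3D vector per task and per node; the names here are illustrative and not the actual R-Storm code:

```java
import java.util.Arrays;
import java.util.List;

// Greedy selection step: pick the node whose availability vector
// (cpu, bandwidth, memory) is closest in Euclidean distance to the task's
// requirement vector, while never violating the hard memory constraint.
public class GreedyPlacementSketch {

    static class ResourceVector {
        final double cpu, bandwidth, memory;
        ResourceVector(double cpu, double bandwidth, double memory) {
            this.cpu = cpu; this.bandwidth = bandwidth; this.memory = memory;
        }
        double distanceTo(ResourceVector o) {
            double dc = cpu - o.cpu, db = bandwidth - o.bandwidth, dm = memory - o.memory;
            return Math.sqrt(dc * dc + db * db + dm * dm);
        }
    }

    /** Returns the index of the closest feasible node, or -1 if none fits. */
    static int pickNode(ResourceVector need, List<ResourceVector> available) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < available.size(); i++) {
            ResourceVector node = available.get(i);
            if (node.memory < need.memory) continue;  // hard constraint: memory
            double d = node.distanceTo(need);
            if (d < bestDist) { bestDist = d; best = i; }
        }
        return best;
    }

    public static void main(String[] args) {
        List<ResourceVector> nodes = Arrays.asList(
            new ResourceVector(100, 100, 2048),   // node 0
            new ResourceVector(40, 80, 512));     // node 1
        ResourceVector task = new ResourceVector(30, 70, 400);
        System.out.println("Best node: " + pickNode(task, nodes));  // prints 1
    }
}
```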

18 Heuristic Algorithm

19 Heuristic Algorithm
(figure: six worker nodes connected through a switch)

20 Our proposed heuristic algorithm has the following properties:
Tasks of components that communicate with each other have the highest priority to be scheduled in close network proximity to each other.
No hard resource constraint is violated.
Resource waste on nodes is minimized.
Based on min(Euclidean distance) while not violating hard constraints

21 Evaluation
Micro-benchmarks: Linear Topology, Diamond Topology, Star Topology
Industry-inspired benchmarks: PageLoad Topology, Processing Topology
Network-bound versus computation-bound workloads

22 Evaluation Setup
Used Emulab.net as the testbed, emulating inter-rack latency across two racks
1 host for Nimbus + ZooKeeper
12 hosts as worker nodes
All hosts: Ubuntu LTS, 1-core Intel CPU, 2 GB RAM, 100 Mb NIC
(diagram: worker nodes 1-6 under switch V0, nodes 7-12 under switch V1)

23 Micro-benchmarks
Linear Topology, Diamond Topology, Star Topology

24 Network Bound vs CPU Bound
Network bound: the network is the bottleneck; a "speed of light" test
Computation bound (CPU bound): arbitrary computation (finding prime numbers)

25 Network-bound Micro-benchmark
50% improvement

26 Network-bound Micro-benchmark Topologies
30% improvement

27 Network-bound Micro-benchmark Topologies
47% improvement

28 Computation Bound Micro-benchmark
These experiments are not an entirely fair comparison, but they illustrate an important point: more is not always better. Giving Storm more resources does not make it run any faster.

29 CPU Utilization

30 Lessons Learned for Computation-Bound Workloads
More nodes might not be better while parallelism stays constant; you need the right amount.
Having more nodes and spreading executors unnecessarily far apart increases network distance and latency.

31 Yahoo Sample Topologies
Topology layouts provided by Yahoo!, implemented in abstract form for our evaluation.
These two topologies are used by Yahoo! for processing event-level data from their advertising platforms to enable near real-time analytical reporting.

32 PageLoad Topology
50% improvement

33 Processing Topology
47% improvement

34 Throughput comparison of running multiple topologies

35 Multiple Topologies Results
PageLoad topology: R-Storm 25496 tuples/10 sec vs. default Storm 16695 tuples/10 sec; R-Storm is around 53% higher
Processing topology: R-Storm 67115 tuples/10 sec vs. default Storm 10 tuples/sec; orders of magnitude higher
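As a quick arithmetic check on the PageLoad figure: 25496 / 16695 ≈ 1.53, consistent with the roughly 53% improvement quoted.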

36 Current Status
Merged into the Apache Storm 0.11 release
Starting to be used at Yahoo! Inc.
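With the merge, per-component resource requirements can be declared directly on a topology. A minimal sketch of that usage follows; method and package names reflect the resource-aware scheduler API in recent Apache Storm releases and may differ by version:

```java
import org.apache.storm.testing.TestWordCounter;
import org.apache.storm.testing.TestWordSpout;
import org.apache.storm.topology.TopologyBuilder;

// Sketch: declaring CPU and memory requirements per component so the
// resource-aware scheduler can place tasks against node budgets.
public class ResourceHintsExample {
    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();

        builder.setSpout("words", new TestWordSpout(), 2)
               .setCPULoad(25.0)       // roughly 25% of one core per executor
               .setMemoryLoad(256.0);  // 256 MB on-heap memory per executor

        builder.setBolt("count", new TestWordCounter(), 4)
               .shuffleGrouping("words")
               .setCPULoad(50.0)
               .setMemoryLoad(512.0);

        // builder.createTopology() is then submitted as usual; the scheduler
        // uses these hints when assigning executors to worker nodes.
    }
}
```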

37 Future Work
Experiments to quantify latency
Multi-tenancy: topology priorities and per-user resource guarantees
Elasticity: dynamically adjusting parallelism to meet SLAs

38 Questions?

