Heron: a stream data processing engine


1 Heron: a stream data processing engine
Shreya

2 Real Time Streaming by Storm
Storm used to be the main platform for providing real-time analytics at Twitter. A Storm topology is a directed graph of spouts and bolts. Spouts are sources of input data/tuples (e.g. a stream of tweets); bolts are the abstraction representing computation on the stream, such as real-time active user counts (RTAC). Spouts and bolts run as tasks. A minimal topology in the Storm API is sketched below.
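To make the spout/bolt abstraction concrete, here is a minimal sketch against the Storm Java API (which Heron is also compatible with). The org.apache.storm package names, the spout/bolt classes, the field names, and the parallelism numbers are illustrative assumptions, not Twitter's actual RTAC code.

```java
import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class ExampleTopology {

  // Spout: source of tuples (here a fake stream of user ids).
  public static class UserEventSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    public void open(Map conf, TopologyContext ctx, SpoutOutputCollector c) { collector = c; }
    public void nextTuple() { collector.emit(new Values("user-" + (System.nanoTime() % 100))); }
    public void declareOutputFields(OutputFieldsDeclarer d) { d.declare(new Fields("userId")); }
  }

  // Bolt: computation over the stream (here a trivial per-task counter).
  public static class CountBolt extends BaseRichBolt {
    private long count;
    public void prepare(Map conf, TopologyContext ctx, OutputCollector c) { count = 0; }
    public void execute(Tuple t) { count++; }
    public void declareOutputFields(OutputFieldsDeclarer d) { }
  }

  public static void main(String[] args) throws Exception {
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("events", new UserEventSpout(), 2);         // spout tasks
    builder.setBolt("counts", new CountBolt(), 4)                // bolt tasks
           .fieldsGrouping("events", new Fields("userId"));      // partition the stream by user id
    StormSubmitter.submitTopology("example", new Config(), builder.createTopology());
  }
}
```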

3 Storm Worker
Multiple tasks are grouped into an executor, and multiple executors are grouped into a worker. Each worker runs as a JVM process, and worker processes are scheduled by the OS. Within a worker, each executor is mapped to two threads, which the JVM schedules with a preemptive, priority-based algorithm; each executor in turn runs its own scheduler over its tasks. Stacking schedulers this way (OS, JVM, executor) makes scheduling a complex problem. The sketch below shows how this hierarchy is expressed when building a topology.
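A small sketch of how the task/executor/worker hierarchy is configured through the Storm-compatible API, reusing the illustrative spout and bolt classes from the previous sketch; the specific counts are arbitrary assumptions.

```java
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

public class WorkerLayoutExample {
  public static void main(String[] args) throws Exception {
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("events", new ExampleTopology.UserEventSpout(), 2);  // 2 executors
    builder.setBolt("counts", new ExampleTopology.CountBolt(), 4)         // 4 executors...
           .setNumTasks(8)                                                // ...running 8 tasks (2 per executor)
           .fieldsGrouping("events", new Fields("userId"));

    Config conf = new Config();
    conf.setNumWorkers(2);  // all executors are packed into 2 JVM worker processes
    StormSubmitter.submitTopology("worker-layout", conf, builder.createTopology());
  }
}
```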

4 Limitations of Storm
As the scale and diversity of the data increased, Storm's limitations became apparent. The many layers of scheduling make the problem much more complex and performance much harder to predict. Since each worker can run a mix of tasks, spouts and bolts from different sources may end up depending on the same resources: there is no resource isolation.

5 Limitations of Storm
Debugging is difficult: when a topology is restarted, the erroneous task may be scheduled alongside different tasks, making it hard to find, and if an entire worker process is killed, the other tasks running in it are hurt as well. Because cluster resources have to be shared, hurting one topology can hurt others.
Storm's resource allocation method can also lead to over-provisioning, and it gets more complicated as more types of components are packed into a worker. In the paper's example, 3 spouts (5 GB each) and 1 bolt (10 GB) are scheduled on 2 workers: one worker needs 10 + 5 = 15 GB (the bolt plus a spout) and the other only 10 GB (two spouts), but since every worker is provisioned for the maximum, 30 GB is reserved instead of the 25 GB actually required (see the sketch after this slide).
What was needed was a cleaner mapping from the logical units of computation to each physical process. Such a clean mapping is crucial for debuggability when responding to pager alerts for a failing topology, especially one that is critical to the underlying business model.

6 Limitations of Storm
As the number of bolts/tasks in a worker increases, each worker tends to be connected to many other workers, and there are not enough ports in each worker for all of this communication: the design is not scalable. Because multiple components of the topology are bundled into one OS process, debugging is difficult. A new engine was needed that could provide better scalability, sharing of cluster resources, and debuggability.

7 Storm Nimbus Limitations
Nimbus schedules and monitors topologies and distributes JARs; it also manages counters for several topologies. All of these tasks make Nimbus the bottleneck. It does not support resource isolation at a granular level, so workers of different topologies on the same machine can interfere with each other. The workaround, running entire topologies on dedicated machines, leads to wasted resources. Nimbus uses Zookeeper to manage heartbeats from workers, which becomes a bottleneck for large numbers of topologies. Finally, when Nimbus fails, users cannot modify topologies, and failures cannot be automatically detected.

8 Efficiency
Reduced performance was often caused by:
Suboptimal replays: a tuple failure anywhere leads to failure of the whole tuple tree, which must be replayed.
Long garbage collection cycles: high RAM usage and long GC cycles result in high latencies and high tuple failure rates.
Queue contention: there is contention for the transfer queues.
Storm dealt with these issues by over-provisioning.

9 Heron
Storm's issues were fundamental enough that a new engine, Heron, was the answer; its API is compatible with Storm, so migration is easy. Heron runs topologies: directed acyclic graphs of spouts and bolts. The programmer specifies the number of tasks for each spout and bolt and how data is partitioned across the spouts. Tuple processing follows one of two semantics:
At most once: no tuple is processed more than once.
At least once: each tuple is guaranteed to be processed at least once.
A sketch of how these semantics map onto the Storm-compatible API follows.
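A minimal sketch, assuming the Storm-compatible Java API, of how the two semantics are expressed: anchoring emitted tuples to their input and acking them gives at-least-once, while disabling acker tasks gives at-most-once. The class and field names here are illustrative, not from the paper.

```java
import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class AtLeastOnceBolt extends BaseRichBolt {
  private OutputCollector collector;

  public void prepare(Map conf, TopologyContext ctx, OutputCollector c) { collector = c; }

  public void execute(Tuple input) {
    // Emit anchored to the input tuple so a downstream failure causes a replay from the spout.
    collector.emit(input, new Values(input.getStringByField("userId")));
    collector.ack(input);   // acknowledge successful processing (at-least-once)
  }

  public void declareOutputFields(OutputFieldsDeclarer d) { d.declare(new Fields("userId")); }

  // At-most-once instead: no acker tasks, so failed tuples are simply dropped, never replayed.
  static Config atMostOnceConfig() {
    Config conf = new Config();
    conf.setNumAckers(0);
    return conf;
  }
}
```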

10 Architecture
Topologies are deployed to the Aurora scheduler using the Heron CLI tool. Aurora is a generic service scheduler that runs as a framework on top of Mesos; because the scheduler sits behind an abstraction, it can be replaced by a different one. Each topology runs as an Aurora job consisting of several containers.

11 Architecture
The first container runs the Topology Master. Each of the remaining containers runs a Stream Manager, a Metrics Manager, and a number of Heron Instances (each one a JVM). The containers are scheduled by Aurora, and the processes communicate with each other using protocol buffers.

12 Topology Master
The Topology Master (TM) is responsible for managing the topology and serves as the point of contact for discovering its status. A topology can have only one TM. The TM also provides an endpoint for topology metrics.

13 Stream Manager
The Stream Manager (SM) manages the routing of tuples efficiently. Heron Instances connect to their local SM to send and receive tuples; with k containers/SMs there are O(k^2) SM-to-SM connections.
Backpressure dynamically adjusts the rate at which data flows: if an upstream component is too fast, large queues build up downstream. Three mechanisms are considered:
TCP backpressure: uses the TCP windowing mechanism. This works because the production/consumption rates map onto the send/receive rates of the socket buffers, so they are a good metric. However, a slowdown propagates upstream only gradually, so getting out of a slow state is a really slow process.
Spout backpressure: clamps the upstream components (the local spouts) when Heron Instances are slowing down. Sending the start and stop backpressure messages is overhead, but it allows traceability, since you can find out which component triggered the backpressure. This is the mechanism Heron implements.
Stage-by-stage backpressure: propagates backpressure stage by stage until it reaches the spouts.
Backpressure is entered and exited using a high and a low water mark on buffer sizes, which prevents rapid oscillation into and out of backpressure (see the sketch below).
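A minimal sketch of the high/low water mark rule, not Heron's actual code: backpressure starts when a buffer crosses the high water mark and is only released once the buffer drains below the low water mark.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class WatermarkBuffer {
  private final Queue<byte[]> buffer = new ArrayDeque<>();
  private final int highWatermark;
  private final int lowWatermark;
  private boolean underBackpressure = false;

  public WatermarkBuffer(int highWatermark, int lowWatermark) {
    this.highWatermark = highWatermark;
    this.lowWatermark = lowWatermark;
  }

  public synchronized void offer(byte[] tuple) {
    buffer.add(tuple);
    if (!underBackpressure && buffer.size() > highWatermark) {
      underBackpressure = true;
      sendStartBackpressure();   // placeholder: tell the other SMs to clamp their local spouts
    }
  }

  public synchronized byte[] poll() {
    byte[] tuple = buffer.poll();
    if (underBackpressure && buffer.size() < lowWatermark) {
      underBackpressure = false;
      sendStopBackpressure();    // placeholder: resume normal spout emission
    }
    return tuple;
  }

  private void sendStartBackpressure() { /* broadcast a start-backpressure message */ }
  private void sendStopBackpressure()  { /* broadcast a stop-backpressure message */ }
}
```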

14 Heron Instance
Each Heron Instance (HI) is a JVM process.
Single-threaded approach: one thread maintains the TCP communication to the local SM, waits for tuples, and invokes the user logic code on them; output is buffered until a threshold is met and then delivered to the SM. The problem is that the user code can block, for example on reads/writes or by calling the sleep system call.
Two-threaded approach: a Gateway thread and a Task Execution thread, described on the next slide.

15 Heron Instance
Gateway thread: controls communication and data movement in and out; it maintains the TCP connections to the local SM and the Metrics Manager.
Task Execution thread: receives tuples and runs the user code, sending its output to the Gateway thread.
GC can be triggered before the Gateway thread sends out the buffered tuples, so the queue sizes are checked (and adjusted) periodically. When the data-in queue exceeds its bound, the backpressure mechanism is triggered. The TM writes the physical plan to Zookeeper so that state can be rediscovered in case of failure. A sketch of the two-thread structure follows.

16 Start-up Sequence
The topology is submitted to Heron. The scheduler allocates the necessary resources and schedules the containers on several machines. The TM comes up on the first container and makes itself discoverable using Zookeeper. The SM on each container consults Zookeeper to discover the TM. Once all connections are set up, the TM runs the algorithm that assigns the different components to containers, and execution begins.

17 Failure Scenarios
Failures include deaths of processes, failures of containers, and failures of machines. If the TM process dies, its container restarts it and the TM recovers its state from Zookeeper; meanwhile the standby TM becomes the master, and the restarted master becomes the standby. If an SM dies, it rediscovers the TM upon restart. If an HI dies, it is restarted and gets a copy of the physical plan from its SM.

18 Heron Tracker/Heron UI
The Heron Tracker saves metadata and collects information on topologies, and it provides an API for building further tools. The Heron UI gives a visual representation of topologies: you can view the acyclic graph and see statistics on specific components. Heron Viz automatically creates a dashboard that provides health and resource monitoring. Overall, Heron has resulted in a 3x reduction in hardware at Twitter.

19 Evaluation
Heron was tested against Storm on a Word Count topology and an RTAC topology, considering both at-least-once and at-most-once semantics. Experiments ran on machines with 12 cores, 72 GB of RAM, and 500 GB of disk, under the assumption of no out-of-memory crashes or repetitive GC cycles.

20 Word Count Topology
In this topology the spouts can produce tuples quickly. Storm's convoluted queue structure led to more overhead, while Heron demonstrated better throughput and efficiency; the results were similar with acknowledgments disabled and for the RTAC topologies. The increase in latency at larger container counts is a result of the SMs becoming a bottleneck, as they need to maintain buffers for each connection to the other SMs, and it takes more time to consume data from more buffers. The tuples also live in these buffers for a longer time given a constant input rate (only one spout instance per container).

21 Summary
Resource provisioning is abstracted out to the cluster manager, allowing isolation. A single Heron Instance runs only one spout or bolt, allowing better debugging. The backpressure mechanism makes it clear which component is falling behind. Heron is efficient because resources are allocated at the component level. A per-topology TM allows each topology to be managed independently, so failures of one do not interfere with another.

