Evolution of messaging systems and event driven architecture

Slides:



Advertisements
Similar presentations
An Erlang Implementation of Restms. Why have messaging? Separates applications cheaply Feed information to the right applications cheaply Interpret feed.
Advertisements

COM vs. CORBA.
Database Architectures and the Web
Message Queues COMP3017 Advanced Databases Dr Nicholas Gibbins –
Kafka high-throughput, persistent, multi-reader streams
Felix Ehm CERN BE-CO. Content  Introduction  JMS in the Controls System  Deployment and Operation  Conclusion.
Distributed Processing, Client/Server, and Clusters
Big Data Open Source Software and Projects ABDS in Summary XVI: Layer 13 Part 1 Data Science Curriculum March Geoffrey Fox
EEC-681/781 Distributed Computing Systems Lecture 3 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Module 14: Scalability and High Availability. Overview Key high availability features available in Oracle and SQL Server Key scalability features available.
Service Broker Lesson 11. Skills Matrix Service Broker Service Broker, provides a solution to common problems with message delivery and consistency that.
Understanding and Managing WebSphere V5
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
© 2011 Cisco All rights reserved.Cisco Confidential 1 APP server Client library Memory (Managed Cache) Memory (Managed Cache) Queue to disk Disk NIC Replication.
Client Server Technologies Middleware Technologies Ganesh Panchanathan Alex Verstak.
Microsoft Visual Studio 2010 Muhammad Zubair MS (FAST-NU) Experience: 5+ Years Contact:- Cell#:
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved Chapter 4 Communication.
1 Moshe Shadmon ScaleDB Scaling MySQL in the Cloud.
(Business) Process Centric Exchanges
Distributed Computing Systems CSCI 4780/6780. Geographical Scalability Challenges Synchronous communication –Waiting for a reply does not scale well!!
Fast Crash Recovery in RAMCloud. Motivation The role of DRAM has been increasing – Facebook used 150TB of DRAM For 200TB of disk storage However, there.
Messaging. Message Type Patterns Command Invoke a procedure in another application SOAP request is an example Document Message Single unit of information,
GFS. Google r Servers are a mix of commodity machines and machines specifically designed for Google m Not necessarily the fastest m Purchases are based.
Geo-distributed Messaging with RabbitMQ
HADOOP DISTRIBUTED FILE SYSTEM HDFS Reliability Based on “The Hadoop Distributed File System” K. Shvachko et al., MSST 2010 Michael Tsitrin 26/05/13.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks MSG - A messaging system for efficient and.
AMQP, Message Broker Babu Ram Dawadi. overview Why MOM architecture? Messaging broker like RabbitMQ in brief RabbitMQ AMQP – What is it ?
Spring RabbitMQ Martin Toshev.
Making Sense of Service Broker Inside the Black Box.
Cofax Scalability Document Version Scaling Cofax in General The scalability of Cofax is directly related to the system software, hardware and network.
Apache Kafka A distributed publish-subscribe messaging system
Amazon Web Services. Amazon Web Services (AWS) - robust, scalable and affordable infrastructure for cloud computing. This session is about:
CMSC 491 Hadoop-Based Distributed Computing Spring 2016 Adam Shook
SysPlex -What’s the problem Problems are growing faster than uni-processor….1980’s Leads to SMP and loosely coupled Even faster than SMP and loosely coupled.
Robert Metzger, Aljoscha Connecting Apache Flink® to the World: Reviewing the streaming connectors.
Pilot Kafka Service Manuel Martín Márquez. Pilot Kafka Service Manuel Martín Márquez.
Messaging in Distributed Systems
Course: Cluster, grid and cloud computing systems Course author: Prof
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung
Kafka & Samza Weize Sun.
Replicated LevelDB on JBoss Fuse
Amazon Storage- S3 and Glacier
Open Source distributed document DB for an enterprise
The Client/Server Database Environment
Introduction to NewSQL
A Messaging Infrastructure for WLCG
Software Engineering Introduction to Apache Hadoop Map Reduce
Storage Virtualization
#01 Client/Server Computing
Replication Middleware for Cloud Based Storage Service
Software Architecture
Ewen Cheslack-Postava
Harjutus 3: Aünkroonne hajussüsteemi päring
Architectures of distributed systems Fundamental Models
AWS Cloud Computing Masaki.
Software models - Software Architecture Design Patterns
Architectures of distributed systems Fundamental Models
Web Application Server 2001/3/27 Kang, Seungwoo. Web Application Server A class of middleware Speeding application development Strategic platform for.
Message Queuing.
Enterprise Integration
Architectures of distributed systems
J2EE Lecture 13: JMS and WebSocket
THE GOOGLE FILE SYSTEM.
Architectures of distributed systems Fundamental Models
Standards, APIs, and Applications
DBOS DecisionBrain Optimization Server
Caching 50.5* + Apache Kafka
#01 Client/Server Computing
Demo for Partners and Customers
Presentation transcript:

Evolution of messaging systems and event driven architecture Suresh Pandey Template version: 11/20/2012, for PowerPoint 2007 & 2010

Advanced Messaging Queuing Protocol RabbitMQ architecture Topics to Cover Messaging Overview Java Messaging System Advanced Messaging Queuing Protocol RabbitMQ architecture Evolution of Kafka Kafka Architecture and Design Scalability and Performance Kinesis Overview Kinesis vs. Kafka Sources

Messaging overview Loan Processor Credit Policy Underwriting Funding Asynchronous mode of communication. Sender & Receiver interact through message broker. Messages in queue stored until the recipient retrieves them. Implicit or Explicit Limits on size of data transmitted in a single message and the number of messages that may remain outstanding on the queue. Messaging overview Point to point system without messaging Loan Processor Credit Policy Underwriting Funding

Loan Processor sub-prime Loan Processor Sub-Prime Loan Processor prime Loan Processor sub-prime Adding new system requires p2p connection with all existing systems Credit Policy Underwriting Funding Error prone and difficult to integrate with new systems Loan Processor Prime Loan Processor Sub-Prime Easier to integrate new systems by using BUS architecture Less connection to be maintained by each application Credit Policy Underwriting Funding Asynchronous processing

Java Messaging System (JMS) Java Message Producer JMS API Java Messaging System (JMS) OpenWire ActiveMQ OpenWire Can easily replace ActiveMQ with any other JMS compliant broker (IBM mq,HornetQ from JBoss) with little or no changes. Instead of OpenWire, you would than be using the native protocol for HornetQ or IBM mq. Within the java platform the protocol used by messaging broker doesn’t matter Java Message Service (JMS) is an application program interface (API) specs from Sun Microsystems Java Message Consumer JMS API

Java Messaging System (JMS) Java Message Producer JMS API ActiveMQ Java Message Consumer OpenWire Java Messaging System (JMS) A limitation of JMS is that the APIs are specified, but the message format is not JMS has no requirement for how messages are formed and transmitted. Essentially, every JMS broker can implement the messages in a different format. They just have to use the same API Java Message Service (JMS) is an application program interface (API) specs from Sun Microsystems

Java Messaging System (JMS) Java Message Producer JMS API Java Messaging System (JMS) OpenWire ActiveMQ STOMP Issues: First, both protocol may not support the same message body types Second, you would essentially be locked into one specific vendor solution Finally, both protocol may not support the same data types, custom properties and header properties While cross-platform interoperability is certainly possible, it is restrictive, limited and forces vendor lock- in Ruby can’t use JMS , you need a message broker that can bridge the two platforms and transform the protocol and message structure used by each platform Supports both STOMP and JMS simultaneously ActiveMQ contains built-in message bridge for JMS to STOMP and vice versa conversion Ruby Message Consumer STOMP CLIENT

Advanced Message Queuing Protocol (AMQP) Java Message Producer RabbitMQ Client Advanced Message Queuing Protocol (AMQP) AMQP AMQP Broker RabbitMQ AMQP Solve the problem of messaging interoperability between heterogeneous platforms by defining standard binary level protocol. By defining a wire-level protocol standard for messaging , AMQP creates a message-based interoperability model that is completely client API or server (message broker) agnostic As long as you are using AMQP you can connect and send messages to any AMQP message broker using any AMQP client Ruby Message Consumer Qpid client

Single Point of Failure Traditional architecture Single Point of Failure Server2 Server1 Server4 Server3 Consumer 1 Consumer 2 LOAD BALANCER PRODUCER Highly Available Scalable Performance

RabbitMQ ▸ Widely deployed open source message broker based on AMQP AMQP Producer ▸ Widely deployed open source message broker based on AMQP ▸ Distributed broker, provide cluster for load balancing. ▸ High Availability(HA) mode by mirroring queues. ▸ Supports fully transactional communication between clients and brokers using acknowledgements. ▸ Great support in Spring framework. N1 N2 N3 AMQP Consumer Widely deployed open source message broker based on AMQP. ▸ Distributed broker, provide cluster for load balancing. ▸ High Availability(HA) mode by mirroring queues. ▸ Supports fully transactional communication between clients and brokers using acknowledgements. ▸ Great support in Spring framework. ▸ focused on consistent delivery of messages to consumers

RabbitMQ Architecture Smart broker / dumb consumer model 1 Publishers send messages to exchanges, and consumers retrieve messages from queues 2 RabbitMQ cluster distributes queues for the Read/Write load distribution. 3 High Availability(HA) mode by mirroring queues. 4

Single point of failure Highly Available Scalable Performance Q1 Master Q2 Mirrored Q3 Mirrored LB EXCHANGE Node 1 Producer Consumer Node 2 Q2 Master Q3 Mirrored Q1 Mirrored EXCHANGE Node 3 Q3 Master Q1 Mirrored Q2 Mirrored EXCHANGE 12

Kafka Producer: Application that sends the messages. Consumer: Application that receives the messages. Topic: A Topic is a category/feed name to which messages are stored and published. Topic partition: Kafka topics are divided into a number of partitions. Partitions allow you to parallelize a topic by splitting the data in a Kafka particular topic across multiple brokers. Replicas A replica of a partition is a "backup" of a partition. Replicas never read or write data. They are used to prevent data loss. Consumer Group: A consumer group includes the set of consumer processes that are subscribing to a specific topic. Offset: The offset is a unique identifier of a record within a partition. It denotes the position of the consumer in the partition. Cluster: A cluster is a group of nodes i.e., a group of computers.

Kafka employs a dumb broker smart consumers model. Kafka broker does not attempt to track which messages were read by each consumer. Kafka retains all messages for a set amount of time, and consumers are responsible to track their location in each log 

Kafka Message Log WRITES Older Newer Partition 0 1 2 3 4 5 6 7 8 9 10 For each topic, the Kafka cluster maintains a partitioned log Partition 0 1 2 3 4 5 6 7 8 9 10 11 12 Partition 1 1 2 3 4 5 6 7 8 9 Partition 2 1 2 3 4 5 6 7 8 9 10 11 WRITES Partition 3 1 2 3 4 5 6 7 8 Older Newer

Consumer Groups 16 Topic A, Partition 0 Topic A, Partition 1 Producer 1 Producer 2 Topic A, Partition 0 Topic A, Partition 1 Topic B, Partition 0 Topic A, Partition 2 Topic B, Partition 1 Topic B, Partition 2 Consumer : Topic A Consumer Group X Consumer : Topic B Consumer Group Y Node 1 Node 2 Consumer Groups 16

Rebalancing Consumer Groups Producer 1 Producer 2 Topic A, Partition 0 Topic A, Partition 1 Topic B, Partition 0 Topic A, Partition 2 Topic B, Partition 1 Topic B, Partition 2 Consumer : Topic A Consumer Group X Consumer : Topic B Consumer Group Y Node 2 Node 1 Rebalancing Consumer Groups 17

Rebalancing Consumer Groups Consumer Group Z Consumer : Topic A Node 1 Topic A, Partition 0 Topic A, Partition 1 Consumer : Topic A Rebalancing Consumer Groups Topic B, Partition 1 Consumer Group X Producer 1 Consumer : Topic A Node 2 Topic B, Partition 0 Consumer : Topic B Producer 2 Topic A, Partition 2 Topic B, Partition 2 Consumer Group Y Consumer : Topic B Consumer : Topic B 18

Managing Kafka Cluster   Managing Apache Kafka Monitoring Apache Kafka Brokers failures Monitoring Disk, CPU and Memory) Monitoring partition throughput Migrating Apache Kafka partitions to new nodes to increase throughput Upgrading Apache Kafka version Failing over to a different cluster in a different data-center or availability-zone Multi-AZ deployment Managing Zookeeper Monitoring Apache Zookeeper node failures Monitoring Disk, CPU and Memory Apache Zookeeper JVM Tuning Scaling Zookeeper nodes to increase CPU, Memory and Disk resources Upgrading Zookeeper version Multi-AZ deployment Managing Kafka Cluster

Kinesis Kafka Topic Partition Broker Kafka Producer Kafka Consumer Offset Replication Kinesis Stream Shard N/A Kinesis Producer Kinesis Consumer Sequence Number Not required Fully-managed streaming processing service available on Amazon Web Services (AWS). Easily scalable to match data volume using apis. Data replication across 3 AZs Integrated with other AWS services

Scalability: Kafka Depends on the number of CPU cores, memory, and the performance of the local disks. To increase the throughput of the system having more partition than nodes, the users have to add more hardware capacity to the cluster and migrate the existing partitions to the newly added resources. To increase the throughput of the cluster having node equals to partition, one has to add more resources to the existing resources, also known as scaling up. Apache Kafka holds historical data, users may be required to increase the disk footprint capacity of the cluster.

Scalability: Kinesis Throughput of each Shard is pre-advertised. 1000 PUT records-per sec or 1MBps of write. 2Mbps or 5-transactions per-sec of read traffic. In order to increase the throughput of a given Amazon Kinesis stream, more Shards can be added using apis or using AWS console. The benefit of the Amazon Kinesis throughput model is that the users have a prior knowledge of the exact performance numbers to expect for every provisioned Shard. Current read limit of 5 transaction-per second limits on how many applications can read from one shard at any given time.

Latency Durability Delivery Semantics Kafka Configured to perform < 1second Kinesis Latency in the range of 1-5 seconds. Applications that require < 1-second latency are not an ideal use-case for Amazon Kinesis. Durability Kafka Provides durability by replicating data to multiple broker nodes Kinesis Provides the same durability guarantees by replicating the data to multiple availability zones. Delivery Semantics Both Apache Kafka and Amazon Kinesis provide at-least-once delivery semantic.

Kafka vs Kinesis: Cost Factor  Managing Kafka Monitoring Apache Kafka Brokers failures Monitoring Disk, CPU and Memory) Migrating Apache Kafka partitions to new nodes to increase throughput Tuning Apache Kafka JVM settings Scaling Brokers to increase CPU, Memory and Disk resources Upgrading Apache Kafka version Recovering/Replacing failed Brokers Failing over to a different cluster in a different data-center or availability-zone Multi-AZ deployment Managing Kinesis N/A  handled by Kinesis Service Amazon Kinesis API to add & remove Shard   Kafka vs Kinesis: Cost Factor

https://aws.amazon.com/kinesis/streams/faqs/ Sources http://www.wmrichards.com/amqp.pdf http://go.datapipe.com/whitepaper-kafka-vs-kinesis-download http://docs.aws.amazon.com/streams/latest/dev/key-concepts.html https://aws.amazon.com/kinesis/streams/faqs/