1 Herald: Achieving a Global Event Notification Service Luis Felipe Cabrera, Michael B. Jones, Marvin Theimer Microsoft Research.

2 Global Event Notification Services
Communication via event notification (also called publish/subscribe) is well-suited for loosely-coupled eCommerce applications, as well as Internet-scale distributed applications (e.g. instant messaging and multi-player games).
General event notification systems currently:
– scale to tens of thousands of clients,
– do not have global reach.

3 Internet-scale Issues Scaling requirements are millions and billions, perhaps more. There will (probably) not be a single organization that owns the entire event notification infrastructure. Hence a federated design is required. Global reach implies that failures and network partitions will be common-place.

4 Focus on the Basic Distributed Systems Primitives Focus on the scalability of basic message delivery and distributed state management capabilities. Employ a very simple message-oriented design and assume – until proven otherwise – that richer event notification semantics can be layered on top.

5 Herald Event Notification Model
[Figure: a Creator, a Publisher, and a Subscriber interacting with a Rendezvous Point in the Herald Service. Steps: 1: Create Rendezvous Point; 2: Subscribe; 3: Publish; 4: Notify.]
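The four steps named on this slide (create rendezvous point, subscribe, publish, notify) can be sketched as a toy in-process service. This is only an illustration of the model, not Herald's interface: the class name, method names, and the rendezvous point name "stock/MSFT" are all hypothetical, and nothing here is distributed.

```python
from typing import Callable, Dict, List

# A subscriber callback receives (rendezvous point name, event payload).
Callback = Callable[[str, str], None]


class ToyHeraldService:
    """Hypothetical single-process stand-in for an event notification service."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callback]] = {}

    def create_rendezvous_point(self, name: str) -> None:
        # Step 1: a creator makes a rendezvous point (RP).
        self._subscribers.setdefault(name, [])

    def subscribe(self, name: str, callback: Callback) -> None:
        # Step 2: a subscriber registers interest in an existing RP.
        self._subscribers[name].append(callback)

    def publish(self, name: str, event: str) -> int:
        # Step 3: a publisher sends an event to the RP.
        callbacks = self._subscribers[name]
        for callback in callbacks:
            # Step 4: the service notifies every current subscriber.
            callback(name, event)
        return len(callbacks)


received = []
herald = ToyHeraldService()
herald.create_rendezvous_point("stock/MSFT")
herald.subscribe("stock/MSFT", lambda rp, ev: received.append((rp, ev)))
delivered = herald.publish("stock/MSFT", "price=42")
```

Publishers and subscribers never refer to each other, only to the rendezvous point, which is what keeps the applications loosely coupled.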

6 Design Criteria
The “usual” criteria:
– Scalability
– Resilience
– Self-administration
– Timeliness
Additional criteria:
– Heterogeneous federation
– Security
– Support for disconnection
– Partitioned operation

7 Scalability
– 10^11 Rendezvous Points (RPs)
– 10^11 publishers & subscribers in aggregate
– 10^10 publishers & subscribers per RP
– 10^10 federation members
– 10^2 events/sec/RP
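Reading the slide's figures as powers of ten (10^11 rendezvous points and 10^2 events/sec/RP), a rough ceiling on the aggregate event rate follows by multiplication. This is an assumption-laden back-of-the-envelope figure: the slide does not say that every RP would run at its stated rate at the same time, so treat the result as a bound, not a target.

```python
# Figures taken from the slide, read as powers of ten.
RENDEZVOUS_POINTS = 10**11
EVENTS_PER_SEC_PER_RP = 10**2

# Assumption: every RP is simultaneously at its stated rate.
aggregate_events_per_sec = RENDEZVOUS_POINTS * EVENTS_PER_SEC_PER_RP

print(f"{aggregate_events_per_sec:.0e} events/sec in aggregate")
```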

8 Resilience “Fail last, fail least” semantics. Correct operation in the presence of malicious/corrupt participants.

9 Self-administration System decides where to place state and how to propagate information about state changes. System dynamically adapts to changing loads and the presence of faults and network partitions. No manual tuning.

10 Timeliness Event notification should normally take seconds, not hours.

11 Heterogeneous Federation Federation of machines within cooperating but mutually suspicious domains of trust. Federated parties may include both small and large domains.

12 Security Support restricted access to Herald facilities. Support concepts such as groups and roles.

13 Support for Disconnection Eventual delivery to disconnected subscribers. Event histories to allow a posteriori examination of the past.
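One way to picture eventual delivery to a disconnected subscriber is a bounded per-RP event history that the subscriber queries on reconnect. The sketch below is an assumption about how such a history could work, not a description of Herald's mechanism; the class name, the sequence-number scheme, and the fixed-size bound are all hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple


class RendezvousPointHistory:
    """Hypothetical bounded event history for one rendezvous point."""

    def __init__(self, max_events: int) -> None:
        # Oldest events are dropped automatically once the bound is reached.
        self._events: Deque[Tuple[int, str]] = deque(maxlen=max_events)
        self._next_seq = 0

    def publish(self, payload: str) -> int:
        seq = self._next_seq
        self._events.append((seq, payload))
        self._next_seq += 1
        return seq

    def events_after(self, last_seen_seq: int) -> List[Tuple[int, str]]:
        # A reconnecting subscriber asks for everything it has not yet seen.
        return [(seq, payload) for seq, payload in self._events
                if seq > last_seen_seq]


history = RendezvousPointHistory(max_events=3)
for payload in ["a", "b", "c", "d"]:
    history.publish(payload)

# A subscriber that saw events 0 and 1 before disconnecting catches up.
missed = history.events_after(last_seen_seq=1)
# A subscriber that saw nothing finds that the oldest event has aged out.
everything_kept = history.events_after(last_seen_seq=-1)
```

Because the history is bounded, a subscriber that stays away too long can still miss events; how much history to keep is a policy choice this sketch does not address.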

14 Partitioned Operation Continued operation on both sides of a network partition. Eventual (out-of-order) delivery after partition healing.
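A minimal sketch of what continued operation and out-of-order delivery after healing can look like: each side of a partition keeps accepting events, and when the partition heals the two sides exchange whatever the other has not seen. The `Replica` class and its event identifiers are hypothetical, chosen only to make the idea concrete.

```python
from typing import List, Set, Tuple

# An event is identified by (originating replica, per-origin counter).
EventId = Tuple[str, int]


class Replica:
    """Hypothetical replica of one rendezvous point's event stream."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._counter = 0
        self._seen: Set[EventId] = set()
        self.log: List[EventId] = []  # order in which events were delivered here

    def publish(self) -> EventId:
        event_id = (self.name, self._counter)
        self._counter += 1
        self._deliver(event_id)
        return event_id

    def _deliver(self, event_id: EventId) -> None:
        # Deliver each event at most once, in arrival order.
        if event_id not in self._seen:
            self._seen.add(event_id)
            self.log.append(event_id)

    def heal_with(self, other: "Replica") -> None:
        # After the partition heals, each side forwards what the other missed.
        for event_id in list(other.log):
            self._deliver(event_id)
        for event_id in list(self.log):
            other._deliver(event_id)


east, west = Replica("east"), Replica("west")
# While partitioned, both sides keep accepting events independently.
east.publish()
west.publish()
east.publish()
east.heal_with(west)
```

After healing, both replicas hold the same set of events but in different delivery orders, which is why in-order delivery has to be layered on top if an application needs it.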

15 Non-Goals
What’s the “best” way to do:
– Naming
– Filtering
– Complex subscription queries
In-order delivery (except as layered on top)

16 Applying Lessons of the Internet and Web
Assume things are broken:
– Mutual suspicion and no dependence on correct behavior by others.
Don’t try to fix everything:
– All distributed state is maintained in a weakly-consistent soft-state manner and is aged.
– All distributed state is incomplete and may be inaccurate.
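Soft state that is aged can be illustrated with a table whose entries disappear unless they are refreshed. This is a generic soft-state sketch under stated assumptions, not Herald's design: the class, the fixed time-to-live, and the explicit `now` parameter (used instead of a real clock so the example is deterministic) are all hypothetical.

```python
from typing import Dict, Optional, Tuple


class SoftStateTable:
    """Hypothetical soft-state table: entries expire unless refreshed."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        # key -> (value, expiry time)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def refresh(self, key: str, value: str, now: float) -> None:
        # Writing or re-announcing an entry restarts its lifetime.
        self._entries[key] = (value, now + self._ttl)

    def lookup(self, key: str, now: float) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # Aged out: treat the entry as unknown rather than trust stale data.
            del self._entries[key]
            return None
        return value


table = SoftStateTable(ttl_seconds=30.0)
table.refresh("RP1", "server-A", now=0.0)
fresh = table.lookup("RP1", now=10.0)
stale = table.lookup("RP1", now=30.0)
```

Callers must cope with a missing answer at any time, which matches the slide's point that distributed state is incomplete and may be inaccurate.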

17 Design Overview
We think we only need these mechanisms:
– Replication.
– Overlay distribution networks.
– Time contracts.
– Event histories.
– Administrative rendezvous points.

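18 Replication
[Figure: rendezvous point replicas across three Herald instances. Labels: RP1@L1, RP2@L1, Herald@L1; RP1@L2, Herald@L2; RP1@L3, Herald@L3; publishers Pub1, Pub2, Pub3; subscribers Sub1, Sub2, Sub3, Sub4, Sub5.]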

19 Overlay Distribution Networks
[Figure: labels RP1@L1, Herald@L1; RP1@L2, Herald@L2; RP1@L3, Herald@L3 (shown twice); Pub1, Pub2; Sub1, Sub2, Sub4.]

20 Time Contracts
[Figure: Creator, Pub1, and Sub1 attached to RP1 in the Herald Service. Labels: Creator, RP1, Pub1, Sub1, with the values 60, 10, 30.]

21 Event Histories
[Figure: Creator, Pub1, and Sub1 attached to RP1 in the Herald Service. Labels: Creator, RP1, Pub1, Sub1, with the values 60, 10, 30; History, 50.]

22 Administrative Rendezvous Points
[Figure: RP1 in the Herald Service and a Name Service. Steps: 1. Subscribe RP1@; 2. Notify(change).]

23 Engineering & Research Issues
– Baseline scalability numbers
– Dynamic system reconfiguration
– Federation and security

24 Baseline Scalability Numbers How scalable are single-node servers and server clusters? What are multicast-style delivery systems actually capable of, especially in aggregate?

25 Dynamic System Reconfiguration Reconfiguring distributed RP state in response to aggregate workloads and global state changes. Dealing with “flash crowd” loads. Placement of RP state to minimize the effects of network partitions and disconnection. Placement of RP state to enable efficient implementations of higher-level pub/sub semantics.

26 Federation and Security Can we define simple, open protocols? Will we need heavy-weight mechanisms to deal with malicious/corrupt servers? How should anonymity and privacy be dealt with/supported?

27 Related Work
– Non-global event notification systems (Gryphon, Ready, Siena, …)
– Netnews
– P2P systems such as Gnutella and Farsite
– Overlay & multicast networks
– CDNs
– OceanStore

28 Conclusion
Global event notification is emerging as a key Internet technology.
Herald is exploring scalability of the basic message and distributed state management aspects of an event notification system:
– Gain engineering experience with scalable pub/sub systems.
– Explore dynamic system reconfiguration.
– Understand the implications of federation and security.
