Scalable Self-Repairing Publish/Subscribe Robbert van Renesse Ken Birman Werner Vogels Cornell University.

1 Scalable Self-Repairing Publish/Subscribe
Robbert van Renesse, Ken Birman, Werner Vogels
Cornell University

2 Background
ISIS, Horus, Ensemble systems
–Strong properties (for replicated data)
–Adaptive (changing network/app behavior)
Problems…
–as fast as slowest receiver
–“Jim Gray effect”
–no IP Multicast

3 New Direction
Probabilistically Strong Guarantees
–Randomized protocols
Compartmentalization
No reliance on IP multicast, clock sync
Auto-configuration, self-repair
→ JBI

4 Three Main Components
Astrolabe
–Aggregation Service
SelectCast
–Dissemination Service
Bimodal Multicast
–End-to-end reliability

5 Aggregation
Ability to summarize information from distributed sources.
Also known as data fusion in sensor networks.
The basis for scalability!
Standard service in databases. Why not in distributed systems?

6 Examples
–Barrier Synchronization
–Voting
–Resource Location
–Multicast Routing

7 Astrolabe
Astrolabe takes continuous snapshots of the global state of a distributed system, and aggregates this information in user-specified ways.

8 Four Design Principles
–Scalability through Hierarchy
–Flexibility through Mobile SQL
–Robustness through p2p Gossip
–Security through Certificates

9 DNS-like Domain Hierarchy
Attribute list
Domains identified by path names

10 MIB
Each domain has an attribute list called “MIB” (management information base).
MIBs of internal domains are generated by aggregating child domains’ MIBs.

11 Domain Table
No servers for any domain: a MIB is replicated on all hosts in its domain!
Each host maintains not only the MIBs of its own domains, but also those of its sibling domains.
Sibling MIBs are organized in “domain tables”.

12 Domain Table Example
ID    CONTACTS           ISSUED  NMEMBERS  MIN(LOAD)
dom1  10.0.0.1 10.0.0.2  T1      5         0.31
dom2  10.0.1.1           T2      10        0.13
dom3  10.0.2.3           T3      8         1.5
dom4  10.1.2.5 10.3.2.1  T4      18        0.0

13 Aggregation
Domain1
id        Load  Weblogic?  SMTP?  Word Version  …
swift     2.0   0          1      6.2
falcon    1.5   1          0      4.1
cardinal  4.5   1          0      6.0
Domain2
id        Load  Weblogic?  SMTP?  Word Version  …
gazelle   1.7   0          0      4.5
zebra     3.2   0          1      6.2
gnu       .5    1          0      6.2
Aggregated table
id       Min Load  WL contact   SMTP contact
domain1  1.5       123.45.61.3  123.45.61.17
domain2  1.7       127.16.77.6  127.16.77.11
domain3  3.1       14.66.71.8   14.66.71.12
SQL query “summarizes” data
Dynamically changing query output is visible domain-wide (like spreadsheet)

14 Example queries
–SELECT SUM(nmembers) AS nmembers
–SELECT MAX(depth) + 1 AS depth
–SELECT MIN(minl) AS minl (minimum load)
–…
Functions gossiped with everything else.
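
The three queries above can be read as a function from child MIBs to a parent MIB. The following is a minimal Python sketch of that reading, not Astrolabe's actual query engine; the sample rows and the dictionary layout are assumptions made for illustration, and only the attribute names (nmembers, depth, minl) come from the slide.

```python
# Hypothetical child MIBs of one parent domain (values are illustrative).
child_mibs = [
    {"nmembers": 5, "depth": 1, "minl": 0.31},
    {"nmembers": 10, "depth": 2, "minl": 0.13},
    {"nmembers": 8, "depth": 1, "minl": 1.5},
]


def aggregate(children):
    """Compute a parent MIB from its child MIBs, one entry per slide query."""
    return {
        "nmembers": sum(c["nmembers"] for c in children),  # SELECT SUM(nmembers)
        "depth": max(c["depth"] for c in children) + 1,    # SELECT MAX(depth) + 1
        "minl": min(c["minl"] for c in children),          # SELECT MIN(minl)
    }


parent_mib = aggregate(child_mibs)
print(parent_mib)
```

Because each result is itself a MIB with the same attributes, the same function can be applied again one level up, which is what lets the hierarchy summarize arbitrarily many hosts.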

15 Aggregation
Domain1
Name      Load  Weblogic?  SMTP?  Word Version  …
swift     2.0   0          1      6.2
falcon    1.5   1          0      4.1
cardinal  4.5   1          0      6.0
Domain2
Name      Load  Weblogic?  SMTP?  Word Version  …
gazelle   1.7   0          0      4.5
zebra     3.2   0          1      6.2
gnu       .5    1          0      6.2
Aggregated table
Name   Avg Load  WL contact   SMTP contact
SF     2.6       123.45.61.3  123.45.61.17
NJ     1.8       127.16.77.6  127.16.77.11
Paris  3.1       14.66.71.8   14.66.71.12

16 Aggregation
Domain1
Name      Load  Weblogic?  SMTP?  Word Version  …
swift     2.0   0          1      6.2
falcon    1.5   1          0      4.1
cardinal  4.5   1          0      6.0
Domain2
Name      Load  Weblogic?  SMTP?  Word Version  …
gazelle   1.7   0          0      4.5
zebra     3.2   0          1      6.2
gnu       .5    1          0      6.2
Aggregated table
Name   Avg Load  WL contact   SMTP contact
SF     2.6       123.45.61.3  123.45.61.17
NJ     1.8       127.16.77.6  127.16.77.11
Paris  3.1       14.66.71.8   14.66.71.12
O(log n) info per host
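
The O(log n) claim follows if each host keeps one domain table per level of the hierarchy and each table has a bounded number of rows. A small Python sketch of that count, assuming a balanced hierarchy with a fixed branching factor (the value 10 below is an arbitrary assumption, not a figure from the slides):

```python
def rows_per_host(n_hosts, branching):
    """Rows a host stores: one table of at most `branching` rows per level."""
    levels = 0
    capacity = 1
    # Integer arithmetic avoids floating-point rounding in log computations.
    while capacity < n_hosts:
        capacity *= branching
        levels += 1
    return levels * branching


for n in (100, 10_000, 1_000_000):
    print(n, rows_per_host(n, branching=10))
```

Under these assumptions, multiplying the number of hosts by 100 adds only two more tables per host.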

17 Other Examples
1. Which are the three lowest loaded hosts?
2. Which domains contain hosts with an out-of-date virus database?
3. Do >30% of hosts measure elevated radiation?
4. Which domains contain subscribers interested in some topic?
5. Where is the nearest logging server?
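
Question 1 is a good illustration of why such queries fit the hierarchy: the three lowest loaded hosts of a domain can be computed from the three lowest loaded hosts of each child. A minimal Python sketch, using the host loads from the Aggregation slides; the function name and the (load, host) tuple layout are assumptions for illustration.

```python
import heapq


def lowest3(child_lists):
    """Merge the children's (load, host) lists and keep the three smallest."""
    merged = [pair for child in child_lists for pair in child]
    return heapq.nsmallest(3, merged)


# Leaf level: each host contributes a single (load, name) pair.
domain1 = lowest3([[(2.0, "swift")], [(1.5, "falcon")], [(4.5, "cardinal")]])
domain2 = lowest3([[(1.7, "gazelle")], [(3.2, "zebra")], [(0.5, "gnu")]])

# Parent level: only the per-domain summaries are needed, not every host.
root = lowest3([domain1, domain2])
print(root)
```

Each level passes up at most three pairs, so the amount of data per row stays constant however many hosts a domain contains.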

18 Epidemic or Gossip Protocols
Used to keep domain tables up-to-date
Randomized communication between (nearby) hosts:
–Fast (latency grows O(log n))
–Hard to stop (robust even in the face of Denial-of-Service attacks)
–Probabilistically Real-Time guarantees on latency (based on epidemiological analysis).
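
The logarithmic latency can be seen in a toy simulation. The sketch below assumes a simple push model in which every informed host contacts one uniformly random host per round; this is a simplification chosen for illustration and is not claimed to be the exact protocol Astrolabe runs.

```python
import random


def gossip_rounds(n, seed=0):
    """Rounds until all n hosts are informed under simple push gossip."""
    rng = random.Random(seed)
    informed = {0}  # host 0 starts with the update
    rounds = 0
    while len(informed) < n:
        # Every informed host pushes the update to one random host.
        targets = [rng.randrange(n) for _ in informed]
        informed.update(targets)
        rounds += 1
    return rounds


for n in (64, 1024, 16384):
    print(n, gossip_rounds(n))
```

The informed set can at most double per round, so at least log2(n) rounds are needed; the simulation shows the actual count staying within a small multiple of that bound as n grows.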

19 How it works…
ID    CONTACTS           ISSUED  NMEMBERS  MIN(LOAD)
dom1  10.1.0.1 10.2.0.1  T1      5         0.23
dom2  10.3.0.1           T3      1         0.3
dom3  10.4.0.1           T4      8         0.0
ID    CONTACTS           ISSUED  NMEMBERS  MIN(LOAD)
domA  10.0.0.1 10.0.0.2  T5      2         0.31
domB  10.0.1.1           T6      1         0.13
domC  10.0.2.3           T7      2         1.5
domD  10.1.2.5 10.3.2.1  T8      3         0.0
(diagram labels: gossip, SQL)
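
One way to read the gossip step is as a merge of two copies of the same domain table in which the row with the newer ISSUED value wins. The Python sketch below assumes exactly that rule and uses integer timestamps and hypothetical row contents; the slides show the ISSUED column but do not spell out the merge rule.

```python
def merge(mine, theirs):
    """Merge two domain tables, keeping the most recently issued row per ID."""
    merged = dict(mine)
    for dom_id, row in theirs.items():
        if dom_id not in merged or row["issued"] > merged[dom_id]["issued"]:
            merged[dom_id] = row
    return merged


# Two hosts holding different, partly stale views of the same table.
host_a = {
    "dom1": {"issued": 1, "nmembers": 5, "min_load": 0.23},
    "dom2": {"issued": 3, "nmembers": 1, "min_load": 0.3},
}
host_b = {
    "dom2": {"issued": 2, "nmembers": 1, "min_load": 0.9},
    "dom3": {"issued": 4, "nmembers": 8, "min_load": 0.0},
}

table = merge(host_a, host_b)
print(sorted(table))
```

The merge gives the same result in either direction, so after one exchange both hosts hold identical tables.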

20 SelectCast
Disseminate messages through Astrolabe hierarchy
(Application-level) Routers selected through domain aggregation:
SELECT FIRST(3, routers) AS routers, MIN(minload) AS minload ORDER BY minload
Exploit heterogeneity, don’t hide it!
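
The router-selection query can be sketched in Python as follows. The interpretation of FIRST(3, routers) as "the first three entries after concatenating the children's router lists in minload order" is an assumption, and the child rows are illustrative values loosely based on the domain table example.

```python
def select_routers(children, k=3):
    """Pick up to k routers for a domain, preferring lightly loaded children."""
    ordered = sorted(children, key=lambda c: c["minload"])     # ORDER BY minload
    routers = [r for c in ordered for r in c["routers"]][:k]   # FIRST(3, routers)
    return {"routers": routers, "minload": ordered[0]["minload"]}  # MIN(minload)


children = [
    {"routers": ["10.0.0.1", "10.0.0.2"], "minload": 0.31},
    {"routers": ["10.0.1.1"], "minload": 0.13},
    {"routers": ["10.0.2.3"], "minload": 1.5},
    {"routers": ["10.1.2.5", "10.3.2.1"], "minload": 0.0},
]

result = select_routers(children)
print(result)
```

Ordering by load before truncating is what lets the tree route through the least loaded machines, in line with the slide's point about exploiting heterogeneity.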

21 Multicast Tree

22 Fault Masking

23 Filtering (Pub/Sub)
SQL condition on each message
For example:
–MIN(version) < 3
–MAX(radiation) > 300
–OR(subject) // BLOOM FILTERS
–TRUE
Generalization of topic based publishing
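
The OR(subject) condition with Bloom filters can be sketched as follows: each host encodes its subscriptions as a bit mask, a domain's mask is the bitwise OR of its children's masks, and a message is forwarded into a domain only if all bits of its subject are set there. The filter width, the number of hash functions, and the use of SHA-256 are assumptions chosen for this Python illustration.

```python
import hashlib

BITS = 64    # assumed filter width
HASHES = 2   # assumed number of hash functions


def subject_bits(subject):
    """Bit mask with HASHES positions set for the given subject."""
    mask = 0
    for i in range(HASHES):
        digest = hashlib.sha256(f"{i}:{subject}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % BITS)
    return mask


def host_filter(subjects):
    """Bloom filter of one host's subscriptions."""
    mask = 0
    for subject in subjects:
        mask |= subject_bits(subject)
    return mask


def domain_filter(child_filters):
    """Aggregate child filters with bitwise OR, as in OR(subject)."""
    mask = 0
    for child in child_filters:
        mask |= child
    return mask


def may_match(filter_mask, subject):
    """True if the domain may contain a subscriber; false positives possible."""
    bits = subject_bits(subject)
    return (filter_mask & bits) == bits


domain = domain_filter([host_filter(["weather"]), host_filter(["sports", "news"])])
print(may_match(domain, "weather"), may_match(domain, "news"))
```

A Bloom filter never produces a false negative, so a subscribed domain always receives the message; the cost is that an unsubscribed domain may occasionally receive one it then discards.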

24 Filtering Example

25 Scalability
Latency, memory use, CPU load, and load on network links all grow O(log N) and are independent of update rate.
Highly robust to omission and crash failures.
Confirmed by analysis, simulation, and experiment.
O(1) lookup for most useful queries.

26 Emulab topology (U. Utah)

27 Experiments

28 Real vs. Simulation
(figure panels: The real thing; Simulation)

29 Membership
Domain failure detected when its attributes are no longer being updated.
Domains discovered (and partitions repaired) through
–gossip
–occasional broadcast and multicast
–configuration
Special precautions for domains separated by firewalls and NAT boxes
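
The failure-detection rule above amounts to a timeout on each row's last update. A minimal Python sketch, assuming numeric timestamps in seconds and a hypothetical 30-second timeout; neither value comes from the slides.

```python
def failed_domains(last_issued, now, timeout):
    """Domains whose attributes have not been refreshed within `timeout`."""
    return sorted(d for d, issued in last_issued.items() if now - issued > timeout)


# Hypothetical last-update times, in seconds.
last_issued = {"dom1": 100.0, "dom2": 97.5, "dom3": 60.0}

suspects = failed_domains(last_issued, now=101.0, timeout=30.0)
print(suspects)
```

No separate heartbeat protocol is needed in this reading: the gossip that refreshes the attributes doubles as the liveness signal.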

30 Security
Integrated PKI
–integrity, no confidentiality
–prevents “Sybil” Attacks
Remove outliers
–Summarize in a robust way
Compartmentalize
–Exploit domain hierarchy

31 Bimodal Multicast
Probabilistic end-to-end reliability
Uses IP Multicast or SelectCast for initial dissemination
Runs a background gossip protocol to do repairs of message loss
Performance improves with scale
–share buffering load
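
The two phases can be sketched as a lossy initial multicast followed by gossip rounds in which a host shares what it has received with a random peer, which then fills its own gaps. This Python simulation is a simplified model under assumed parameters (32 hosts, 20 messages, 20% loss) and does not reproduce the real protocol's digests, buffering, or timing.

```python
import random


def initial_multicast(hosts, msg_ids, loss, rng):
    """Unreliable first phase: each host misses each message with prob. `loss`."""
    return {h: {m for m in msg_ids if rng.random() >= loss} for h in hosts}


def gossip_round(received, rng):
    """Each host gossips with one random peer, which pulls what it is missing."""
    hosts = list(received)
    for h in hosts:
        peer = rng.choice(hosts)
        received[peer] |= received[h]


rng = random.Random(1)
hosts = list(range(32))
msgs = set(range(20))

received = initial_multicast(hosts, msgs, loss=0.2, rng=rng)
received[0] = set(msgs)  # the sender keeps a copy of everything it sent

rounds = 0
while any(r != msgs for r in received.values()):
    gossip_round(received, rng)
    rounds += 1

print("repair rounds:", rounds)
```

Any host holding a message can repair any other, so the repair work and the buffering it requires are spread over the whole group rather than falling on the sender.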

32 Work in Progress
Evaluate Scalability and Performance
–emulation, simulation, deployment
Improve support for low power apps
–self configuration
Improve expressiveness
–pattern matching

