Presentation is loading. Please wait.

Presentation is loading. Please wait.

Intel Research Timothy Roscoe P2: Implementing Declarative Overlays Timothy Roscoe Boon Thau Loo, Tyson Condie, Petros Maniatis, Ion Stoica, David Gay,

Similar presentations


Presentation on theme: "Intel Research Timothy Roscoe P2: Implementing Declarative Overlays Timothy Roscoe Boon Thau Loo, Tyson Condie, Petros Maniatis, Ion Stoica, David Gay,"— Presentation transcript:

1 Intel Research Timothy Roscoe P2: Implementing Declarative Overlays Timothy Roscoe Boon Thau Loo, Tyson Condie, Petros Maniatis, Ion Stoica, David Gay, Joseph M. Hellerstein, Intel Research Berkeley and U. C. Berkeley P2

2 Timothy Roscoe Intel Research 2 Overlays: a broad view Overlay: the routing and message forwarding component of any non-trivial distributed system Internet Overlay

3 Timothy Roscoe Intel Research 3 Overlays Everywhere… Many examples: Many examples: Internet Routing, multicast Internet Routing, multicast Content delivery, file sharing, DHTs, Google Content delivery, file sharing, DHTs, Google Microsoft Exchange Microsoft Exchange Tibco (technology interoperation) Tibco (technology interoperation) Overlays are a fundamental tool for repurposing communication infrastructures Overlays are a fundamental tool for repurposing communication infrastructures Get a bunch of friends together and build your own ISP (Internet evolvability) Get a bunch of friends together and build your own ISP (Internet evolvability) You dont like Internet Routing? Make up your own rules (RON) You dont like Internet Routing? Make up your own rules (RON) Paranoid? Run Freenet Paranoid? Run Freenet Intrusion detection with friends (DDI, Polygraph) Intrusion detection with friends (DDI, Polygraph) Have your assets discover each other (iAMT) Have your assets discover each other (iAMT) Internet Overlay Distributed systems is all about overlays

4 Timothy Roscoe Intel Research 4 If only it werent so hard In theory In theory Figure out right properties Figure out right properties Get the algorithms and protocols Get the algorithms and protocols Implement them Implement them Tune them Tune them Test them Test them Debug them Debug them Repeat Repeat But in practice But in practice No global view No global view Wrong choice of algorithms Wrong choice of algorithms Incorrect implementation Incorrect implementation Pathological timeouts Pathological timeouts Partial failures Partial failures Impaired introspection Impaired introspection Homicidal boredom Homicidal boredom Next to no debug support Next to no debug support Its hard enough as it is Do I also need to reinvent the wheel every time?

5 Timothy Roscoe Intel Research 5 Our ultimate goal Make network development more accessible to developers of distributed applications Make network development more accessible to developers of distributed applications Specify network at a high-level Specify network at a high-level Automatically translate specification into executable Automatically translate specification into executable Hide everything they dont want to touch Hide everything they dont want to touch Enjoy performance that is good enough Enjoy performance that is good enough Do for networked systems what SQL and the relational model did for databases Do for networked systems what SQL and the relational model did for databases

6 Timothy Roscoe Intel Research 6 The argument: The set of routing tables in a network represents a distributed data structure The set of routing tables in a network represents a distributed data structure The data structure is characterized by a set of ideal properties which define the network The data structure is characterized by a set of ideal properties which define the network Thinking in terms of structure, not protocol Thinking in terms of structure, not protocol Routing is the process of maintaining these properties in the face of changing ground facts Routing is the process of maintaining these properties in the face of changing ground facts Failures, topology changes, load, policy… Failures, topology changes, load, policy…

7 Timothy Roscoe Intel Research 7 Routing as Query Processing In database terms, the routing table is a view over changing network conditions and state In database terms, the routing table is a view over changing network conditions and state Maintaining it is the domain of distributed continuous query processing Maintaining it is the domain of distributed continuous query processing Not merely an analogy: We have implemented a general routing protocol engine as a query processor. Not merely an analogy: We have implemented a general routing protocol engine as a query processor. Dataflow elements provide an implementation model for queries Dataflow elements provide an implementation model for queries Overlays can be written in a high-level query language Overlays can be written in a high-level query language

8 Timothy Roscoe Intel Research 8 Two directions 1.Declarative expression of Internet Routing protocols Loo et. al., ACM SIGCOMM 2005Loo et. al., ACM SIGCOMM 2005 2.Declarative implementation of overlay networks Loo et. al., ACM SOSP 2005Loo et. al., ACM SOSP 2005 The focus of this talk (and my work)The focus of this talk (and my work)

9 Timothy Roscoe Intel Research 9 Data model Relational data: tuples and relations Relational data: tuples and relations Two kinds of relation: Two kinds of relation: Distributed soft state in relational tables, holding tuples of values Distributed soft state in relational tables, holding tuples of values route (S, D, H) route (S, D, H) Non-stored information passes around as event tuple streams Non-stored information passes around as event tuple streams message (X, D) message (X, D)

10 Timothy Roscoe Intel Research 10 Example: Ring Routing Every node has an address (e.g., IP address) and an identifier (large random) Every node has an address (e.g., IP address) and an identifier (large random) Every object has an identifier Every object has an identifier Order nodes and objects into a ring by their identifiers Order nodes and objects into a ring by their identifiers Objects served by their successor node Objects served by their successor node Every node knows its successor on the ring Every node knows its successor on the ring To find object K, walk around the ring until I locate Ks immediate successor node To find object K, walk around the ring until I locate Ks immediate successor node

11 Timothy Roscoe Intel Research 11 Example: Ring Routing How do I find the responsible node for a given key k? n.lookup(k) if k in (n, n.successor) return n.successor else return n.successor. lookup(k)

12 Timothy Roscoe Intel Research 12 Ring State n.lookup(k) if k in (n, n.successor) return n.successor else return n.successor. lookup(k) Node state tuples node(NAddr, N) successor(NAddr, Succ, SAddr) Transient event tuples lookup ( Addr, Req, K ) response( Addr, K, Owner )

13 Timothy Roscoe Intel Research 13 Pseudocode as a query send response ( Req, K, SAddr ) to Req where lookup ( NAddr, Req, K ) @ NAddr and node ( NAddr, N ), and succ ( NAddr, Succ, SAddr ), andK in ( N, Succ ], n.lookup(k) if k in (n, n.successor) return n.successor else return n.successor. lookup(k) Node state tuples node(NAddr, N) successor(NAddr, Succ, SAddr) Transient event tuples lookup ( Addr, Req, K ) response( Addr, K, Owner )

14 Timothy Roscoe Intel Research 14 Pseudocode as a query send response ( Req, K, SAddr ) to Req where lookup ( NAddr, Req, K ) @ NAddr and node ( NAddr, N ), and succ ( NAddr, Succ, SAddr ), andK in ( N, Succ ], send lookup ( Req, K, SAddr ) to SAddr where lookup ( NAddr, Req, K ) @ Naddr and node ( NAddr, N ), and succ ( NAddr, Succ, SAddr ), andK not in ( N, Succ ], n.lookup(k) if k in (n, n.successor) return n.successor else return n.successor. lookup(k) Node state tuples node(NAddr, N) successor(NAddr, Succ, SAddr) Transient event tuples lookup ( Addr, Req, K ) response( Addr, K, Owner )

15 Timothy Roscoe Intel Research 15 Implementation: From query model to dataflow Traditional problem in databases Traditional problem in databases Turn the logic into relational algebra Turn the logic into relational algebra Joins, projections, selections, aggregations, etc. Joins, projections, selections, aggregations, etc. Implement as graph of software dataflow elements Implement as graph of software dataflow elements C.f. Click, PIER, etc. C.f. Click, PIER, etc. Tuples flow through graphb Tuples flow through graphb Execute this graph to maintain overlay Execute this graph to maintain overlay

16 Timothy Roscoe Intel Research 16 From query to dataflow send response( Req, K, SAddr ) to Req where lookup( NAddr, Req, K ) @ NAddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K in ( N, Succ ] send lookup( Req, K, SAddr ) to SAddr where lookup( NAddr, Req, K ) @ Naddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K not in ( N, Succ ]

17 Timothy Roscoe Intel Research 17 From query to dataflow send response( Req, K, SAddr ) to Req where lookup( NAddr, Req, K ) @ NAddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K in ( N, Succ ] send lookup( Req, K, SAddr ) to SAddr where lookup( NAddr, Req, K ) @ Naddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K not in ( N, Succ ]

18 Timothy Roscoe Intel Research 18 From query to dataflow send response( Req, K, SAddr ) to Req where lookup( NAddr, Req, K ) @ NAddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K in ( N, Succ ] send lookup( Req, K, SAddr ) to SAddr where lookup( NAddr, Req, K ) @ Naddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K not in ( N, Succ ]

19 Timothy Roscoe Intel Research 19 From query to dataflow send response( Req, K, SAddr ) to Req where lookup( NAddr, Req, K ) @ NAddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K in ( N, Succ ] send lookup( Req, K, SAddr ) to SAddr where lookup( NAddr, Req, K ) @ Naddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K not in ( N, Succ ]

20 Timothy Roscoe Intel Research 20 From query to dataflow send response( Req, K, SAddr ) to Req where lookup( NAddr, Req, K ) @ NAddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K in ( N, Succ ] send lookup( Req, K, SAddr ) to SAddr where lookup( NAddr, Req, K ) @ Naddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K not in ( N, Succ ]

21 Timothy Roscoe Intel Research 21 From query to dataflow send response( Req, K, SAddr ) to Req where lookup( NAddr, Req, K ) @ NAddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K in ( N, Succ ] send lookup( Req, K, SAddr ) to SAddr where lookup( NAddr, Req, K ) @ Naddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K not in ( N, Succ ]

22 Timothy Roscoe Intel Research 22 From query to dataflow send response( Req, K, SAddr ) to Req where lookup( NAddr, Req, K ) @ NAddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K in ( N, Succ ] send lookup( Req, K, SAddr ) to SAddr where lookup( NAddr, Req, K ) @ Naddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K not in ( N, Succ ]

23 Timothy Roscoe Intel Research 23 From query to dataflow send response( Req, K, SAddr ) to Req where lookup( NAddr, Req, K ) @ NAddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K in ( N, Succ ] send lookup( Req, K, SAddr ) to SAddr where lookup( NAddr, Req, K ) @ Naddr & node ( NAddr, N ) & succ ( NAddr, Succ, SAddr ) & K not in ( N, Succ ]

24 Timothy Roscoe Intel Research 24 From query to dataflow One strand per subquery Strand order is immaterial Strands could execute in parallel nodesucc Strand 1 lookup response Strand 2 lookup

25 Timothy Roscoe Intel Research 25 From query to dataflow

26 Timothy Roscoe Intel Research 26 P2 Network Overlay Description Packets outPackets in 1. Distributed system specified in a query language 2. Compiled into optimized graph of dataflow elements 3. Graph executed directly to maintain routing tables and network overlay state

27 Timothy Roscoe Intel Research 27 Implementation Elements are C++ objects Elements are C++ objects Reference-counted immutable tuples Reference-counted immutable tuples Fast tuple hand-off Fast tuple hand-off ~50 ia32 instructions, ~300 cycles ~50 ia32 instructions, ~300 cycles Currently single-threaded Currently single-threaded Select loop, timers, etc. Select loop, timers, etc. Element state stored in tables Element state stored in tables C.f. database catalogues: reuse data model wherever appropriate C.f. database catalogues: reuse data model wherever appropriate

28 Timothy Roscoe Intel Research 28 Implementation Extensive library of elements Extensive library of elements Relational operators Relational operators Queues, buffers, schedulers Queues, buffers, schedulers Transport stack (more later) Transport stack (more later) C++ and Python/Tcl bindings C++ and Python/Tcl bindings Allows graph specification as with Click Allows graph specification as with Click But wait – theres more… But wait – theres more…

29 Timothy Roscoe Intel Research 29 Query language Based on Datalog: Based on Datalog: Basied on Prolog with no imperative constructs Basied on Prolog with no imperative constructs Fairly standard query language from literature Fairly standard query language from literature Goals: Goals: Understand language issues Understand language issues Limit constructs as little as possible Limit constructs as little as possible Demonstrate benefits of conciseness Demonstrate benefits of conciseness Non-goals (at this stage): Non-goals (at this stage): A nice language to write in (as we will see) A nice language to write in (as we will see) Clean semantics (though we now have some) Clean semantics (though we now have some) Truly high-level, global property specification Truly high-level, global property specification

30 Timothy Roscoe Intel Research 30 Datalog and location specifiers n.lookup(k) if k in (n, n.successor] return n.successor else return n.successor. lookup(k) Node state tuples node(NAddr, N) successor(NAddr, Succ, SAddr) Transient event tuples lookup (NAddr, Req, K) R1 response@Req(Req, K, SAddr) :- lookup@NAddr(NAddr, Req, K), node@NAddr(NAddr, N), succ@NAddr(NAddr, Succ, SAddr), K in (N, Succ]. R2 lookup@SAddr(SAddr, Req, K) :- lookup@NAddr(NAddr, Req, K), node@NAddr(NAddr, N), succ@NAddr(NAddr, Succ, SAddr), K not in (N, Succ].

31 Timothy Roscoe Intel Research 31 It actually works. For instance, we implemented Chord in P2 For instance, we implemented Chord in P2 Popular distributed hash table Popular distributed hash table Complex overlay Complex overlay Dynamic maintenance Dynamic maintenance How do we know it works? How do we know it works? Same high-level properties Same high-level properties Logarithmic diameter & state Logarithmic diameter & state Consistent routing with churn Consistent routing with churn Property checks as additional queries Property checks as additional queries Comparable performance to hand-coded implementations Comparable performance to hand-coded implementations

32 Timothy Roscoe Intel Research 32 Key point: remarkably concise overlay specification Full specification of Chord overlay, including Full specification of Chord overlay, including Failure recovery Failure recovery Multiple successors Multiple successors Stabilization Stabilization Optimized maintenance Optimized maintenance 44 OverLog rules 44 OverLog rules And it runs! And it runs! 10 pt font

33 Timothy Roscoe Intel Research 33 Comparison: MIT Chord in C++

34 Timothy Roscoe Intel Research 34 Lookup length in hops

35 Timothy Roscoe Intel Research 35 Maintenance bandwidth (comparable with MIT Chord)

36 Timothy Roscoe Intel Research 36 Latency without churn

37 Timothy Roscoe Intel Research 37 Latency under churn Compare with Bamboo non- adaptive timeout figures…

38 Timothy Roscoe Intel Research 38 Consistency under churn

39 Timothy Roscoe Intel Research 39 The story so far: Can specify overlays as continuous queries in a logic language Can specify overlays as continuous queries in a logic language Compile to a graph of dataflow elements Compile to a graph of dataflow elements Efficiently execute graph to perform routing and forwarding Efficiently execute graph to perform routing and forwarding Overlays exhibit similar performance characteristics Overlays exhibit similar performance characteristics But … Once you have a distributed query processor, lots of things fall off the back of the truck…

40 Timothy Roscoe Intel Research 40 What else does this buy you? Introspection w/ Atul Singh (Rice) & Peter Druschel (MPI) Overlay invariant monitoring: a distributed watchpoint Overlay invariant monitoring: a distributed watchpoint Whats the average path length? Whats the average path length? Is routing consistent? Is routing consistent? Execution tracing at pseudo-code granularity: logical stepping Execution tracing at pseudo-code granularity: logical stepping Why did rule R7 trigger? Why did rule R7 trigger? … and at dataflow granularity: intermediate representation stepping … and at dataflow granularity: intermediate representation stepping Why did that tuple expire? Why did that tuple expire? Great way to do distributed debugging and logging Great way to do distributed debugging and logging In fact, we use it and have found a number of bugs… In fact, we use it and have found a number of bugs…

41 Timothy Roscoe Intel Research 41 What else does this buy you? 2. Transport reconfiguration Dataflow paradigm thins out layer boundaries Dataflow paradigm thins out layer boundaries Mix and match transport facilities (retries, congestion control, rate limitation, buffering) Mix and match transport facilities (retries, congestion control, rate limitation, buffering) Spread bits of transport through the application to suit application requirements Spread bits of transport through the application to suit application requirements Automatically! Automatically!

42 Timothy Roscoe Intel Research 42 In fact, a rich seam for future research… Reconfigurable transport protocols Reconfigurable transport protocols Debugging and logging support Debugging and logging support The right language – global invariants The right language – global invariants Use distributed joins as abstraction mechanism Use distributed joins as abstraction mechanism Optimization techniques Optimization techniques Inc. multiquery optimization Inc. multiquery optimization Monitoring other distributed systems and networks Monitoring other distributed systems and networks Evolve towards more general query processor? Evolve towards more general query processor? PIER heritage returns PIER heritage returns

43 Timothy Roscoe Intel Research 43 Summary Overlays are distributed system innovation Overlays are distributed system innovation Wed better make them easier to build, reuse, understand Wed better make them easier to build, reuse, understand P2 enables P2 enables High-level overlay specification in OverLog High-level overlay specification in OverLog Automatic translation of specification into dataflow graph Automatic translation of specification into dataflow graph Execution of dataflow graph Execution of dataflow graph Explore and Embrace the trade-off between fine-tuning and ease of development Explore and Embrace the trade-off between fine-tuning and ease of development Get the full immersion treatment in our paper in SOSP 05, code release imminent Get the full immersion treatment in our paper in SOSP 05, code release imminent

44 Timothy Roscoe Intel Research 44 Thanks! Questions? A few to get you started: A few to get you started: Who cares about overlays? Who cares about overlays? Logic? You mean Prolog? Eeew! Logic? You mean Prolog? Eeew! This language is really ugly. Discuss. This language is really ugly. Discuss. But what about security? But what about security? Is anyone ever going to use this? Is anyone ever going to use this? Is this as revolutionary and inspired as it looks? Is this as revolutionary and inspired as it looks?http://P2.berkeley.intel-research.net


Download ppt "Intel Research Timothy Roscoe P2: Implementing Declarative Overlays Timothy Roscoe Boon Thau Loo, Tyson Condie, Petros Maniatis, Ion Stoica, David Gay,"

Similar presentations


Ads by Google