Cloud-Scale Event Processing Using Rx Bart J.F. De Smet Principal Software Engineer Microsoft Corporation.

1 Cloud-Scale Event Processing Using Rx Bart J.F. De Smet Principal Software Engineer Microsoft Corporation

2 Who, Where, When? Born in Belgium Computer Science Engineering at University of Ghent In the “civil” engineering department Most Valuable Professional (MVP) for C# Joined Microsoft in 2007 .NET Framework / WPF team member Working on Application Model SQL Division Working on Reactive Extensions with Erik Meijer et al. Online Services Division (2012-…) Cloud-scale event processing using Rx, powering Cortana etc. Belgian beers (really) Exiled to Building 10

3 Who, Where, When? Author C# Unleashed series at SAMS Pluralsight courses (pluralsight.com) Speaker TechEd, TechDays, //build/, PDC, etc. Channel 9 videos (channel9.msdn.com) Goto, JAOO, etc. Physics lover Beauty of dualities, symmetries, etc. Power of diagrams, formalism, etc.

4 Reactive Extensions What, why, how?

5 An Accidental Discovery Cloud Programmability Team “Oasis” within Live Labs and later the SQL Server organization Founded by Erik Meijer and Brian Beckman Code-named “Tesla” after Nikola Tesla (the electric car didn’t exist yet) Founded in the mid 2000s Ray Ozzie ages Making sense of this new thing called “cloud” Various projects IL2JS – a compiler from IL to JavaScript Extension to JavaScript with classes, modules, types Embarrassingly distributed build system for the cloud Reactive Extensions aka “LINQ to Events” Nikola Tesla

6 An Accidental Discovery Project “Volta” Tier-splitting of applications Write as single-tier .NET application using metadata annotations Attributes like [RunOnClient] Cross-compilation of code to match client capabilities Desktop CLR or Silverlight when available IL to JavaScript when necessary Even compiling Windows Forms controls to HTML No promises or futures The world of Begin/End where Task had yet to be invented async/await was unheard of (C# 5.0) But the web is asynchronous…

7 An Accidental Discovery
Project “Volta”
Dealing with asynchrony across tiers
Ultimately needs to cross-compile to AJAX
Events are not first-class objects
Can’t transport them across tiers
    [RunOnClient]
    public event EventHandler MouseMoved;
    // Runs in cloud
    public void CloudCanvas()
    {
        MouseMoved += (o, e) => { /* do stuff */ };
    }
An electric eel…

8 Making Events First-Class
First-class concepts have object representations
Methods can be transported using delegates
But properties, indexers, and events are metadata citizens
    Action a = new Action(Foo); // explicit creation of delegate instance
    Action b = Foo;             // method group conversion
    Action c = () => { … };     // creates anonymous method
    void Foo() { … }
    event Action Bar            // metadata that refers to …
    {
        add { … }               // add accessor
        remove { … }            // remove accessor
    }

9 Fundamental Abstractions
Adapting the observer pattern
Ensuring duality with the enumerator pattern
More compositional approach
    interface IObservable<out T>
    {
        IDisposable Subscribe(IObserver<T> observer);
    }
    interface IObserver<in T>
    {
        void OnNext(T value);
        void OnError(Exception error);
        void OnCompleted();
    }
Notification grammar: OnNext* (OnError | OnCompleted)?
“Gang of four” book (Addison-Wesley)
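The notification grammar on this slide can be made concrete with a small wrapper that drops anything arriving after a terminal message. This is a minimal sketch using only the `IObserver<T>` interface from the .NET base class library; `GrammarCheckedObserver`, `RecordingObserver`, and `GrammarDemo` are hypothetical names introduced here, not types from Rx or from the talk.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Rx): wraps an observer and enforces the
// notification grammar OnNext* (OnError | OnCompleted)?.
public sealed class GrammarCheckedObserver<T> : IObserver<T>
{
    private readonly IObserver<T> _inner;
    private bool _terminated; // set once OnError or OnCompleted has been seen

    public GrammarCheckedObserver(IObserver<T> inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
    }

    public void OnNext(T value)
    {
        if (_terminated) return; // messages after the terminal one are dropped
        _inner.OnNext(value);
    }

    public void OnError(Exception error)
    {
        if (_terminated) return;
        _terminated = true;
        _inner.OnError(error);
    }

    public void OnCompleted()
    {
        if (_terminated) return;
        _terminated = true;
        _inner.OnCompleted();
    }
}

// Records every notification it receives as a string, for inspection.
public sealed class RecordingObserver<T> : IObserver<T>
{
    public List<string> Log { get; } = new List<string>();
    public void OnNext(T value) => Log.Add("OnNext(" + value + ")");
    public void OnError(Exception error) => Log.Add("OnError(" + error.Message + ")");
    public void OnCompleted() => Log.Add("OnCompleted");
}

public static class GrammarDemo
{
    public static List<string> Run()
    {
        var recorder = new RecordingObserver<int>();
        var safe = new GrammarCheckedObserver<int>(recorder);
        safe.OnNext(1);
        safe.OnNext(2);
        safe.OnCompleted();
        safe.OnNext(3);                                       // violates the grammar; dropped
        safe.OnError(new InvalidOperationException("late"));  // also dropped
        return recorder.Log;
    }
}
```

The sketch is single-threaded on purpose; a wrapper meant for concurrent producers would also need to serialize the calls.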

10 Highly Compositional
LINQ-style query operators over IObservable<T>
Composition of 0-N input sequences
Composition of disposable subscriptions and scheduler resources
    interface IScheduler
    {
        IDisposable Schedule(Action work);
        …
    }
    static class Observable
    {
        static IObservable<T> Where<T>(this IObservable<T> source, Func<T, bool> f);
        static IObservable<R> Select<T, R>(this IObservable<T> source, Func<T, R> p);
        …
    }
Function composition
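To illustrate how operators such as Where and Select compose over `IObservable<T>`, here is a minimal sketch built only on the base class library interfaces. `AnonymousObserver`, `AnonymousObservable`, `NopDisposable`, `MiniObservable`, and `FromValues` are hypothetical stand-ins written for this example; the real Rx operators add error handling and grammar enforcement that this sketch leaves out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal building blocks; Rx ships richer versions of these.
public sealed class AnonymousObserver<T> : IObserver<T>
{
    private readonly Action<T> _onNext;
    private readonly Action<Exception> _onError;
    private readonly Action _onCompleted;

    public AnonymousObserver(Action<T> onNext, Action<Exception> onError, Action onCompleted)
    {
        _onNext = onNext; _onError = onError; _onCompleted = onCompleted;
    }

    public void OnNext(T value) => _onNext(value);
    public void OnError(Exception error) => _onError(error);
    public void OnCompleted() => _onCompleted();
}

public sealed class AnonymousObservable<T> : IObservable<T>
{
    private readonly Func<IObserver<T>, IDisposable> _subscribe;
    public AnonymousObservable(Func<IObserver<T>, IDisposable> subscribe) { _subscribe = subscribe; }
    public IDisposable Subscribe(IObserver<T> observer) => _subscribe(observer);
}

public sealed class NopDisposable : IDisposable { public void Dispose() { } }

public static class MiniObservable
{
    // Pushes each value synchronously, then completes.
    public static IObservable<T> FromValues<T>(params T[] values) =>
        new AnonymousObservable<T>(observer =>
        {
            foreach (var v in values) observer.OnNext(v);
            observer.OnCompleted();
            return new NopDisposable();
        });

    // Forwards only the elements that satisfy the predicate.
    public static IObservable<T> Where<T>(this IObservable<T> source, Func<T, bool> predicate) =>
        new AnonymousObservable<T>(observer => source.Subscribe(new AnonymousObserver<T>(
            x => { if (predicate(x)) observer.OnNext(x); },
            observer.OnError,
            observer.OnCompleted)));

    // Forwards the projection of each element.
    public static IObservable<R> Select<T, R>(this IObservable<T> source, Func<T, R> selector) =>
        new AnonymousObservable<R>(observer => source.Subscribe(new AnonymousObserver<T>(
            x => observer.OnNext(selector(x)),
            observer.OnError,
            observer.OnCompleted)));

    public static List<int> Demo()
    {
        var results = new List<int>();
        FromValues(1, 2, 3, 4)
            .Where(x => x % 2 == 0)
            .Select(x => x * 10)
            .Subscribe(new AnonymousObserver<int>(results.Add, _ => { }, () => { }));
        return results;
    }
}
```

Each operator returns a new observable whose subscription is simply the upstream subscription, so disposing the outer handle tears down the whole chain.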

11 Highly Compositional
Building a binary Merge operator
    public static class Observable
    {
        public static IObservable<T> Merge<T>(this IObservable<T> xs, IObservable<T> ys)
        {
            return Create<T>(observer =>
            {
                var gate = new object();
                return new CompositeDisposable
                {
                    xs.Subscribe(x => { lock (gate) { observer.OnNext(x); } }, …),
                    ys.Subscribe(y => { lock (gate) { observer.OnNext(y); } }, …),
                };
            });
        }
    }
First-class: can build extension methods
Composition of resource management
Function composition

12 The Role of Schedulers
Pure architectural layering of the system
Logical query operators (~ relational engine)
Physical schedulers (~ operating system)
Abstract over sources of asynchrony and time
Threads, thread pools, tasks, message loops
DateTime.Now, timers
Enable virtual time testing
    public static IObservable<T> Return<T>(T value, IScheduler scheduler)
    {
        return Create<T>(obs => scheduler.Schedule(() =>
        {
            obs.OnNext(value);
            obs.OnCompleted();
        }));
    }
Space-time
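Virtual time testing, mentioned on this slide, becomes possible once operators obtain time from a scheduler instead of the system clock. The following is a minimal sketch under that assumption. `IMiniScheduler` and `VirtualTimeScheduler` are hypothetical, simplified types written for this example (times are plain non-negative tick counts); they do not reproduce the shape of the Rx `IScheduler` interface.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified scheduler interface for illustration only.
public interface IMiniScheduler
{
    long Now { get; }                          // current time in ticks
    void Schedule(long dueTime, Action work);  // run work at an absolute time
}

// Virtual-time scheduler: the clock moves only when the test advances it.
public sealed class VirtualTimeScheduler : IMiniScheduler
{
    private readonly SortedDictionary<long, Queue<Action>> _queue =
        new SortedDictionary<long, Queue<Action>>();

    public long Now { get; private set; }

    public void Schedule(long dueTime, Action work)
    {
        if (dueTime < Now) dueTime = Now; // work due in the past runs as soon as possible
        if (!_queue.TryGetValue(dueTime, out var items))
        {
            items = new Queue<Action>();
            _queue.Add(dueTime, items);
        }
        items.Enqueue(work);
    }

    // Runs all work due at or before the given time, in time order.
    public void AdvanceTo(long time)
    {
        while (TryDequeueDueBy(time, out var dueTime, out var work))
        {
            Now = dueTime; // the clock jumps straight to the next due item
            work();        // work may schedule more work, which the loop picks up
        }
        if (time > Now) Now = time;
    }

    private bool TryDequeueDueBy(long time, out long dueTime, out Action work)
    {
        foreach (var entry in _queue) // SortedDictionary enumerates in key order
        {
            if (entry.Key > time) break;
            dueTime = entry.Key;
            work = entry.Value.Dequeue();
            if (entry.Value.Count == 0) _queue.Remove(entry.Key);
            return true; // return right after mutating, so the enumerator is not reused
        }
        dueTime = 0;
        work = () => { };
        return false;
    }
}

public static class VirtualTimeDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var scheduler = new VirtualTimeScheduler();
        scheduler.Schedule(100, () => log.Add("b@" + scheduler.Now));
        scheduler.Schedule(50, () =>
        {
            log.Add("a@" + scheduler.Now);
            scheduler.Schedule(scheduler.Now + 25, () => log.Add("c@" + scheduler.Now));
        });
        scheduler.AdvanceTo(80);  // runs a (t=50) and c (t=75), but not b
        log.Add("now=" + scheduler.Now);
        scheduler.AdvanceTo(200); // runs b (t=100)
        return log;
    }
}
```

No thread sleeps or timers are involved, so a test covering hours of logical time finishes immediately and deterministically.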

13 The Beauty of Duality
Category theory to the rescue (Bierman, Meijer)
Observable/observer (push) is dual to enumerable/enumerator (pull)
Cross-influence of both domains
    interface IObservable<out T>
    {
        IDisposable Subscribe(IObserver<T> observer);
    }
    interface IObserver<in T>
    {
        void OnNext(T value);
        void OnError(Exception error);
        void OnCompleted();
    }
    interface IEnumerable<out T>
    {
        IEnumerator<T> GetEnumerator();
    }
    interface IEnumerator<out T> : IDisposable
    {
        bool MoveNext() throws Exception;
        T Current { get; }
        void Reset();
    }
Category theory

14 One versus Many
When Task and Task<T> were born…
Single-value specializations
Await-ability of sequences (for aggregates)
                One                  Many
Synchronous     Func<T> f            IEnumerable<T> xs
                var x = wait f();    foreach (var x in xs) { f(x); }
Asynchronous    Task<T> t            IObservable<T> xs
                var x = await t;     xs.Subscribe(x => { f(x); });
Stephen Kleene: Kleene star (closure)

15 Customizable Execution Plans
Capturing user intent using expression trees
Code-as-data
Homoiconicity = same syntax
E.g. LINQ to Twitter
        Code                         Data
Pull    IEnumerable<T> xs            IQueryable<T> xs
        from x in xs where f(x) …    from x in xs where f(x) …
Push    IObservable<T> xs            IQbservable<T> xs
        from x in xs where f(x) …    from x in xs where f(x) …
Alan Kay: Homoiconicity, 1969
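The code-versus-data distinction on this slide is visible in C# itself: the same lambda syntax yields either a compiled delegate or an inspectable expression tree, depending on the target type. The sketch below uses only `System.Linq.Expressions` from the base class library; `CodeAsDataDemo` is a hypothetical name for this example.

```csharp
using System;
using System.Linq.Expressions;

public static class CodeAsDataDemo
{
    public static string Run()
    {
        // Code: the lambda is compiled to IL and can only be invoked.
        Func<int, bool> asCode = x => x > 0;

        // Data: the same syntax produces a tree that can be inspected,
        // rewritten, or serialized before anything executes.
        Expression<Func<int, bool>> asData = x => x > 0;

        var body = (BinaryExpression)asData.Body;
        var left = (ParameterExpression)body.Left;
        var right = (ConstantExpression)body.Right;

        // The tree can still be turned back into runnable code when needed.
        Func<int, bool> compiled = asData.Compile();

        return body.NodeType + " " + left.Name + " " + right.Value
             + " " + asCode(5) + " " + compiled(-5);
    }
}
```

Providers in the IQueryable and IQbservable columns receive the tree form, which is what lets them choose their own execution plan.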

16 Taking Rx to the Cloud Large-scale distributed event processing in Bing

17 “Cloud First, Mobile First” Events are all around us Classic frameworks for UI programming Sensors in phones, IoT scenarios, etc. Monitoring of systems and cloud infrastructures Changes to the world’s information Building a scalable abstraction that allows for: Hiding of concrete implementations Distributed and intelligent execution Compute close to data Satya Nadella “Cloud-first, mobile-first”

18 Going Where the Data is New reactive event processing effort Founded in Bing around 2012 Standing on the shoulders of (relevant) giants Massive amounts of data available Powering scenarios for “Cortana” Track updates to flights, weather, sport scores, news etc. Remind me when to leave based on traffic conditions Cloud and device Capturing the world’s information as real-time streams Abstracting device sensors (GPS etc.) as streams <3

19 Observations on Cloud and Devices Optimization for resources CPU + memory = power Density of computation in the cloud (# of standing queries / machine) Affordability as background service on devices with < 1GB of RAM Reliability of computations Device-side services Subject to tombstoning (cf. application lifecycle) Can run out of battery Cloud-side services Outages of compute nodes are the way of life Deployment causes intentional failover of services Richard Feynman Westview Press

20 Some (Simplified) Cortana Queries
    from w in weather
    where w.City == "Seattle"
    select w
    flights.Where(f => f.Code == "BA49")
           .DelaySubscription(departure - 24 * H)
           .TakeUntil(arrival + 24 * H)
           .Select(f => f.Status)
    userLocation.Sample(1 * M)
                .DistinctUntilChanged()
                .Select(here => traffic(here, meetingLocation)
                                    .TakeUntil(meeting - minimumTime)
                                    .Select(t => Timer(meeting - t.EstimatedTime))
                                    .StartWith(Timer(meeting - estimatedTime))
                                    .Switch())
                .Switch()
                .Select(_ => "Leave now for " + subject)
                .DelaySubscription(meeting - estimatedTime)
    // Insert cloud-side observable sequence
    // of time-to-leave timer here
Callouts: Temporal query operators; Device-side event stream; Cloud-side event stream; Higher-order query operators
Remember tier-splitting?

21 High-Level Overview
IReactiveProcessing (IRP) abstraction
Representation of an event processing service
Highly interoperable between devices and services
Russian doll / turtles all the way down model
[Diagram labels: Bing cloud IRP; Partner 1 cloud IRP; Partner 2 cloud IRP; Tablet IRP; Phone IRP; Node IRP (three times)]

22 Scaling the Abstractions Lessons learned from Rx Enhanced phasing of subscription lifecycle Turning subjects into streams Building the “Standard Model” of event processing Addressing distributed system concerns Identification of artifacts Latencies and asynchrony Reliability of computation and messages Deployment friendliness Elasticity and dynamic topologies Management of system resources

23 Adapting Rx Building an execution engine

24 Event Processing Execution Engine How to leverage Rx? Rich library of query operators, including temporal ones Additional requirements Persist and recover state (e.g. aggregates) High density (> millions of subscriptions per machine) Don’t miss input events, even when node is down Scalable egress (e.g. notification platforms) Reliability approach Periodic checkpointing of state to persistent storage Acknowledge / replay of ingress messages At-least-once processing guarantees De-duplication can be layered on top
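The reliability approach listed on this slide (checkpointing, acknowledge / replay, at-least-once delivery with de-duplication on top) can be sketched in a few lines. Everything below is an illustrative assumption: `SummingOperator`, `ReplayableIngress`, the sequence-number scheme, and the in-memory "storage" are hypothetical and do not describe the actual engine.

```csharp
using System;
using System.Collections.Generic;

// A stateful aggregate whose state can be checkpointed and reloaded.
public sealed class SummingOperator
{
    public long Sum { get; private set; }
    public long LastSequenceId { get; private set; } = -1;

    // De-duplication layered on top of at-least-once delivery: a message whose
    // sequence number was already applied is ignored.
    public void OnNext(long sequenceId, int value)
    {
        if (sequenceId <= LastSequenceId) return;
        Sum += value;
        LastSequenceId = sequenceId;
    }

    public (long Sum, long LastSequenceId) SaveCheckpoint() => (Sum, LastSequenceId);

    public static SummingOperator LoadCheckpoint((long Sum, long LastSequenceId) state) =>
        new SummingOperator { Sum = state.Sum, LastSequenceId = state.LastSequenceId };
}

// Ingress that keeps messages until they are acknowledged, so they can be replayed.
public sealed class ReplayableIngress
{
    private readonly List<(long Id, int Value)> _unacknowledged = new List<(long Id, int Value)>();
    private long _nextId;

    public (long Id, int Value) Publish(int value)
    {
        (long Id, int Value) message = (_nextId++, value);
        _unacknowledged.Add(message);
        return message;
    }

    // Messages covered by a durable checkpoint no longer need to be kept.
    public void Acknowledge(long upToId) => _unacknowledged.RemoveAll(m => m.Id <= upToId);

    public IReadOnlyList<(long Id, int Value)> Replay() => _unacknowledged.ToArray();
}

public static class ReliabilityDemo
{
    public static (long Sum, int Pending) Run()
    {
        var ingress = new ReplayableIngress();
        var op = new SummingOperator();

        foreach (var v in new[] { 1, 2, 3 })
        {
            var sent = ingress.Publish(v);
            op.OnNext(sent.Id, sent.Value);
        }
        var checkpoint = op.SaveCheckpoint(); // Sum = 6, last id = 2; the ack is never sent

        foreach (var v in new[] { 4, 5 })
        {
            var sent = ingress.Publish(v);
            op.OnNext(sent.Id, sent.Value);   // processed, but not yet checkpointed
        }

        // Simulated crash: in-memory state is lost, so recover from the checkpoint
        // and replay everything that was never acknowledged (ids 0 to 4).
        var recovered = SummingOperator.LoadCheckpoint(checkpoint);
        foreach (var replayed in ingress.Replay())
            recovered.OnNext(replayed.Id, replayed.Value); // ids 0 to 2 are duplicates and skipped

        ingress.Acknowledge(recovered.LastSequenceId);
        return (recovered.Sum, ingress.Replay().Count);
    }
}
```

Because the acknowledgement was lost, the first three messages arrive twice; the sequence check keeps the sum at 15 rather than 21.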

25 Distributed Event Processing
Coordinator distributes query fragments to engines over IRP
Protocol between engines
[Diagram: two Event Processing Execution Engine nodes (Node 1, Node 2), each with Rx and IRP layers and its own checkpoint storage (Save / Load), with reliable messaging (OnNext, Replay, Ack)]

26 Revisiting IObservable<T>
Composition in Rx can be hairy due to race conditions
Subscribe makes the sequence “hot”
Callbacks can happen at any time, even before IDisposable is returned
    public static IObservable<T> Take<T>(this IObservable<T> xs, int count)
    {
        return Create<T>(observer =>
        {
            var remaining = count;
            return xs.Subscribe(x =>
            {
                observer.OnNext(x);
                if (--remaining == 0)
                {
                    observer.OnCompleted();
                    // how to dispose?
                }
            }, …);
        });
    }

27 Revisiting IObservable<T>
Inventions a la SingleAssignmentDisposable (SAD)
    public static IObservable<T> Take<T>(this IObservable<T> xs, int count)
    {
        return Create<T>(observer =>
        {
            var remaining = count;
            var sad = new SingleAssignmentDisposable();
            sad.Disposable = xs.Subscribe(x =>
            {
                observer.OnNext(x);
                if (--remaining == 0)
                {
                    observer.OnCompleted();
                    sad.Dispose();
                }
            }, …);
            return sad;
        });
    }
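Why a single-assignment disposable resolves the race can be shown with a small stand-in: if Dispose is called before the inner handle has been assigned, the handle is disposed at the moment it finally arrives. This is a minimal sketch written for illustration; `MiniSingleAssignmentDisposable` and `ActionDisposable` are hypothetical types and should not be read as the Rx implementation.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the idea behind SingleAssignmentDisposable.
public sealed class MiniSingleAssignmentDisposable : IDisposable
{
    private static readonly IDisposable DisposedMarker = new Sentinel();
    private IDisposable _current; // null means "not assigned yet"

    public IDisposable Disposable
    {
        set
        {
            var old = Interlocked.CompareExchange(ref _current, value, null);
            if (old == null) return;          // normal case: first assignment wins
            if (old == DisposedMarker)
            {
                value?.Dispose();             // disposed earlier: clean up right away
                return;
            }
            throw new InvalidOperationException("Disposable has already been assigned.");
        }
    }

    public void Dispose()
    {
        var old = Interlocked.Exchange(ref _current, DisposedMarker);
        old?.Dispose(); // disposing the marker itself is a no-op
    }

    private sealed class Sentinel : IDisposable { public void Dispose() { } }
}

// Runs an action once when disposed.
public sealed class ActionDisposable : IDisposable
{
    private Action _onDispose;
    public ActionDisposable(Action onDispose) { _onDispose = onDispose; }
    public void Dispose() => Interlocked.Exchange(ref _onDispose, null)?.Invoke();
}

public static class SadDemo
{
    public static bool Run()
    {
        var resourceReleased = false;
        var sad = new MiniSingleAssignmentDisposable();

        // The consumer decides to stop before the subscription handle exists,
        // e.g. because a callback fired synchronously inside Subscribe.
        sad.Dispose();

        // The late assignment is disposed immediately instead of leaking.
        sad.Disposable = new ActionDisposable(() => resourceReleased = true);
        return resourceReleased;
    }
}
```

In the Take operator on this slide, the early Dispose corresponds to the source pushing its last element from inside the Subscribe call.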

28 Revisiting IObservable<T>
Need to be able to traverse the operator tree for various purposes
Provide context to operators, e.g. loggers, resource managers, etc.
Hide schedulers from users by “flowing” them
Visit stateful nodes to persist / load state for checkpointing
Limitations of current Rx
Subscribe combines “Attach” and “Start” lifecycle states
Dispose combines “Stop” and “Detach” lifecycle states
Generalization using ISubscribable<T>
Enable traversal of operator trees
    interface ISubscribable<T>
    {
        ISubscription Subscribe(IObserver<T> observer);
    }
    interface ISubscription : IDisposable
    {
        void Accept(ISubscriptionVisitor v);
    }
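The traversal idea on this slide can be sketched as a visitor walking a chain of subscriptions and collecting state from the stateful nodes. Only `ISubscription.Accept(ISubscriptionVisitor)` comes from the slide; the `Visit` method, the members of `IStatefulOperator`, and the `SourceSubscription`, `TakeSubscription`, and `CheckpointVisitor` classes are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

// The member shapes below are assumptions made for this sketch.
public interface ISubscriptionVisitor
{
    void Visit(ISubscription node);
}

public interface ISubscription : IDisposable
{
    void Accept(ISubscriptionVisitor visitor);
}

public interface IStatefulOperator
{
    string Name { get; }
    void SaveState(IDictionary<string, object> store);
}

// A leaf subscription with no state, standing in for a source.
public sealed class SourceSubscription : ISubscription
{
    public void Accept(ISubscriptionVisitor visitor) => visitor.Visit(this);
    public void Dispose() { }
}

// A Take-like node: remembers how many elements it may still forward.
public sealed class TakeSubscription : ISubscription, IStatefulOperator
{
    private readonly ISubscription _upstream;
    private int _remaining;

    public TakeSubscription(ISubscription upstream, int count)
    {
        _upstream = upstream;
        _remaining = count;
    }

    public string Name => "take";

    public void OnNext() { if (_remaining > 0) _remaining--; }

    public void SaveState(IDictionary<string, object> store) => store[Name] = _remaining;

    public void Accept(ISubscriptionVisitor visitor)
    {
        visitor.Visit(this);        // visit this node first...
        _upstream.Accept(visitor);  // ...then walk towards the sources
    }

    public void Dispose() => _upstream.Dispose();
}

// Collects the state of every stateful node it encounters.
public sealed class CheckpointVisitor : ISubscriptionVisitor
{
    public Dictionary<string, object> Store { get; } = new Dictionary<string, object>();
    public int NodesVisited { get; private set; }

    public void Visit(ISubscription node)
    {
        NodesVisited++;
        if (node is IStatefulOperator stateful) stateful.SaveState(Store);
    }
}

public static class VisitorDemo
{
    public static CheckpointVisitor Run()
    {
        var take = new TakeSubscription(new SourceSubscription(), 5);
        take.OnNext();
        take.OnNext();
        var visitor = new CheckpointVisitor();
        take.Accept(visitor);
        return visitor;
    }
}
```

The same traversal hook could hand loggers or schedulers to each node; only the visitor changes, not the operators.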

29 Revisiting IObservable<T>
Core operator library is built using ISubscribable<T>
    public static ISubscribable<T> Take<T>(this ISubscribable<T> xs, int count)
    {
        return new Take<T>(xs, count);
    }
    class Take<T> : SubscribableBase<T>
    {
        …
        public ISubscription Subscribe(IObserver<T> observer)
        {
            return new Impl(this, observer);
        }
        class Impl : Operator, IStatefulOperator
        { // interfaces for visitors
            protected override void OnStart() { … }
            …
        }
    }

30 Revisiting IScheduler Rx scheduler benefits Abstract over time & allow for virtual time Hide different sources of concurrency Evolving schedulers Physical schedulers Own operating system resources Try to achieve ideal degree of concurrency Compositional approach to logical schedulers Logical child schedulers can be created Unit of pausing used for checkpointing Pause / resume a la GC Ensures stable operator state

31 Revisiting ISubject<T>
What’s the dual of an ISubject<T>?
Notice the variance of the base interfaces…
…but if we flip things around, data cannot “flow”…
Problems
Using an “is-a” relationship versus a “has-a” relationship
Cannot have multiple producers (input observers) to a subject
    interface ISubject<T> : IObservable<T>, IObserver<T> { }
    interface ITcejbus<T> : IEnumerable<T>, IEnumerator<T> { }

32 Revisiting ISubject<T>
Introduction of the IMultiSubject<T>
Provides a way to get many producers
Subject implementation decides on policy to “merge”
Resolves the input/output conundrum
Dual abstraction in the enumerable world makes sense
See Interactive Extensions (Ix)’s IBuffer<T>
Left as an exercise
    interface IMultiSubject<T> : IObservable<T>
    {
        IObserver<T> GetObserver();
    }
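A multi-subject hands out one observer per producer and decides for itself how the inputs are merged. The sketch below assumes one possible policy: forward every value, complete once all producers have completed, and stop on the first error. The interface follows the slide with its type parameter restored; `MergingMultiSubject`, `ListObserver`, and `MultiSubjectDemo` are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;

public interface IMultiSubject<T> : IObservable<T>
{
    IObserver<T> GetObserver();
}

// Hypothetical implementation with a simple merge policy (see lead-in).
public sealed class MergingMultiSubject<T> : IMultiSubject<T>
{
    private readonly object _gate = new object();
    private readonly List<IObserver<T>> _consumers = new List<IObserver<T>>();
    private int _activeProducers;
    private bool _stopped;

    public IDisposable Subscribe(IObserver<T> observer)
    {
        lock (_gate) _consumers.Add(observer);
        return new Unsubscriber(this, observer);
    }

    public IObserver<T> GetObserver()
    {
        lock (_gate) _activeProducers++;
        return new Producer(this);
    }

    private sealed class Producer : IObserver<T>
    {
        private readonly MergingMultiSubject<T> _parent;
        private bool _done;

        public Producer(MergingMultiSubject<T> parent) { _parent = parent; }

        public void OnNext(T value)
        {
            lock (_parent._gate)
            {
                if (_done || _parent._stopped) return;
                foreach (var c in _parent._consumers.ToArray()) c.OnNext(value);
            }
        }

        public void OnError(Exception error)
        {
            lock (_parent._gate)
            {
                if (_done || _parent._stopped) return;
                _done = _parent._stopped = true; // first error stops the subject
                foreach (var c in _parent._consumers.ToArray()) c.OnError(error);
            }
        }

        public void OnCompleted()
        {
            lock (_parent._gate)
            {
                if (_done || _parent._stopped) return;
                _done = true;
                if (--_parent._activeProducers > 0) return; // others still active
                _parent._stopped = true;
                foreach (var c in _parent._consumers.ToArray()) c.OnCompleted();
            }
        }
    }

    private sealed class Unsubscriber : IDisposable
    {
        private readonly MergingMultiSubject<T> _parent;
        private readonly IObserver<T> _observer;

        public Unsubscriber(MergingMultiSubject<T> parent, IObserver<T> observer)
        {
            _parent = parent;
            _observer = observer;
        }

        public void Dispose() { lock (_parent._gate) _parent._consumers.Remove(_observer); }
    }
}

public sealed class ListObserver<T> : IObserver<T>
{
    public List<string> Log { get; } = new List<string>();
    public void OnNext(T value) => Log.Add("" + value);
    public void OnError(Exception error) => Log.Add("error");
    public void OnCompleted() => Log.Add("completed");
}

public static class MultiSubjectDemo
{
    public static List<string> Run()
    {
        var subject = new MergingMultiSubject<int>();
        var consumer = new ListObserver<int>();
        using (subject.Subscribe(consumer))
        {
            var p1 = subject.GetObserver();
            var p2 = subject.GetObserver();
            p1.OnNext(1);
            p2.OnNext(2);
            p1.OnCompleted(); // one producer is still active: no completion yet
            p2.OnNext(3);
            p2.OnCompleted(); // last producer done: consumers complete
        }
        return consumer.Log;
    }
}
```

Because the subject exposes observers through a method ("has-a") instead of implementing `IObserver<T>` itself ("is-a"), each producer gets its own completion state.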

33 The IRP Programming Model Super-symmetric designs

34 A Hyper-Triad of Interfaces IReactiveProcessing provides Proxies – source of composition of artifacts (DML) Definitions – enables definition of artifacts (DDL) Metadata – catalog of artifacts for discoverability Super-symmetric design Same interface exists in many “spaces” Hyper-cube of parallel worlds Synchronous versus asynchronous Extrinsic versus intrinsic identities Code-as-data versus code Reliable versus non-reliable Etc. Sync Async Intrinsic Extrinsic Code Data Higher dimensions Calabi-Yau manifold

35 A Hyper-Triad of Interfaces IReactiveProcessing provides Proxies – source of composition of artifacts (DML) Definitions – enables definition of artifacts (DDL) Metadata – catalog of artifacts for discoverability Super-symmetric design Same interface exists in many “spaces” Hyper-cube of parallel worlds Synchronous versus asynchronous Extrinsic versus intrinsic identities Code-as-data versus code Reliable versus non-reliable Etc. Async Extrinsic Data E.g. this is the client-side API Need string theory Higher dimensions Calabi-Yau manifold

36 Enabling Composition through Proxies
It all starts with a logical “context”
Similar to LINQ to SQL’s DataContext
Parameterized by a physical connection
To a cloud service, or to a device, etc.
Deals with authentication, various knobs, etc.
Family of Get* methods to obtain proxies to artifacts
Observables, observers, etc.
    var conn = new ReactiveServiceConnection(endpoint);
    var ctx = new ClientContext(conn);
    var traffic = ctx.GetObservable<…>(trafficUri);
    var http = ctx.GetObserver<HttpData>(httpUri);
Explicit identifiers provided for artifacts

37 Enabling Composition through Proxies
Get* methods do not go to the IRP system
Just obtain local proxies to remote artifacts
Artifacts have extrinsic identifiers, specified as URIs
Async methods go to the IRP system
E.g. SubscribeAsync, OnNextAsync, etc.
Example
    var traffic = ctx.GetObservable<…>(trafficUri);
    var http = ctx.GetObserver<HttpData>(httpUri);
    var subscription = await traffic.Where(t => t.Road == "I-90")
                                    .Select(t => new HttpData { Uri = myService, Body = t.ToString() })
                                    .SubscribeAsync(http, mySubUri);
Explicit identifiers provided for artifacts

38 Enabling Composition through Proxies What happens when SubscribeAsync is called? User intent is obtained from expression tree IAsyncReactiveQbservable IAsyncReactiveQbserver Etc. Expression tree gets serialized over the wire Normalization of the tree into invocation expressions Language agnostic format (interoperable with weakly typed languages like JavaScript) Dependencies on concrete runtimes a la CLR get stripped out (no type deployment) Payload is sent to target IRP system

39 Enabling Composition through Proxies
    var traffic = ctx.GetObservable<…>(trafficUri);
    var http = ctx.GetObserver<HttpData>(httpUri);
    var subscription = await traffic.Where(t => t.Road == "I-90")
                                    .Select(t => new HttpData { Uri = …, Body = … })
                                    .SubscribeAsync(http, mySubUri);
Resulting expression tree:
    Invoke rx://subscribe
        Invoke rx://select
            Invoke rx://where
                trafficUri
                λ t => t.Road == "I-90"
            λ t => new { bing://http/uri = …, bing://http/body = … }
        httpUri
Unbound parameters processed by binder
Structural typing to reduce coupling
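The normalization into invocation expressions shown on this slide can be approximated with the standard expression tree API. The sketch below is an illustration only: it uses LINQ to Objects operators as stand-ins for the remote ones, and `InvocationNormalizer`, its name-to-URI table, and the textual output format are assumptions made for this example (only the `rx://where` and `rx://select` identifiers come from the slide).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class InvocationNormalizer
{
    // Hypothetical mapping from method names to artifact URIs.
    private static readonly Dictionary<string, string> Uris = new Dictionary<string, string>
    {
        { "Where", "rx://where" },
        { "Select", "rx://select" },
    };

    // Rewrites known operator calls into "Invoke(uri, args...)" text form.
    public static string Normalize(Expression e)
    {
        switch (e)
        {
            case MethodCallExpression call when Uris.TryGetValue(call.Method.Name, out var uri):
                var args = new List<string>();
                foreach (var a in call.Arguments) args.Add(Normalize(a));
                return "Invoke(" + uri + ", " + string.Join(", ", args) + ")";
            case UnaryExpression quote when quote.NodeType == ExpressionType.Quote:
                return Normalize(quote.Operand); // unwrap quoted lambdas
            case ParameterExpression p:
                return p.Name;                   // unbound parameter, left for a binder
            default:
                return e.ToString();             // lambdas and everything else as-is
        }
    }

    public static string Demo()
    {
        // Extension method calls appear in the tree as static calls whose first
        // argument is the source, which is what makes this rewrite uniform.
        Expression<Func<IEnumerable<int>, IEnumerable<int>>> query =
            xs => xs.Where(x => x > 0).Select(x => x * x);
        return Normalize(query.Body);
    }
}
```

After this step no method references remain in the output, only identifiers and lambdas, which is the property that makes the payload independent of a concrete runtime.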

40 Enabling Abstraction through Definitions
Define* methods allow for definition of artifacts
Observables, observers, etc.
Much like stored procedures, user-defined functions, etc.
Also identified by URIs
    var traffic = ctx.GetObservable<…>(trafficUri);
    // Define
    await ctx.DefineObservableAsync<…>(
        trafficByRoadUri,
        road => traffic.Where(t => t.Road == road));
    // Proxies
    var trafficByRoad = ctx.GetObservable<…>(trafficByRoadUri);
    trafficByRoad("I-90").…

41 Rich Mapping Capabilities
No built-in artifacts
Query operators are parameterized observables
Extension methods provide an illusion
This fluent pattern…
    var xs = ctx.GetObservable<int>(xsUri);
    var iv = ctx.GetObserver<int>(ivUri);
    await xs.Where(x => x > 0).Select(x => x * x).SubscribeAsync(iv, …);
… is the same as
    var xs = ctx.GetObservable<int>(xsUri);
    var iv = ctx.GetObserver<int>(ivUri);
    var where = ctx.GetObservable<…, Expr<Func<int, bool>>, int>(whereUri);
    var select = ctx.GetObservable<…, Expr<Func<int, int>>, int>(selectUri);
    await select(where(xs, x => x > 0), x => x * x).SubscribeAsync(iv, …);

42 Rich Mapping Capabilities
[KnownResource] attributes can be used everywhere
Provide shorthand syntax for Get* operations
Enable code generation using the Metadata facilities
What’s IAsyncReactiveQbservable<T>?
Async because it’s client-side (cf. SubscribeAsync)
Reactive because we had to disambiguate with Rx concepts
Qbservable because it’s expression tree based
    static class AsyncReactiveQbservable
    {
        [KnownResource(whereUri)]
        static IAsyncReactiveQbservable<T> Where<T>(this IAsyncReactiveQbservable<T> xs, Expression<Func<T, bool>> filter) {…}
    }

43 Enabling Discovery through Metadata
Series of properties for different artifact types
Observables, observers, subscriptions, etc.
IQueryableDictionary allows for structured (LINQ) queries
Enables tooling
“RxMetal” a la “SqlMetal” to generate a derived ClientContext
Artifact explorers a la Server Explorer in Visual Studio
Enables delegation
IRP systems can discover capabilities across boundaries
    var traffic = ctx.Observables[trafficUri];
    var operators = ctx.Observables.Where(kv => kv.Key.StartsWith("rx://"));

44 Standard Model of Event Processing
What are the reactive artifact types?
Division based on temperature
Cold = potential energy; to be instantiated
Hot = kinetic energy; active artifacts
Kinds of artifacts familiar from Rx
Cold artifacts can be parameterized
    Cold                 Operation         Hot
    Observable           SubscribeAsync    Subscription
    (Our Higgs boson)    CreateAsync       Observer
    Subject factory      CreateAsync       Subject
Standard Model

45 Demo Quick example of using IRP

46 What’s Next? Starting the conversation on Rx v3.0

47 Rx v3.0 Peeling the onion of our system Many IRP implementations in Bing But “IRP Core” is… Built in a reusable fashion Strict separation of logical and physical What, when, where is TBD Operator library with checkpointing etc. Service abstractions Data model and data / expression serialization stacks Query engine components Give us feedback! Service Host Data model Query engine Operators

48 Q&A Thanks!

