Presentation is loading. Please wait.

Presentation is loading. Please wait.

Cloud-Scale Event Processing Using Rx

Similar presentations


Presentation on theme: "Cloud-Scale Event Processing Using Rx"— Presentation transcript:

1 Cloud-Scale Event Processing Using Rx
Bart J.F. De Smet Principal Software Engineer Microsoft Corporation

2 Who, Where, When? Born in Belgium Joined Microsoft in 2007
Computer Science Engineering at University of Ghent In the “civil” engineering department Most Valuable Professional (MVP) for C# Joined Microsoft in 2007 .NET Framework / WPF team member ( ) Working on Application Model SQL Division ( ) Working on Reactive Extensions with Erik Meijer et al Online Services Division (2012-…) Cloud-scale event processing using Rx, powering Cortana etc. Belgian beers (really) Exiled to Building 10

3 Who, Where, When? Author Speaker Physics lover
C# Unleashed series at SAMS Pluralsight courses (pluralsight.com) Speaker TechEd, TechDays, //build/, PDC, etc. Channel 9 videos (channel9.msdn.com) Goto, JAOO, etc. Physics lover Beauty of dualities, symmetries, etc. Power of diagrams, formalism, etc.

4 Reactive Extensions What, why, how?

5 An Accidental Discovery
Cloud Programmability Team “Oasis” within Live Labs and later the SQL Server organization Founded by Erik Meijer and Brian Beckman Code-named “Tesla” after Nikola Tesla (the electric car didn’t exist yet) Founded in the mid 2000s Ray Ozzie ages Making sense of this new thing called “cloud” Various projects IL2JS – a compiler from IL to JavaScript Extension to JavaScript with classes, modules, types Embarrassingly distributed build system for the cloud Reactive Extensions aka “LINQ to Events” Nikola Tesla

6 An Accidental Discovery
Project “Volta” Tier-splitting of applications Write as single-tier .NET application using metadata annotations Attributes like [RunOnClient] Cross-compilation of code to match client capabilities Desktop CLR or Silverlight when available IL to JavaScript when necessary Even compiling Windows Forms controls to HTML No promises or futures The world of Begin/End where Task<T> had yet to be invented async/await was unheard of (C# 5.0) But the web is asynchronous…

7 An Accidental Discovery
Project “Volta” Dealing with asynchrony across tiers Ultimately needs to cross-compile to AJAX Events are not first-class objects Can’t transport them across tiers An electric eel… [RunOnClient] public event EventHandler<MouseEventArgs> MouseMoved; // Runs in cloud public void CloudCanvas() { MouseMoved += (o, e) => { /* do stuff */ }; }

8 Making Events First-Class
Making Events First-Class First-class concepts have object representations Methods can be transported using delegates But properties, indexers, and events are metadata citizens Action a = new Action(Foo); // explicit creation of delegate instance Action b = Foo; // method group conversion Action c = () => { … }; // creates anonymous method void Foo() { … } event Action Bar // metadata that refers to … { add { … } // add accessor remove { … } // remove accessor }

9 Fundamental Abstractions
Adapting the observer pattern Ensuring duality with the enumerator pattern More compositional approach “Gang of four” book Addison-Wesley interface IObservable<out T> { IDisposable Subscribe(IObserver<T> observer); } interface IObserver<in T> void OnNext(T value); void OnError(Exception error); void OnCompleted(); Notification grammar OnNext* (OnError | OnCompleted)?

10 Highly Compositional Function composition LINQ-style query operators over IObservable<T> Composition of 0-N input sequences Composition of disposable subscriptions and scheduler resources static class Observable { static IObservable<T> Where<T>(this IObservable<T> source, Func<T, bool> f); static IObservable<R> Select<T, R>(this IObservable<T> source, Func<T, R> p); } interface IScheduler { IDisposable Schedule(Action work); }

11 Highly Compositional Building a binary Merge operator
Function composition Building a binary Merge operator First-class: can build extension methods public static class Observable { public static IObservable<T> Merge<T>(this IObservable<T> xs, IObservable<T> ys) return Create<T>(observer => var gate = new object(); return new CompositeDisposable xs.Subscribe(x => { lock (gate) { observer.OnNext(x); } }, …), ys.Subscribe(y => { lock (gate) { observer.OnNext(y); } }, …), }; }); } Composition of resource management

12 Space-time www.Wikipedia.org
The Role of Schedulers Space-time Pure architectural layering of the system Logical query operators (~ relational engine) Physical schedulers (~ operating system) Abstract over sources of asynchrony and time Threads, thread pools, tasks, message loops DateTime.Now, timers Enable virtual time testing public static IObservable<T> Return<T>(T value, IScheduler scheduler) { return Create<T>(obs => scheduler.Schedule(() => { obs.OnNext(value); obs.OnCompleted(); })); }

13 The Beauty of Duality Category theory to the rescue (Bierman, Meijer)
Category theory to the rescue (Bierman, Meijer) Observable/observer (push) is dual to enumerable/enumerator (pull) Cross-influence of both domains interface IObservable<out T> { IDisposable Subscribe(IObserver<T> observer); } interface IObserver<in T> void OnNext(T value); void OnError(Exception error); void OnCompleted(); interface IEnumerable<out T> { IEnumerator<T> GetEnumerator(); } interface IEnumerator<out T> : IDisposable bool MoveNext() throws Exception; T Current { get; } void Reset();

14 IEnumerable<T> xs IObservable<T> xs
One versus Many When Task and Task<T> were born… Single-value specializations Await-ability of sequences (for aggregates) Stephen Kleene Kleene star (closure) Synchronous Asynchronous Func<T> f var x = wait f(); Task<T> t var x = await t; One IEnumerable<T> xs foreach (var x in xs) { f(x); } IObservable<T> xs xs.Subscribe(x => { f(x); }); Many

15 Customizable Execution Plans
Alan Kay Homoiconicity, 1969 Capturing user intent using expression trees Code-as-data Homoiconicity = same syntax E.g. LINQ to Twitter Pull Push IEnumerable<T> xs from x in xs where f(x) … IObservable<T> xs from x in xs where f(x) … Code IQueryable<T> xs from x in xs where f(x) … IQbservable<T> xs from x in xs where f(x) … Data

16 Taking Rx to the Cloud Large-scale distributed event processing in Bing

17 “Cloud First, Mobile First”
Satya Nadella “Cloud-first, mobile-first” Events are all around us Classic frameworks for UI programming Sensors in phones, IoT scenarios, etc. Monitoring of systems and cloud infrastructures Changes to the world’s information Building a scalable abstraction that allows for: Hiding of concrete implementations Distributed and intelligent execution Compute close to data

18 Going Where the Data is <3 New reactive event processing effort
Founded in Bing around 2012 Standing on the shoulders of (relevant) giants Massive amounts of data available Powering scenarios for “Cortana” Track updates to flights, weather, sport scores, news etc. Remind me when to leave based on traffic conditions Cloud and device Capturing the world’s information as real-time streams Abstracting device sensors (GPS etc.) as streams

19 Observations on Cloud and Devices
Optimization for resources CPU + memory = power Density of computation in the cloud (# of standing queries / machine) Affordability as background service on devices with < 1GB of RAM Reliability of computations Device-side services Subject to tombstoning (cf. application lifecycle) Can run out of battery Cloud-side services Outages of compute nodes are the way of life Deployment causes intentional failover of services Richard Feynman Westview Press

20 Some (Simplified) Cortana Queries
from w in weather where w.City == “Seattle” select w flights .Where(f => f.Code == “BA49”) .DelaySubscription(departure – 24 * H) .TakeUntil(arrival + 24 * H) .Select(f => f.Status) Temporal query operators userLocation .Sample(1 * M) .DistinctUntilChanged() .Select(here => traffic(here, meetingLocation) .TakeUntil(meeting – minimumTime) .Select(t => Timer(meeting – t.EstimatedTime)) .StartWith(Timer(meeting – estimatedTime)) .Switch()) .Switch() .Select(_ => “Leave now for ” + subject) .DelaySubscription(meeting – estimatedTime) Device-side event stream Cloud-side event stream // Insert cloud-side observable sequence // of time-to-leave timer here Remember tier-splitting? Higher-order query operators

21 High-Level Overview IReactiveProcessing (IRP) abstraction
Representation of an event processing service Highly interoperable between devices and services Russian doll / turtles all the way down model Partner1 cloud IRP Bing cloud IRP Node IRP Phone IRP Tablet IRP Partner2 cloud IRP

22 Scaling the Abstractions
Lessons learned from Rx Enhanced phasing of subscription lifecycle Turning subjects into streams Building the “Standard Model” of event processing Addressing distributed system concerns Identification of artifacts Latencies and asynchrony Reliability of computation and messages Deployment friendliness Elasticity and dynamic topologies Management of system resources

23 Adapting Rx Building an execution engine

24 Event Processing Execution Engine
How to leverage Rx? Rich library of query operators, including temporal ones Additional requirements Persist and recovery state (e.g. aggregates) High density (> millions of subscriptions per machine) Don’t miss input events, even when node is down Scalable egress (e.g. notification platforms) Reliability approach Periodic checkpointing of state to persistent storage Acknowledge / replay of ingress messages At-least-once processing guarantees De-duplication can be layered on top

25 Distributed Event Processing
Coordinator distributes query fragments to engines over IRP Protocol between engines Event Processing Execution Engine Node1 Rx IRP Event Processing Execution Engine Node2 Rx IRP Reliable messaging OnNext Replay Ack Checkpoint storage Save Load Checkpoint storage Save Load

26 Revisiting IObservable<T>
Composition in Rx can be hairy due to race conditions Subscribe makes the sequence “hot” Callbacks can happen at any time, even before IDisposable is returned public static IObservable<T> Take<T>(this IObservable<T> xs, int count) { return Create<T>(observer => { var remaining = count; return xs.Subscribe(x => { observer.OnNext(x); if (--remaining == 0) { observer.OnCompleted(); // how to dispose? } }, …); });

27 Revisiting IObservable<T>
Inventions a la SingleAssignmentDisposable (SAD) public static IObservable<T> Take<T>(this IObservable<T> xs, int count) { return Create<T>(observer => { var remaining = count; var  = new SingleAssignmentDisposable(); .Disposable = xs.Subscribe(x => { observer.OnNext(x); if (--remaining == 0) { observer.OnCompleted(); .Dispose(); } }, …); return ; });

28 Revisiting IObservable<T>
Need to be able to traverse the operator tree for various purposes Provide context to operators, e.g. loggers, resource managers, etc. Hide schedulers from users by “flowing” them Visit stateful nodes to persist / load state for checkpointing Limitations of current Rx Subscribe combines “Attach” and “Start” lifecycle states Dispose combines “Stop” and “Detach” lifecycle state Generalization using ISubscribable<T> Enable traversal of operator trees interface ISubscribable<out T> { ISubscription Subscribe(IObserver<T> observer); } interface Isubscription : IDisposable { void Accept(ISubscriptionVisitor v); }

29 Revisiting IObservable<T>
Core operator library is built using ISubscribable<T> public static ISubscribable<T> Take<T>(this ISubscribable<T> xs, int count) { return new Take<T>(xs, count); } class Take<T> : SubscribableBase<T> { public ISubscription Subscribe(IObserver<T> observer) { return new Impl(this, observer); class Impl : Operator, IStatefulOperator { // interfaces for visitors protected override void OnStart() { … }

30 Revisiting IScheduler
Rx scheduler benefits Abstract over time & allow for virtual time Hide different sources of concurrency Evolving schedulers Physical schedulers Own operating system resources Try to achieve ideal degree of concurrency Compositional approach to logical schedulers Logical child schedulers can be created Unit of pausing used for checkpointing Pause / resume a la GC Ensures stable operator state

31 Revisiting ISubject<T>
What’s the dual of an ISubject<T>? Notice the variance of the base interfaces… …but if we flip things around, data cannot “flow”… Problems Using an “is-a” relationship versus a “has-a” relationship Cannot have multiple producers (input observers) to a subject interface ISubject<T> : IObservable<out T>, IObserver<in T> } interface ITcejbus<T> : IEnumerable<out T>, IEnumerator<out T> }

32 Revisiting ISubject<T>
Introduction of the IMultiSubject<T> Provides a way to get many producers Subject implementation decides on policy to “merge” Resolves the input/output conundrum Dual abstraction in the enumerable world makes sense See Interactive Extensions (Ix)’s IBuffer<T> Left as an exercise interface IMultiSubject<T> : IObservable<out T> { IObserver<T> GetObserver(); }

33 The IRP Programming Model
Super-symmetric designs

34 A Hyper-Triad of Interfaces
Higher dimensions Calabi-Yau manifold IReactiveProcessing provides Proxies – source of composition of artifacts (DML) Definitions – enables definition of artifacts (DDL) Metadata – catalog of artifacts for discoverability Super-symmetric design Same interface exists in many “spaces” Hyper-cube of parallel worlds Synchronous versus asynchronous Extrinsic versus intrinsic identities Code-as-data versus code Reliable versus non-reliable Etc. Data Code Sync Async Intrinsic Extrinsic

35 A Hyper-Triad of Interfaces
Higher dimensions Calabi-Yau manifold IReactiveProcessing provides Proxies – source of composition of artifacts (DML) Definitions – enables definition of artifacts (DDL) Metadata – catalog of artifacts for discoverability Super-symmetric design Same interface exists in many “spaces” Hyper-cube of parallel worlds Synchronous versus asynchronous Extrinsic versus intrinsic identities Code-as-data versus code Reliable versus non-reliable Etc. E.g. this is the client-side API Data Extrinsic Need string theory  Async

36 Enabling Composition through Proxies
It all starts with a logical “context” Similar to LINQ to SQL’s DataContext Parameterized by a physical connection To a cloud service, or to a device, etc. Deals with authentication, various knobs, etc. Family of Get* methods to obtain proxies to artifacts Observables, observers, etc. var conn = new ReactiveServiceConnection(endpoint); var ctx = new ClientContext(conn); Explicit identifiers provided for artifacts var traffic = ctx.GetObservable<TrafficInfo>(trafficUri); var http = ctx.GetObserver<HttpData>(httpUri);

37 Enabling Composition through Proxies
Get* methods do not go to the IRP system Just obtain local proxies to remote artifacts Artifacts have extrinsic identifiers, specified as URIs Async methods go to the IRP system E.g. SubscribeAsync, OnNextAsync, etc. Example Explicit identifiers provided for artifacts var traffic = ctx.GetObservable<TrafficInfo>(trafficUri); var http = ctx.GetObserver<HttpData>(httpUri); var subscription = await traffic.Where(t => t.Road == “I-90”) .Select(t => new HttpData { Uri = myService, Body = t.ToString() }) .SubscribeAsync(http, mySubUri);

38 Enabling Composition through Proxies
What happens when SubscribeAsync is called? User intent is obtained from expression tree IAsyncReactiveQbservable<T> IAsyncReactiveQbserver<T> Etc. Expression tree gets serialized over the wire Normalization of the tree into invocation expressions Language agnostic format (interoperable with weakly typed languages like JavaScript) Dependencies on concrete runtimes a la CLR get stripped out (no type deployment) Payload is sent to target IRP system

39 Enabling Composition through Proxies
var traffic = ctx.GetObservable<TrafficInfo>(trafficUri); var http = ctx.GetObserver<HttpData>(httpUri); var subscription = await traffic.Where(t => t.Road == “I-90”) .Select(t => new HttpData { Uri = …, Body = … }) .SubscribeAsync(http, mySubUri); Unbound parameters processed by binder Invoke Structural typing to reduce coupling rx://subscribe Invoke httpUri t => new { bing://http/uri = …, bing://http/body = … } rx://select Invoke λ rx://where trafficUri λ t => t.Road == “I-90”

40 Enabling Abstraction through Definitions
Define* methods allow for definition of artifacts Observables, observers, etc. Much like stored procedures, user-defined functions, etc. Also identified by URIs var traffic = ctx.GetObservable<TrafficInfo>(trafficUri); // Define await ctx.DefineObservableAsync<string, TrafficInfo>( trafficByRoadUri, road => traffic.Where(t => t.Road == road)); // Proxies var trafficByRoad = ctx.GetObservable<string, TrafficInfo>(trafficByRoadUri); trafficByRoad(“I-90”).…

41 Rich Mapping Capabilities
No built-in artifacts Query operators are parameterized observables Extension methods provide an illusion This fluent pattern… … is the same as var xs = ctx.GetObservable<int>(xsUri); var iv = ctx.GetObserver<int>(ivUri); await xs.Where(x => x > 0).Select(x => x * x).SubscribeAsync(iv, …); var xs = ctx.GetObservable<int>(xsUri); var iv = ctx.GetObserver<int>(ivUri); var where = ctx.GetObservable<IARQ<int>, Expr<Func<int, bool>>, int>(whereUri); var select = ctx.GetObservable<IARQ<int>, Expr<Func<int, int>>, int>(selectUri); await select(where(xs, x => x > 0), x => x * x).SubscribeAsync(iv, …);

42 Rich Mapping Capabilities
[KnownResource] attributes can be used everywhere Provide shorthand syntax for Get* operations Enable code generation using the Metadata facilities What’s IAsyncReactiveQbservable<T>? Async because it’s client-side (cf. SubscribeAsync) Reactive because we had to disambiguate with Rx concepts Qbservable because it’s expression tree based static class AsyncReactiveQbservable { [KnownResource(whereUri)] static IAsyncReactiveQbservable<T> Where<T>(this IAsyncReactiveQbservable<T> xs, Expression<Func<T, bool>> filter) {…} }

43 Enabling Discovery through Metadata
Series of properties for different artifact types Observables, observers, subscriptions, etc. IQueryableDictionary<Uri, TEntity> allows for structured (LINQ) queries Enables tooling “RxMetal” a la “SqlMetal” to generate a derived ClientContext Artifact explorers a la Server Explorer in Visual Studio Enables delegation IRP systems can discover capabilities across boundaries var traffic = ctx.Observables[trafficUri]; var operators = ctx.Observables.Where(kv => kv.Key.StartsWith(“rx://”));

44 Standard Model of Event Processing
What are the reactive artifact types? Division based on temperature Cold = potential energy; to be instantiated Hot = kinetic energy; active artifacts Kinds of artifacts familiar from Rx Cold artifacts can be parameterized Standard Model Cold Operation Hot Observable SubscribeAsync Subscription (Our Higgs boson) CreateAsync Observer Subject factory Subject

45 Demo Quick example of using IRP

46 What’s Next? Starting the conversation on Rx v3.0

47 Rx v3.0 Peeling the onion of our system What, when, where is TBD
Service Host Data model Query engine Operators Peeling the onion of our system Many IRP implementations in Bing But “IRP Core” is… Built in a reusable fashion Strict separation of logical and physical What, when, where is TBD Operator library with checkpointing etc. Service abstractions Data model and data / expression serialization stacks Query engine components Give us feedback!

48 Q&A Thanks!


Download ppt "Cloud-Scale Event Processing Using Rx"

Similar presentations


Ads by Google