SYSTEMS SUPPORT FOR GRAPHICAL LEARNING Ken Birman 1 CS6410 Fall 2014 9/18/2014.

1 SYSTEMS SUPPORT FOR GRAPHICAL LEARNING Ken Birman 1 CS6410 Fall 2014 9/18/2014

2 Graphical models and applications CS5412 Spring 2014 (Cloud Computing: Birman) 2
• Artificial intelligence and machine learning are core technologies in many modern cloud settings
  • Support for social networking mechanisms
  • Creating product placement recommendations
  • Understanding the flow of “influence” within communities
• Graphical processing can also matter in systems
  • Understanding what to cache and what not to cache
  • Learning common patterns to optimize

3 What makes this hard?
• Prior generation of solutions was too general
  • Programming languages can do anything, but they aren’t at all specialized for graph-structured data
  • Database systems are awesome for tabular data but much less optimized for graphical data
• There is also an issue of scale
  • We’re good at what can be done on one computer
  • But a company like Facebook has over a billion users, and its infrastructure runs on massive data centers

4 Today’s papers
• TAO paper (I’ll start with this) gives a sense of the challenge Facebook confronts
  • Like an entire distributed operating system
  • But the whole role of the solution is to manage graphical data and support queries against it
  • Massive loads and surreal scale
• Things to notice?
  • How does the architecture of the solution reflect the special environment in which it runs?
  • How did they identify and optimize the critical paths?

5 Dryad/LINQ
• Here we see two concepts combined
• At Microsoft, LINQ has become very popular
  • It embeds a kind of query processing into C# code
• Dryad takes this one step further
  • Given a LINQ expression, Dryad can run it on a distributed “computing engine” of their own design
  • Idea is to obtain massive parallelism

6 Basic LINQ concepts
• LINQ (“Language-Integrated Query”) starts by allowing you to code lambda expressions
  • In-line functions
  • Evaluated when the value is needed, not when defined
• For example:
    myPets.Select(a => a.name);
    myFriends.Select(f => (f.name, f.loc, f.phone.mobile))
             .Where(f => distance(myloc, f.loc) < 1 /* mile */);
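The "evaluated when needed, not when defined" point can be sketched in Python, whose generator expressions are lazy in a similar way. This is only an analogue of LINQ's deferred execution, not LINQ itself. The `Friend` class, the `miles_away` field and the `is_nearby` helper are hypothetical names invented for the illustration.

```python
from dataclasses import dataclass

@dataclass
class Friend:
    name: str
    miles_away: float  # hypothetical precomputed distance from "me"

calls = []  # records every time the predicate actually runs

def is_nearby(f):
    calls.append(f.name)
    return f.miles_away < 1.0

friends = [Friend("Ann", 0.4), Friend("Bob", 3.2), Friend("Cy", 0.9)]

# Defining the query does not run the predicate: the generator is lazy.
query = (f.name for f in friends if is_nearby(f))
calls_before = list(calls)  # still empty at this point

# Enumerating the query is what triggers evaluation.
nearby = list(query)
```

Because the query is just a description until it is enumerated, a system can inspect or re-plan it before running it.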

7 How Dryad works
• Takes a LINQ expression, unevaluated
• Maps it to a collection of processor nodes that all have access to the same (read-only, unchanging) data files
• This spreads out the work and gains parallelism!
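A minimal sketch of the idea, assuming the data is already split into read-only partitions: the same query runs over every partition in parallel and the partial results are merged. Threads stand in for cluster nodes here, and `run_partitioned` and `long_names` are hypothetical names. Dryad's real scheduler and dataflow graph are far more elaborate.

```python
from concurrent.futures import ThreadPoolExecutor

def run_partitioned(query, partitions, workers=4):
    """Apply the same query to every read-only partition, then merge."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the partitions
        partial_results = list(pool.map(query, partitions))
    return [row for part in partial_results for row in part]

def long_names(rows):
    # The "query": a pure function of its input partition
    return [r for r in rows if len(r) > 3]

# Tuples model the unchanging input files
partitions = [("Ann", "Robert"), ("Cy", "Dinah"), ("Edward",)]
result = run_partitioned(long_names, partitions)
```

Since the inputs never change and the query has no side effects, the partitions can be processed in any order or re-run after a failure without changing the answer.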

8 Basic architecture of Dryad

9 Execution of a LINQ expression

10 A join, done in two ways

11 A join, done in two ways

12 MapReduce in Dryad/LINQ
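The slide body did not survive in this transcript. As a rough sketch, MapReduce can be written as three query operators in sequence: a flattening map (LINQ's SelectMany), a grouping by key (GroupBy), and a per-group reduction (Select). The Python below imitates that shape on a single machine. The function and parameter names are assumptions made for the illustration, not the DryadLINQ API.

```python
from collections import defaultdict

def map_reduce(source, mapper, key_of, reducer):
    # SelectMany: each input record may emit several intermediate records
    mapped = (out for rec in source for out in mapper(rec))
    # GroupBy: collect intermediate records under their key
    groups = defaultdict(list)
    for item in mapped:
        groups[key_of(item)].append(item)
    # Select: reduce each group to a single result
    return [reducer(key, items) for key, items in groups.items()]

lines = ["a b a", "b c"]
counts = map_reduce(
    lines,
    mapper=lambda line: line.split(),
    key_of=lambda word: word,
    reducer=lambda word, items: (word, len(items)),
)
```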

13 Beyond Dryad
• In follow-on work, researchers at Microsoft built a system called Naiad…
• In that paper, they assert that social networking often comes down to finding fixed points of functions on graphs
  • For example, “look for poker players who are physically within a mile of me and are friends of me or one of my friends”

14 Social network computations
• They believe that most parallel social networking computations can be re-expressed as fixed points
• In essence, define a function f(S) for a set S, then repeatedly replace S with f(S) until f(S) = S. This is the fixed point.
• They want to compute all the fixed points concurrently for some very large community
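A small sketch of the fixed-point iteration, assuming the function only ever adds members drawn from a finite population, so the loop must terminate. The `friends` graph and the `expand` function are made up for the example.

```python
def fixed_point(f, seed):
    """Replace S with f(S) until f(S) == S, then return S."""
    s = frozenset(seed)
    while True:
        nxt = frozenset(f(s))
        if nxt == s:
            return s
        s = nxt

# Hypothetical friendship graph: person -> set of direct friends
friends = {"me": {"ann"}, "ann": {"bob"}, "bob": {"ann"}, "zed": {"me"}}

def expand(s):
    # f(S): S plus the direct friends of everyone already in S
    return s | {f for person in s for f in friends.get(person, ())}

reachable = fixed_point(expand, {"me"})
```

Here the fixed point is everyone reachable from "me" through friend links: "zed" lists "me" as a friend, but nobody in the set lists "zed", so "zed" is never added.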

15 Can we really find use cases?
• All the vehicles on Highway 101 need to continuously “watch for the vehicles that could cut me off if they change path”
• Define this indirectly too: if truck T changes its trajectory this way, car C might move that way, and then C would cut me off, so include T into the set…
• The idea is to do all such computations at once!

16 Naiad and Dryad
• Then they map Naiad onto Dryad
  • First write functions that compute these sets
  • Next express the fixed-point property over functions
  • Last, seed the data set and then run Dryad to iterate until all the fixed points are found (or until a time-limit is reached, to cover non-convergent functions)
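The last step, iterating until either a fixed point or a time limit, might look like the following single-machine sketch. The name `iterate_until`, its return convention, and the two test functions `grow` and `flip` are assumptions for illustration. The real system runs many such iterations in parallel.

```python
import time

def iterate_until(f, seed, time_limit_s):
    """Iterate S <- f(S); return (S, converged) at a fixed point or deadline."""
    deadline = time.monotonic() + time_limit_s
    s = frozenset(seed)
    while time.monotonic() < deadline:
        nxt = frozenset(f(s))
        if nxt == s:
            return s, True   # fixed point found
        s = nxt
    return s, False          # gave up: f may not converge

# Converges: adds one integer per step, capped at 5
grow = lambda s: s | {min(max(s) + 1, 5)}
# Never converges: oscillates between {0} and {1}
flip = lambda s: {1} if s == {0} else {0}

grown, grew_ok = iterate_until(grow, {0}, 5.0)
_, flip_ok = iterate_until(flip, {0}, 0.05)
```

Returning the `converged` flag alongside the set lets the caller tell a genuine fixed point from a result that was cut off by the deadline.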

17 Issue?
• By the time Naiad is finished, the style of code is very hard to read, although those who write it find it pretty natural to work this way
• In fact many companies do use this style of functional programming (for example Jane Street, a trading firm famous for using OCaml for financial analytics)
• But is it systems research?

18 Other major systems in this space
• Check out http://en.wikipedia.org/wiki/Graph_database
• They list 50 or so graphical databases and processing systems
• Some popular ones in research settings are Pregel (from Google), GraphLab (CMU) and Vowpal Wabbit (“Fast Learning”) (Yahoo)

19 Take away?
• Computer systems need to be responsive to
  • Styles of use (what our “customers” are doing)
  • Common patterns of load (optimize for this case)
• In today’s major cloud computing settings, graphical data and graphical learning solutions are becoming a dominant form of load and focus
• Computer systems need to evolve to track this need

