Prescriptive Analytics Part I Nick Gonzalez, 2/10/14.


1 Prescriptive Analytics Part I Nick Gonzalez, 2/10/14

2 -Isaac Asimov “It is change, continuing change, inevitable change, that is the dominant factor in society today. No sensible decision can be made any longer without taking into account not only the world as it is, but the world as it will be.”

3 Topics Covered: Reference automated prescriptive analytics system; Automated algorithm selection; Distributed algorithm development

4 Covered in future presentations: Ontology creation and extraction; Representing solutions using ontologies; Business optimization; everything else…

5 Today’s Data Landscape

6 Tomorrow’s Data Landscape

7 Data is outpacing us: Humans cannot keep up. Computers can, but…

8 Prescriptive Analytics: Scalable; Automated understanding; Automated predictive analytics; Actionable; Closed loop

9 Example: Video Games. [Diagram: a closed loop between user space and analytics space. Components: game metrics, learning process, predictive models, deploy, game server, rules, simulations. Arrow labels: write, start understanding, build / update models, modify, copy to production, generate.]

10 Problems: Scale; Speed; Adaptability

11 Automated Learning

12 - Isaac Asimov “I do not fear computers. I fear the lack of them.”

13 Goals: Remove the human element from analysis phases. Generate accurate, actionable, predictive models. Combine predictive models and simulation to solve problems.

14 Guiding Principle: Big data with simple algorithms will outperform sampled data with complex algorithms.

15 How is this possible? Focus on a single problem. Limit scope. Goal must be: measurable; actionable.

16 Process Data Data Engineering & Understanding Modeling Prep Simulation Actionable Deployment

17 1. Automated Understanding: Find the data representation best suited to the problem you are trying to solve.

18 Automated Understanding: Raw Data; Clean Data; Initial Transform; Stats meta
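The raw data, clean data and "stats meta" stages on this slide could be sketched roughly as below. This is an illustration only, not the presenter's implementation: the function name `summarize`, the field name `session_minutes`, and the choice of statistics are assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def summarize(raw_rows, column):
    """Clean one numeric column and return (cleaned_rows, stats_meta).

    Hypothetical helper: drops records whose value is missing or not a
    finite number, then records simple descriptive statistics.
    """
    cleaned = []
    for row in raw_rows:
        value = row.get(column)
        if isinstance(value, (int, float)) and math.isfinite(value):
            cleaned.append(row)

    values = [row[column] for row in cleaned]
    stats_meta = {
        "count": len(values),
        "dropped": len(raw_rows) - len(cleaned),
        "mean": mean(values),
        "std": pstdev(values),
        "min": min(values),
        "max": max(values),
    }
    return cleaned, stats_meta


# Made-up sample records for illustration only.
raw = [
    {"session_minutes": 12.0},
    {"session_minutes": None},
    {"session_minutes": 30.0},
    {"session_minutes": float("nan")},
    {"session_minutes": 18.0},
]
cleaned, meta = summarize(raw, "session_minutes")
```

Keeping the statistics next to the cleaned data lets later stages choose transforms without rescanning the raw input.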

19 Automated Understanding: Clean Data; Stats meta; Representation A; Representation B; Representation C; A.1 …; A.2 …
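One way to read the branching into Representations A, B and C is as a search over candidate transforms, each scored against the target. The sketch below assumes that reading. The three transforms, the Pearson-correlation score, and the names `REPRESENTATIONS` and `best_representation` are illustrative choices, not taken from the slides.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Candidate representations of a single non-negative feature (assumed set).
REPRESENTATIONS = {
    "raw": lambda x: x,
    "log": lambda x: math.log1p(x),
    "sqrt": lambda x: math.sqrt(x),
}


def best_representation(xs, ys):
    """Score each representation by |correlation| with the target."""
    scores = {
        name: abs(pearson([transform(x) for x in xs], ys))
        for name, transform in REPRESENTATIONS.items()
    }
    return max(scores, key=scores.get), scores


# Synthetic data in which the target depends on log(1 + x).
xs = [1, 10, 100, 1000, 10000]
ys = [math.log1p(x) for x in xs]
best, scores = best_representation(xs, ys)
```

A fuller system would presumably score representations with the downstream model's error rather than a single correlation, and would expand the winner into sub-representations (A.1, A.2, and so on).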

20 2. Automated Algorithm Selection: Find the algorithm that performs best on the problem you are trying to solve while meeting all criteria.

21 Automated Algorithm Selection: Choose algorithms best suited for this type of problem. Consider the data, types, sparsity, size, and desired outcome. Try multiple algorithms. Calculate the Root Mean Squared Error or some other appropriate measure. Consider problem domain. Use cross validation. Do not just compare the average RMSE. Choose the algorithm(s) that perform the best.
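The selection loop on this slide can be sketched as k-fold cross validation over a set of candidate algorithms, reporting the spread of RMSE across folds as well as the average. This is a minimal sketch: the two candidate models, the synthetic data, and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import mean, pstdev


def fit_mean(train):
    """Baseline: always predict the training mean."""
    avg = mean(y for _, y in train)
    return lambda x: avg


def fit_linear(train):
    """Ordinary least squares for a single feature."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in train) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def rmse(model, test):
    """Root mean squared error of a fitted model on held-out pairs."""
    return math.sqrt(mean((model(x) - y) ** 2 for x, y in test))


def cross_validate(fit, data, k=5):
    """Return one RMSE per fold for the given fitting function."""
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        scores.append(rmse(fit(train), folds[i]))
    return scores


# Synthetic data: y = 2x + 1 plus Gaussian noise (seeded for repeatability).
rng = random.Random(0)
data = [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.5)) for x in range(50)]
rng.shuffle(data)

candidates = {"mean": fit_mean, "linear": fit_linear}
results = {name: cross_validate(fit, data) for name, fit in candidates.items()}
# Keep both the average and the spread across folds, not the average alone.
summary = {name: (mean(s), pstdev(s)) for name, s in results.items()}
best = min(summary, key=lambda name: summary[name][0])
```

The per-fold standard deviation shows whether a lower average RMSE is stable or driven by one favourable fold.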

22 Distributed Processing Learning to Scale

23 Approaching the Problem: Two ways to approach a problem: bottom up and top down.

24 Bottom Up Approach: Hardware; Assembly Language; C, Pascal; C++, Java; Design Patterns, Algorithms; Programmer

25 Top Down: Problem Solver; Problem Representation; Distributed System Abstractions; Functional Languages; Hardware

26 Building Distributed Algorithms: Identify the simplest concepts that describe data processing: collections and collection processing. [Diagram: Problem Solver; Problem Representation; Distributed System Abstractions; Functional Languages; Hardware.]

27 Evolution of thought. [Diagram: Single “Box”: Data, Algorithm. No “Box”: Data Collection, Collection Processing.]

28 Coming together. [Diagram: collection operations: map, mapcat, reduce, filter, sort, group. Backends: Hadoop, single PC, MPI, … Algorithms: k-means, density, random forest, gradient boost, ….]

29 Distributed Processing Interface: Simple concept. Focus on building algorithms. Many ways to implement this concept. Works with both shared memory systems and distributed memory systems.
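The idea of writing algorithms against a small set of collection operations, with the backend hidden behind them, might look roughly like the sketch below. It is written in Python for illustration and is not the presenter's interface. `LocalCollection` and `kmeans_step` are hypothetical names, and only a single-machine backend is shown.

```python
from collections import defaultdict
from functools import reduce


class LocalCollection:
    """Single-machine stand-in for the 'collection' abstraction.

    A distributed backend would expose the same methods (assumption).
    """

    def __init__(self, items):
        self.items = list(items)

    def map(self, fn):
        return LocalCollection(fn(x) for x in self.items)

    def filter(self, pred):
        return LocalCollection(x for x in self.items if pred(x))

    def group(self, key_fn):
        groups = defaultdict(list)
        for x in self.items:
            groups[key_fn(x)].append(x)
        return LocalCollection(groups.items())

    def reduce(self, fn, initial):
        return reduce(fn, self.items, initial)


def kmeans_step(points, centroids):
    """One k-means iteration on 1-D points using only collection operations."""

    def nearest(p):
        return min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))

    # Assign each point to its nearest centroid, then average per cluster.
    assigned = points.map(lambda p: (nearest(p), p))
    grouped = assigned.group(lambda pair: pair[0])
    means = grouped.map(
        lambda kv: (kv[0], sum(p for _, p in kv[1]) / len(kv[1]))
    )
    updated = means.reduce(lambda acc, kv: {**acc, kv[0]: kv[1]}, {})
    # Centroids with no assigned points keep their previous position.
    return [updated.get(i, c) for i, c in enumerate(centroids)]


points = LocalCollection([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
centroids = kmeans_step(points, [0.0, 15.0])  # [2.0, 11.0]
```

Because `kmeans_step` touches the data only through `map`, `group` and `reduce`, the same function could in principle run on another backend that provides those operations.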

30 Implementation: Functional language (Clojure). Reusable functions as callbacks. Hadoop drivers written on top of Cascalog. Data location and type are abstracted as “collection”.

31

32 - Isaac Asimov “Part of the inhumanity of the computer is that once it is completely programmed and working smoothly, it is completely honest.”

