Published by Daryl Clifford. Modified about 1 year ago.

1
Prescriptive Analytics, Part I. Nick Gonzalez, 2/10/14

2
- Isaac Asimov “It is change, continuing change, inevitable change, that is the dominant factor in society today. No sensible decision can be made any longer without taking into account not only the world as it is, but the world as it will be.”

3
Topics Covered: reference automated prescriptive analytics system; automated algorithm selection; distributed algorithm development.

4
Covered in future presentations: ontology creation and extraction; representing solutions using ontologies; business optimization; everything else…

5
Today’s Data Landscape

6
Tomorrow’s Data Landscape

7
Data is outpacing us. Humans cannot keep up. Computers can, but…

8
Prescriptive Analytics: scalable; automated understanding; automated predictive analytics; actionable; closed loop.

9
Example: Video Games. [Diagram spanning user space and analytics space. Component labels: game metrics, learning process, predictive models, simulations, rules, game server. Step labels: start, write, understanding, build / update models, generate, modify, deploy, copy to production.]

10
Problems: scale; speed; adaptability.

11
Automated Learning

12
- Isaac Asimov “I do not fear computers. I fear the lack of them.”

13
Goals: remove the human element from analysis phases; generate accurate, actionable, predictive models; combine predictive models and simulation to solve problems.

14
Guiding Principle: big data with simple algorithms will outperform sampled data with complex algorithms.

15
How is this possible? Focus on a single problem. Limit scope. The goal must be measurable and actionable.

16
Process: Data, Data Engineering & Understanding, Modeling Prep, Simulation, Actionable Deployment.

17
1. Automated Understanding: find the data representation best suited to the problem you are trying to solve.

18
Automated Understanding. [Diagram labels: Raw Data, Clean Data, Initial Transform, Stats, Meta.]
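The slide only names the stages, so the following is a minimal Python sketch of what "raw data to clean data to stats" could look like, not the presenter's implementation (which the deck says is in Clojure). The function names `clean` and `column_stats`, the rule of dropping rows with missing or non-numeric values, and the particular statistics collected are all assumptions made for illustration.

```python
import math
import statistics


def clean(rows):
    """Keep only rows whose values all parse as finite numbers (assumed cleaning rule)."""
    cleaned = []
    for row in rows:
        try:
            values = [float(v) for v in row]
        except (TypeError, ValueError):
            continue  # missing (None) or non-numeric field: drop the row
        if all(math.isfinite(v) for v in values):
            cleaned.append(values)
    return cleaned


def column_stats(rows):
    """Collect per-column summary statistics to serve as metadata for later stages."""
    stats = []
    for column in zip(*rows):
        stats.append({
            "min": min(column),
            "max": max(column),
            "mean": statistics.fmean(column),
            "stdev": statistics.pstdev(column),
            "distinct": len(set(column)),
        })
    return stats


raw = [["1", "2"], ["3", None], ["x", "4"], ["5", "6"], ["nan", "1"]]
cleaned = clean(raw)
meta = column_stats(cleaned)
print(cleaned)
print(meta)
```

Keeping the statistics separate from the cleaned rows means later stages can decide which transformations to try without rescanning the data.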

19
Automated Understanding. [Diagram labels: Clean Data, Stats, Meta, Representation A, Representation B, Representation C, A.1 …, A.2 …]
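One way to read this slide is as a search over candidate representations of the clean data, each scored against the problem. The Python sketch below illustrates that reading under loud assumptions: the candidate transforms (`raw`, `log`, `sqrt`), the use of absolute Pearson correlation with the target as the score, and the names `rank_representations` and `REPRESENTATIONS` are all hypothetical and do not come from the talk. The transforms shown require non-negative inputs.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical candidate representations of a single numeric feature.
REPRESENTATIONS = {
    "raw": lambda x: x,
    "log": lambda x: math.log1p(x),
    "sqrt": lambda x: math.sqrt(x),
}


def rank_representations(xs, ys, representations):
    """Score each representation by |correlation with the target|, best first."""
    scores = {
        name: abs(pearson([transform(x) for x in xs], ys))
        for name, transform in representations.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


feature = [float(i) for i in range(1, 51)]
target = [math.log1p(x) for x in feature]  # synthetic target for the demo
for name, score in rank_representations(feature, target, REPRESENTATIONS):
    print(name, round(score, 4))
```

The refinements labelled A.1 and A.2 on the slide could be modelled by feeding the best-scoring representation back in as the starting point for a further round of candidates.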

20
2. Automated Algorithm Selection: find the algorithm that performs best on the problem you are trying to solve while meeting all criteria.

21
Automated Algorithm Selection: Choose algorithms best suited for this type of problem. Consider the data, types, sparsity, size, and desired outcome. Try multiple algorithms. Calculate the root mean squared error (RMSE) or some other appropriate measure. Consider the problem domain. Use cross-validation. Do not just compare the average RMSE. Choose the algorithm(s) that perform best.
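The selection loop described on this slide can be sketched in a few lines of Python. This is an illustration only, not the system from the talk: the two candidate algorithms (a mean predictor and a one-variable least-squares line), the interleaved k-fold split, and the selection rule of "mean RMSE plus one standard deviation" are assumptions chosen to show why the spread across folds matters as well as the average.

```python
import math
import random
import statistics


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Candidate: ordinary least squares on a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def cross_validate(fit, xs, ys, k=5):
    """Return the RMSE on each of k interleaved folds."""
    n = len(xs)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train = [(xs[i], ys[i]) for i in range(n) if i not in test_idx]
        test = [(xs[i], ys[i]) for i in sorted(test_idx)]
        model = fit([x for x, _ in train], [y for _, y in train])
        scores.append(rmse([y for _, y in test], [model(x) for x, _ in test]))
    return scores


def select_algorithm(candidates, xs, ys, k=5):
    """Pick the candidate with the lowest mean + stdev of fold RMSE (assumed rule)."""
    report = {}
    for name, fit in candidates.items():
        scores = cross_validate(fit, xs, ys, k)
        report[name] = {
            "mean": statistics.fmean(scores),
            "stdev": statistics.pstdev(scores),
            "worst": max(scores),
        }
    best = min(report, key=lambda name: report[name]["mean"] + report[name]["stdev"])
    return best, report


rng = random.Random(0)
xs = [i / 10 for i in range(100)]
ys = [3 * x + 2 + rng.gauss(0, 0.5) for x in xs]  # synthetic demo data
best, report = select_algorithm({"mean": fit_mean, "linear": fit_linear}, xs, ys)
print(best, report)
```

Reporting the standard deviation and worst fold alongside the mean lets a caller reject an algorithm that wins on average but is unstable across folds.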

22
Distributed Processing: Learning to Scale

23
Approaching the Problem: there are two ways to approach a problem, bottom up and top down.

24
Bottom Up Approach (layers in slide order): Hardware; Assembly Language; C, Pascal; C++, Java; Design Patterns, Algorithms; Programmer.

25
Top Down (layers in slide order): Problem Solver; Problem Representation; Distributed System Abstractions; Functional Languages; Hardware.

26
Building Distributed Algorithms: identify the simplest concepts that describe data processing, namely collections and collection processing. [Layer stack repeated from the previous slide: Problem Solver; Problem Representation; Distributed System Abstractions; Functional Languages; Hardware.]

27
Evolution of thought. [Diagram labels in slide order: Single “Box”; Data, Data, Algorithm, Algorithm, Data, Data Collection, Collection Processing; No “Box”.]

28
Coming together. [Diagram labels. Collection operations: map, mapcat, reduce, filter, sort, group. Platforms: Hadoop, single PC, MPI, … Algorithms: k-means, density, random forest, gradient boost, ….]
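To make the layering on this slide concrete, here is a hedged Python sketch of a single k-means update step written using only map, group, and reduce over a collection of points. It runs on one machine with built-in Python collections; the helper names `group_by`, `nearest`, `add_points`, and `kmeans_step` are illustrative assumptions, and nothing here reproduces the presenter's Clojure code.

```python
from collections import defaultdict
from functools import reduce


def group_by(key_fn, collection):
    """Group items of a collection by key_fn(item)."""
    groups = defaultdict(list)
    for item in collection:
        groups[key_fn(item)].append(item)
    return dict(groups)


def nearest(point, centroids):
    """Index of the centroid with the smallest squared distance to point."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def add_points(a, b):
    """Component-wise sum of two points."""
    return tuple(x + y for x, y in zip(a, b))


def kmeans_step(points, centroids):
    """One k-means iteration expressed as map -> group -> reduce."""
    # map: tag every point with the index of its nearest centroid
    tagged = map(lambda p: (nearest(p, centroids), p), points)
    # group: collect the tagged points per centroid index
    groups = group_by(lambda pair: pair[0], tagged)
    # reduce: sum each group's points, then divide to get the new centroid
    new_centroids = list(centroids)  # centroids with no members stay unchanged
    for index, pairs in groups.items():
        members = [point for _, point in pairs]
        total = reduce(add_points, members)
        new_centroids[index] = tuple(x / len(members) for x in total)
    return new_centroids


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
print(kmeans_step(points, [(0, 0), (10, 10)]))
```

Because the algorithm touches the data only through these three operations, swapping in a different execution platform would mean reimplementing the operations rather than the algorithm.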

29
Distributed Processing Interface: simple concept; focus on building algorithms; many ways to implement this concept; works with both shared-memory and distributed-memory systems.
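The idea of one interface with several implementations can be illustrated with a small Python sketch. The `Collection` interface, the `LocalCollection` and `ThreadedCollection` classes, and the `sum_of_even_squares` algorithm are hypothetical names invented for this example; the thread pool stands in for a shared-memory backend only, and a distributed-memory backend (such as the Hadoop drivers the deck mentions) is not shown.

```python
from abc import ABC, abstractmethod
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


class Collection(ABC):
    """Hypothetical minimal interface: algorithms see only these operations."""

    @abstractmethod
    def map(self, fn): ...

    @abstractmethod
    def filter(self, pred): ...

    @abstractmethod
    def reduce(self, fn, initial): ...


class LocalCollection(Collection):
    """Sequential, in-memory implementation."""

    def __init__(self, items):
        self._items = list(items)

    def map(self, fn):
        return LocalCollection(fn(x) for x in self._items)

    def filter(self, pred):
        return LocalCollection(x for x in self._items if pred(x))

    def reduce(self, fn, initial):
        return reduce(fn, self._items, initial)


class ThreadedCollection(LocalCollection):
    """Shared-memory implementation: map runs on a thread pool."""

    def __init__(self, items, workers=4):
        super().__init__(items)
        self._workers = workers

    def map(self, fn):
        with ThreadPoolExecutor(max_workers=self._workers) as pool:
            # The constructor materializes the results before the pool shuts down.
            return ThreadedCollection(pool.map(fn, self._items), self._workers)

    def filter(self, pred):
        return ThreadedCollection((x for x in self._items if pred(x)), self._workers)


def sum_of_even_squares(collection):
    """An algorithm written once against the interface, not against a backend."""
    return (
        collection.filter(lambda x: x % 2 == 0)
        .map(lambda x: x * x)
        .reduce(lambda a, b: a + b, 0)
    )


print(sum_of_even_squares(LocalCollection(range(10))))
print(sum_of_even_squares(ThreadedCollection(range(10))))
```

Both calls run the same algorithm code; only the collection passed in decides how the work is executed.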

30
Implementation: functional language (Clojure); reusable functions as callbacks; Hadoop drivers written on top of Cascalog; data location and type are abstracted as a “collection”.

31

32
- Isaac Asimov “Part of the inhumanity of the computer is that once it is completely programmed and working smoothly, it is completely honest.”
