Cloud Analytics Platforms

Presentation on theme: "Cloud Analytics Platforms"— Presentation transcript:

1 Cloud Analytics Platforms
Christian Frey 22 seconds

2 About me – Christian Frey
Graduated in May with Business and Computer Science
Work at AIDA full time now, managing the Institute
You can find me on the top floor of Patterson, in the Rural Innovation Centre
~3 minutes

3 BigML – BigML.com
Easy to sign up, with an unlimited number of small models (up to 16MB of data)
1-click dataset, 1-click model, and 1-click model evaluation: from dataset to evaluation in 3 clicks
Also offers customization options, with suggestions driven by the data
Models are decision trees, so they build quickly
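The dataset, model, evaluation flow above can be sketched in plain Python. This is not BigML's API or its tree-building algorithm; it is a minimal one-split decision tree (a "stump") trained on a few rows of the public Iris dataset, and the names `train_stump`, `predict`, and `evaluate` are made up for illustration.

```python
from collections import Counter

# Step 1 (dataset): a few rows of the public Iris dataset.
# Features: (sepal length, sepal width, petal length, petal width) in cm.
TRAIN = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((5.5, 2.3, 4.0, 1.3), "versicolor"),
]
TEST = [
    ((4.6, 3.1, 1.5, 0.2), "setosa"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
]


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows):
    """Step 2 (model): pick the feature/threshold split with the fewest training errors."""
    best = None
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        # Candidate thresholds sit halfway between neighbouring feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return {"feature": f, "threshold": t, "left": l_lab, "right": r_lab}


def predict(stump, x):
    return stump["left"] if x[stump["feature"]] <= stump["threshold"] else stump["right"]


def evaluate(stump, rows):
    """Step 3 (evaluation): fraction of held-out rows predicted correctly."""
    return sum(predict(stump, x) == y for x, y in rows) / len(rows)


stump = train_stump(TRAIN)
accuracy = evaluate(stump, TEST)
print(stump, accuracy)
```

A single split is enough here because only two classes are used; a full decision tree repeats the same split search recursively on each side.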

4 Iris dataset attributes
1. sepal length in cm – protection for the flower in bud
2. sepal width in cm – protection for the flower in bud
3. petal length in cm
4. petal width in cm
5. class:
   -- Iris Setosa
   -- Iris Versicolour
   -- Iris Virginica
[Images: Setosa, Versicolour, Virginica]
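As a rough illustration of the five attributes, one row of the dataset can be held in a small typed record. The class name `IrisRecord`, the function `parse_iris_line`, and the comma-separated input format are assumptions made for this sketch, not something shown on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IrisRecord:
    """One Iris observation: four measurements in cm plus the class label."""
    sepal_length_cm: float
    sepal_width_cm: float
    petal_length_cm: float
    petal_width_cm: float
    species: str


def parse_iris_line(line: str) -> IrisRecord:
    """Parse one 'sepal_len,sepal_wid,petal_len,petal_wid,class' line (assumed CSV layout)."""
    sl, sw, pl, pw, label = line.strip().split(",")
    return IrisRecord(float(sl), float(sw), float(pl), float(pw), label)


record = parse_iris_line("5.1,3.5,1.4,0.2,Iris-setosa")
print(record)
```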

5
Name      Description                         Type  Range
Input variables:
DWLUN     number of dwelling units            cont  1-3
RDOS      months since last sale              cont  0-23
YRBLT     year built                          cont
TOTFIXT   number of plumbing fixtures         cont  5-17
HEATING   heating system type                 desc  2,3
WBFPSTK   wood fireplace chimney stack        sym   yes,no
BMNTGAR   basement or/and garage              cont  0-2
ATTFRGAR  attached frame garage area (feet)   cont  0-228
TOTLIVAR  total living area                   cont
DECK/OFP  deck/open porch area                cont  0-738
ENCLPOR   enclosed porch area                 cont  0-452
NBHDGRP   neighbourhood group                 desc  1,2
RECROOM   recreation room area                cont  0-672
FINDSMT   finished basement area              cont  0-810
GRADE%    grade factors                       cont
CDU       condition/desirability/usefulness   cont  3-5
TOTOBY    total other value (building/yard)   cont  0-16,400
Response variable:
SALEPRIC  actual sale price                   cont  $103,00-$250,000

6 Google Cloud Predictions API - cloud.google.com/prediction/
API Explorer on the website, great for prototyping
Gives you $300 for 60 days of experimentation
Integrates into other Google services, most notably Google Sheets
Uses online learning to allow for addition of new data
API Explorer lets you try out the API before you convert it into code, so you can check your request format before implementing anything.
$300 goes a LONG way with Cloud Predictions. There is a trial, but no free "tier": once the trial is up, you need to pay.
The Google Sheets integration is pretty cool. You select some row(s) of cells, then an answer row; it learns the relationship between the two and auto-fills the missing data.
Online learning differs from batch learning in that new data can be fed to the model after it has been trained, to refine it.
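To make the online-versus-batch distinction concrete, here is a minimal sketch of a model that is refined one example at a time. It is not how the Google service works internally (the slides do not say); it is a plain nearest-centroid classifier with a running-mean update, and the class name `OnlineCentroidClassifier` is invented for this example.

```python
class OnlineCentroidClassifier:
    """Nearest-centroid classifier whose per-class means are updated one example at a time."""

    def __init__(self):
        self.counts = {}  # label -> number of examples seen
        self.means = {}   # label -> running mean of the feature vectors

    def update(self, x, label):
        """Fold a single new example into the model without revisiting old data."""
        n = self.counts.get(label, 0) + 1
        mean = self.means.get(label, [0.0] * len(x))
        # Running-mean update: new_mean = old_mean + (x - old_mean) / n
        self.means[label] = [m + (xi - m) / n for m, xi in zip(mean, x)]
        self.counts[label] = n

    def predict(self, x):
        """Return the label whose mean is closest to x (squared Euclidean distance)."""
        def dist2(mean):
            return sum((xi - m) ** 2 for xi, m in zip(x, mean))
        return min(self.means, key=lambda label: dist2(self.means[label]))


model = OnlineCentroidClassifier()

# Initial training pass: (petal length, petal width) in cm from the Iris dataset.
for x, label in [((1.4, 0.2), "setosa"), ((4.7, 1.4), "versicolor")]:
    model.update(x, label)
before = model.predict((5.9, 2.1))

# New data arrives after training; the existing model is refined in place.
model.update((6.0, 2.5), "virginica")
after = model.predict((5.9, 2.1))
print(before, after)
```

A batch learner would have to be retrained on the full dataset to pick up the new class; here the later `update` call is enough.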

7 Google Sheets integration with the cloud prediction API

8 Google Results on Iris Dataset

9 Pros and Cons of Google Cloud Predictions API
Pros:
Integrates with Google Sheets, so you can run predictions on your spreadsheets
Very good accuracy on the training set
Fast training and prediction times: usually under 1 minute to train on smaller datasets, half a second to predict
Cons:
No choice in the model it uses
No method of exporting the model that was created

10 Amazon Machine Learning – aws.amazon.com/machine-learning
3 ways to access your model:
Through a web interface
Through an API in a variety of languages
Through the AWS command line interface
No free trial available
Picky with the data it accepts: no more than 10k errors in your data, or 10%
No choice in the model that is used
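The data-quality limit above can be checked before uploading. The sketch below assumes one reading of the slide: a dataset is rejected when its invalid rows exceed 10,000 or 10% of all rows. The validity rule itself (correct column count, no empty fields) and the function names are hypothetical; the service's actual rules are not given in the slides.

```python
MAX_INVALID_ROWS = 10_000     # "10k errors" from the slide
MAX_INVALID_FRACTION = 0.10   # "or 10%" from the slide


def is_valid_row(row, n_columns):
    """Hypothetical validity rule: expected column count and no empty fields."""
    return len(row) == n_columns and all(field.strip() for field in row)


def check_dataset(rows, n_columns):
    """Return (accepted, invalid_count) under the assumed thresholds."""
    invalid = sum(not is_valid_row(row, n_columns) for row in rows)
    too_many = invalid > MAX_INVALID_ROWS
    too_large_share = bool(rows) and invalid / len(rows) > MAX_INVALID_FRACTION
    return (not too_many and not too_large_share), invalid


good_row = ["5.1", "3.5", "1.4", "0.2", "Iris-setosa"]
bad_row = ["4.9", "", "1.4", "0.2", "Iris-setosa"]  # missing sepal width

rows = [good_row] * 9 + [bad_row]
accepted, invalid = check_dataset(rows, n_columns=5)
print(accepted, invalid)
```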

11

12 Pros and Cons of Amazon Machine Learning
Pros:
Easy to load data into Amazon S3, then create the model
Model can easily be integrated with other Amazon services
Cons:
Data must be located in Amazon S3 or Redshift storage, so you are locked into Amazon for everything
No choice of model: it uses variants of regression for everything (linear, logistic, and multiclass)
Accuracy is slightly lower than other products on the Iris dataset

13 Microsoft Azure ML - studio.azureml.net
Drag and drop modules onto an infinite background, then connect them
Many models to choose from; requires some understanding of the data
Offers a web service to access your model
Good for those who know about machine learning but don't want to code
Here is the simplest option for a model.
Allows you to insert your own custom Python or R code into the flow. Plenty of customization available.
You can also fiddle with the parameters that are part of the model: LR, batch size, randomization, # of trees, etc.

14 Results on the Iris Dataset - Azure

15 Pros and Cons of Azure ML
Pros:
Lots of machine learning algorithms available, including Neural Networks, Naïve Bayes, Clustering, and Decision Trees
Easy to use free trial, no sign up required!
Allows you to run arbitrary Python or R code as a module to process or analyze your data
Cons:
It is difficult to find options or the correct module to use
Free trial only saves data for 8 hours

16 IBM SPSS Modeler Gold on Cloud
IBM SPSS Modeler: create decision trees and regressions from an arbitrary dataset
Pay as you go: only pay for what you need to use
Drag and drop with easy-to-find modules
Auto classifier runs all models and allows you to compare them
Tradeoff Analytics: maximizing multiple values to get the best result based on some constraints. You might have heard it called linear or non-linear programming.
Variable analysis: identifying relevant variables in your data.
Runs in Citrix in the cloud for a desktop-like feel.
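The idea of getting the best result under some constraints can be illustrated with a toy problem in the linear-programming style the slide mentions. This is not IBM's Tradeoff Analytics algorithm. The objective, the constraints, and the numbers are all hypothetical, and a brute-force search over a small integer grid stands in for a real solver.

```python
from itertools import product


def objective(x, y):
    """Hypothetical value to maximize."""
    return 3 * x + 2 * y


def feasible(x, y):
    """Hypothetical linear constraints on the two decision variables."""
    return x + y <= 4 and x + 3 * y <= 6 and x <= 3


# Enumerate every integer point on a small grid and keep the feasible ones.
candidates = [(x, y) for x, y in product(range(5), repeat=2) if feasible(x, y)]
best = max(candidates, key=lambda point: objective(*point))
print(best, objective(*best))
```

Brute force only works because the grid is tiny; larger problems need a dedicated linear or non-linear programming solver.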

17 SPSS Modeler – Results on IRIS Dataset

18 Pros and Cons of IBM SPSS Modeler Gold on Cloud
Pros:
Easy to figure out drag and drop interface
Tied for first place in accuracy
Cons:
Cannot run arbitrary Python or R code in cloud version of SPSS Modeler
Only supports hosted DB2 database connections, no connection to other databases
Bulk loading of data into SPSS Modeler from DB2 requires a support ticket

