Azure Machine Learning My first Data Science experiment Using Azure Machine Learning.

1 Azure Machine Learning My first Data Science experiment Using Azure Machine Learning

2 Our Main Sponsors:

3 Speaker
Florian Eiden, @fleid_bi / fleid.fr
Cellenza, 156, bd Haussmann, 75008 Paris, France
http://www.cellenza.com
http://blog.cellenza.com

4 For whom?
[Chart: audience level, from beginner to experienced, for Machine Learning and for Azure ML]

5 Agenda
A quick word on Azure ML, then two experiments:
- I'm the owner of a flat/condo in Paris and I want to sell it!
- I'm in marketing and I want my promotional emails to reach their targets through anti-spam software

6 Azure ML in one diagram
From business need to business value, through modeling and deployment:
- Cloud data sources: HDInsight, SQL Server VM, SQL DB, Blobs & Tables, …
- Local data sources: local files, Excel files, …
- ML Studio: storage space and IDE for machine learning
- API: publication as a web service
- Microsoft Azure Marketplace: monetization of the API on the web

7 First: enable your ML Studio
In the Azure portal, with an Azure account: http://manage.windowsazure.com

8 Azure ML Studio http://studio.azureml.com

9 Before that: my 1st experiment
- I want to sell my flat
  - Paris, France
  - 2 bedrooms
  - 55 m²
  - …
- But at what price?

10 How to answer that?

11 [Chart: price (€) against surface (m²); placing my flat on it gives a fair price!]

12 But how to generalize?
- Thousands of price points (facts)
- Often hundreds of features (dimension attributes):
  - Surface
  - Number of rooms
  - Storage area
  - Parking
  - Exact location in town
  - Floor (correlated with the presence of a lift)
  - Age of the building
  - Empty or furnished
  - Distance to metro / public transportation
  - Distance to shops
  - …
Machine Learning!

13 Machine Learning
Building a system that will learn from the existing data, detecting patterns and trends, so that it can predict a continuous value!
Supervised Learning >> Regression

14 Linear Regression (1 feature)
[Chart: price (€) against surface (m²); the fitted line gives the market price for my flat]
y = ax + b, where y is the price and x is the surface
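
As an illustration of fitting y = ax + b to price points, here is a minimal sketch using ordinary least squares in plain Python. The data points and the `fit_line` helper are invented for this example; they are not the data or the method used in the talk's demo.

```python
# Hypothetical (surface in m², price in k€) pairs, invented for illustration only.
surfaces = [30.0, 40.0, 50.0, 60.0, 70.0]
prices = [270.0, 350.0, 430.0, 510.0, 590.0]


def fit_line(xs, ys):
    """Return (a, b) minimizing the squared error of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


a, b = fit_line(surfaces, prices)
estimate = a * 55.0 + b  # price estimate for a 55 m² flat
print(f"a={a:.2f}, b={b:.2f}, estimate for 55 m2: {estimate:.1f}")
```

With a single feature the best line has a closed-form solution, so no iterative optimization is needed here; that changes once many features are involved.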

15 My ML System
My surface (x) goes in, Machine Learning applies y = ax + b, and a good price estimate (y) comes out.
[Chart: price (€) against surface (m²)]

16 My ML System
- Input x: my surface
- Output y: an estimate of the price
- h: the hypothesis, with y = h(x); here y = ax + b
[Chart: price (€) against surface (m²)]

17 The hypothesis
- Input x: my surface
- Output y: an estimate of the price
- h: the hypothesis, with y = h(x)
$h(x) = h_\theta(x) = \theta_0 + \theta_1 x$
[Chart: the line $y = \theta_1 x + \theta_0$, with intercept $\theta_0$]

18 Parameters ranking: Cost Function
$J(\theta_i)$: the cost function, a function of the thetas that calculates the total distance between my model and the training set.
Model: $y = \theta_1 x + \theta_0$
- Model A: $\theta_0 = 1$, $\theta_1 = 0$
- Model B: $\theta_0 = 1$, $\theta_1 = 0.25$

19 Parameters ranking: Cost Function
$J(\theta_i)$: the cost function, a function of the thetas that calculates the total distance between my model and the training set.
Model: $y = \theta_1 x + \theta_0$
- Model A: $\theta_0 = 1$, $\theta_1 = 0$
- Model B: $\theta_0 = 1$, $\theta_1 = 0.25$
Costs shown on the slide, in order: $J(\theta_0, \theta_1) = 25$ and $J(\theta_0, \theta_1) = 5$
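
To make the idea of ranking parameters concrete, the sketch below computes a cost for two candidate lines. It assumes the "total distance" is a sum of squared errors, which the slide does not specify, and it uses an invented training set, so it does not reproduce the cost values shown on the slide.

```python
def hypothesis(theta0, theta1, x):
    """Linear hypothesis h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Sum of squared distances between predictions and targets (an assumed cost)."""
    return sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical training set, not the one plotted on the slide.
xs = [0.0, 4.0, 8.0]
ys = [1.0, 2.0, 3.0]

cost_a = cost(1.0, 0.0, xs, ys)   # Model A: flat line y = 1
cost_b = cost(1.0, 0.25, xs, ys)  # Model B: y = 1 + 0.25 x
print(cost_a, cost_b)
```

On this toy set Model B passes through every point, so its cost is lower than Model A's; the lower the cost, the better the parameters rank.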

20 The last piece of the puzzle
- Training set
- Model type: $y = h(x)$, $h(x) = h_\theta(x) = \theta_0 + \theta_1 x$
- Cost (error) function
- … ?

21 The last piece of the puzzle
- Training set
- Model type: $y = h(x)$, $h(x) = h_\theta(x) = \theta_0 + \theta_1 x$
- Cost (error) function
- Optimization method
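
The slide does not name a specific optimization method. As one common choice, here is a minimal sketch of batch gradient descent on the mean squared error for the one-feature linear model; the data, learning rate, and step count are assumptions made for this example.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Minimize the mean squared error of h(x) = theta0 + theta1 * x."""
    theta0, theta1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / n                               # d(cost)/d(theta0)
        grad1 = sum(e * x for e, x in zip(errors, xs)) / n    # d(cost)/d(theta1)
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Hypothetical training set lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

theta0, theta1 = gradient_descent(xs, ys)
print(round(theta0, 4), round(theta1, 4))
```

Each step moves the thetas a little in the direction that lowers the cost; the learning rate controls the step size and has to be small enough for the updates to converge.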

22 My ML System
- Input x: my surface
- Output y: an estimate of the price
- h: the hypothesis, with y = h(x); here y = ax + b
- Cost function
- Optimization method
[Chart: price (€) against surface (m²)]

23 Demo 1

24 Variance and Bias
Underfit vs. overfit; see http://scott.fortmann-roe.com/docs/BiasVariance.html
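
A small sketch can show the two failure modes side by side. It assumes an invented data set and two deliberately extreme models: a constant prediction (underfit) and a 1-nearest-neighbour lookup that memorizes the training set (overfit). Neither model comes from the talk.

```python
def mse(model, xs, ys):
    """Mean squared error of a model over a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: the trend is y = 2x, with alternating +1 / -1 noise in training.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [3.0, 3.0, 7.0, 7.0, 11.0, 11.0]
test_x = [1.4, 2.4, 3.4, 4.4, 5.4]
test_y = [2.0 * x for x in test_x]

mean_y = sum(train_y) / len(train_y)


def underfit(x):
    """Ignore the input and always predict the training mean (high bias)."""
    return mean_y


def overfit(x):
    """Return the target of the nearest training point (memorizes the noise)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


for name, model in [("underfit", underfit), ("overfit", overfit)]:
    print(name, mse(model, train_x, train_y), mse(model, test_x, test_y))
```

The constant model is wrong on both sets, while the memorizing model is perfect on the training set and worse on unseen points; that gap between training and test error is the usual symptom of overfitting.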

25 My 2nd experiment
- As a [spammer] marketing professional, how can I be sure that my [ads] high-value content [optimizes the ROI] gets maximum viewing on the prospect listings [I got from that shady company]? (Bracketed words appear to be struck through on the original slide.)
- In short: I want to know whether my messages are going to be flagged as spam before I send them

26 Exposing the API to users in Excel SPAM!

27 To get there…

28 Machine Learning
Building a system that will learn from the existing data, detecting patterns and trends, so that it can predict a category!
Supervised Learning >> Classification

29 What features for my classification?
- 1st experiment: surface, location, floor…
- Now?
- 1 line = 1 message
[Table: Label | Attribute 0 | Attribute 1 | Attribute 2 | …, one row per message, labeled Spam or Ham, with sparse integer values]

30 Intuition
Label | offer | new | service | revolutionize | …
Spam  | 1     | 1   | 1       | 1             |

31 The data set
- SpamAssassin: 6000+ emails, unstructured text

32 Standard approach
Normalization:
- URL > #url
- Email > #email
- $, £, € > #devise (currency)
- Removal of numbers, punctuation, stopwords, HTML tags
- Lower case
- Word length from 3 to 10 characters max
- Stemming
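
The normalization steps listed above can be sketched in a few lines of Python. This is only an illustration: the regular expressions, the tiny stopword list, and the `crude_stem` helper are assumptions made for this example (a real pipeline would use a proper stemmer such as Porter's and a full stopword list).

```python
import re

# Tiny illustrative stopword list; a real pipeline would use a much larger one.
STOPWORDS = {"the", "and", "for", "you", "your", "with", "this", "that"}


def crude_stem(word):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def normalize(text):
    """Turn raw email text into a list of normalized tokens."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)                       # HTML tags
    text = re.sub(r"(?:https?://|www\.)\S+", " #url ", text)   # URLs
    text = re.sub(r"\S+@\S+", " #email ", text)                # email addresses
    text = re.sub(r"[$£€]", " #devise ", text)                 # currency symbols
    # Keeping only letters (and placeholders) drops numbers and punctuation.
    tokens = re.findall(r"#[a-z]+|[a-z]+", text)
    result = []
    for token in tokens:
        if token.startswith("#"):
            result.append(token)  # placeholders are kept as-is
        elif 3 <= len(token) <= 10 and token not in STOPWORDS:
            result.append(crude_stem(token))
    return result


sample = "Visit <b>http://example.com</b> NOW: amazing offers for $1000! Contact sales@example.com"
tokens = normalize(sample)
print(tokens)
```

The placeholders keep the signal that a URL, an address, or a price was present without letting every distinct URL become its own feature.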

33 Generation of the training corpus
- 6000 emails > N reference words
- We keep the top 10,000 by frequency of use
A set of 6000 lines and 10,000 columns:
[Table: Label | Word 0 | Word 1 | Word 2 | Word 3 | … | Word 10000, one row per email, labeled Spam or Ham, with sparse word counts]
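
A minimal sketch of how such a matrix can be built from already-normalized messages, using only the standard library. The three-message corpus and the helper names `build_vocabulary` and `vectorize` are hypothetical, and the vocabulary is capped at 3 words instead of 10,000 to keep the output readable.

```python
from collections import Counter


def build_vocabulary(documents, max_words):
    """Return the max_words most frequent tokens across all documents."""
    counts = Counter(token for doc in documents for token in doc)
    return [word for word, _ in counts.most_common(max_words)]


def vectorize(doc, vocabulary):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(doc)
    return [counts[word] for word in vocabulary]


# Hypothetical, already-normalized messages with their labels.
corpus = [
    ("spam", ["offer", "new", "offer", "service"]),
    ("ham", ["meeting", "new", "agenda"]),
    ("spam", ["offer", "service", "service", "offer"]),
]

vocabulary = build_vocabulary([tokens for _, tokens in corpus], max_words=3)
rows = [(label, vectorize(tokens, vocabulary)) for label, tokens in corpus]
print(vocabulary)
for label, row in rows:
    print(label, row)
```

Each row is one message and each column one reference word, which is why the real matrix is mostly zeros: any single email uses only a small fraction of the vocabulary.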

34 Implementation in Azure ML
- The Hate
  - No module for normalizing
  - Has to be done beforehand, in an ETL-like data pipeline
- The Love
  - We don't need to do a full normalization!
  - Feature Hashing using Vowpal Wabbit
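
The idea behind feature hashing can be sketched as follows: instead of building a vocabulary, each token is hashed straight into one of a fixed number of columns. This example uses `zlib.crc32` as a stand-in hash and a hypothetical `hash_features` helper; it is not Vowpal Wabbit's hash function nor the Azure ML module itself.

```python
import zlib


def hash_features(tokens, n_buckets=1024):
    """Map tokens to a fixed-size count vector with the hashing trick."""
    vector = [0] * n_buckets
    for token in tokens:
        # crc32 is deterministic across runs, unlike Python's built-in hash() for str.
        index = zlib.crc32(token.encode("utf-8")) % n_buckets
        vector[index] += 1
    return vector


vector = hash_features(["offer", "new", "offer", "service"], n_buckets=16)
print(vector)
```

The vector size is fixed up front, so no reference-word list has to be computed or stored; the trade-off is that two different words can collide in the same column.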

35 Demo 2
Sources:
- SpamAssassin: http://spamassassin.apache.org/publiccorpus/
- Coursera: Machine Learning by Andrew Ng (ex. 6, spam detection with SVM)
- Classifying Emails as Spam or Ham using RTextTools, Dennis Lee (blog)
- AzureML Web Service Scoring with Excel and Power Query, Rui Quintino (blog)

36 To go further
- For everyone: 1-month free trial, http://azure.com
- For MSDN subscribers: activate your Azure benefits, http://aka.ms/azurepourmsdn
- Download now (included in almost all Office licences): http://www.microsoft.com/en-us/powerBI/support/default.aspx
NB: Power BI in Excel is not hosted at PowerBI.com; be aware of this when you try to download it

37 To go further : the communities sqlpass.org sqlport.com guss.pro

