Machine Learning Tutorial Amit Gruber The Hebrew University of Jerusalem.


1 Machine Learning Tutorial Amit Gruber The Hebrew University of Jerusalem

2 Example: Spam Filter
Spam message: unwanted email message
– Dozens or even hundreds per day
Goal: Automatically distinguish between spam and non-spam email messages

3 Spam message 1

4 Spam message 2

5 Spam message 3

6 Spam message 4

7 How to Distinguish?
Message contents?
– Automatic semantic analysis is yet to be solved
Message sender?
– What about unfamiliar senders or fake senders?
Collection of keywords?
Message length?
Mail server?
Time of delivery?

8 How to Distinguish?
It’s hard to define an explicit set of rules to distinguish between spam and non-spam
Learn the concept of “spam” from examples!
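
Learning "spam" from examples can be sketched with a tiny word-count classifier. This is only an illustration: the slides do not name a method, so naive Bayes with add-one smoothing is an assumption here, and the four training messages and the names `train`, `classify`, `EXAMPLES` and `MODEL` are made up.

```python
import math
from collections import Counter

def train(examples):
    """examples: list of (text, label) pairs, label is 'spam' or 'ham'."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label with the higher naive Bayes log score."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # log prior of the label
        score = math.log(label_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothed word likelihood
                score += math.log(
                    (word_counts[label][word] + 1) / (n_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy training set (not from the slides)
EXAMPLES = [
    ("win a free prize now", "spam"),
    ("cheap pills free offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
MODEL = train(EXAMPLES)
print(classify("free prize offer", MODEL))
print(classify("monday meeting", MODEL))
```

No rule such as "contains the word free" is written by hand; the word statistics come entirely from the labeled examples.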

9 Example: Gender Classification

10 The Power of Learning: Real-Life Example
How much time does it take you to get to work?
– First approach: analyze your route
  Distance, traffic lights, traffic, etc.
  Can be quite complicated
– Second approach: how much time does it usually take?
  Despite some variance, works remarkably well!
  Requires “training” for different times
  May fail in special cases
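
The second approach amounts to averaging past observations per time of day. A minimal sketch, assuming made-up trip records and hypothetical names (`train_commute`, `predict_commute`, `HISTORY`):

```python
from collections import defaultdict
from statistics import mean

def train_commute(history):
    """history: list of (time_slot, minutes). Returns the mean duration per slot."""
    by_slot = defaultdict(list)
    for slot, minutes in history:
        by_slot[slot].append(minutes)
    return {slot: mean(values) for slot, values in by_slot.items()}

def predict_commute(model, slot):
    """Return the usual duration for a slot, or None if the slot was never observed."""
    return model.get(slot)

# Hypothetical past trips (not from the slides)
HISTORY = [("rush", 40), ("rush", 50), ("rush", 45), ("night", 20), ("night", 22)]
COMMUTE_MODEL = train_commute(HISTORY)
print(predict_commute(COMMUTE_MODEL, "rush"))
print(predict_commute(COMMUTE_MODEL, "holiday"))
```

The `None` result for an unseen slot mirrors the two caveats on the slide: the estimate needs training for each time of day, and it has nothing to say about special cases.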

11 Machine Translation

12 Collaborative Filtering
Collaborative Filtering: Prediction of user ratings based on the ratings of other users
Examples:
– Movie ratings
– Product recommendation
Is this of merely theoretical interest?
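
One simple way to predict a rating from other users' ratings is a similarity-weighted average. The sketch below is an assumption, not the method of the slides: it uses cosine similarity over co-rated items, and the users, items, ratings and function names are all hypothetical.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}
RATINGS = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5, "C": 2, "D": 5},
    "cat": {"A": 1, "B": 2, "C": 5, "D": 1},
}

def similarity(u, v, ratings):
    """Cosine similarity between two users over the items both have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = math.sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = math.sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)

def predict_rating(user, item, ratings):
    """Similarity-weighted average of other users' ratings for the item."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        weight = similarity(user, other, ratings)
        weighted_sum += weight * their_ratings[item]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None

print(predict_rating("ann", "D", RATINGS))
```

Because ann's ratings resemble bob's more than cat's, the prediction for item D is pulled toward bob's rating of 5.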

13 Netflix Prize
Over 100 million ratings from 480 thousand customers over 17,000 movie titles (sparsity: 0.0123)
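
The quoted figure can be checked from the rounded numbers on the slide, reading "sparsity" as the fraction of customer-movie pairs that actually have a rating (that reading is an assumption):

```python
ratings = 100_000_000   # "over 100 million ratings" (rounded)
customers = 480_000     # "480 thousand customers" (rounded)
movies = 17_000         # "17,000 movie titles" (rounded)

# Fraction of the customers x movies matrix that is filled in
filled_fraction = ratings / (customers * movies)
print(round(filled_fraction, 4))
```

In other words, roughly 1.2% of the rating matrix is observed and the rest must be predicted.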

14 Recommendation system

15 Machine Learning Applications
Search engines
Collaborative filtering (Netflix, Amazon)
Face, speech and pattern recognition
Machine translation
Natural language processing
Medical diagnosis and treatment
Bioinformatics
Computer games
Many more!

16 Generalization: Train vs. Test
The central assumption we make is that the train set and the new examples are “similar”
Formally, the assumption is that samples are drawn from the same distribution
Is this assumption realistic?
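
A small sketch of what happens when the assumption breaks. The data are made up, and the threshold rule (midpoint of the two class means) is just a convenient stand-in for a learned classifier:

```python
def fit_threshold(train):
    """train: list of (value, label), label 0 or 1. Returns the midpoint of the class means."""
    zeros = [x for x, y in train if y == 0]
    ones = [x for x, y in train if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2

def accuracy(threshold, data):
    """Fraction of points whose side of the threshold matches their label."""
    correct = sum((x > threshold) == bool(y) for x, y in data)
    return correct / len(data)

# Hypothetical data: the shifted test set has every value moved up by 4
TRAIN = [(1, 0), (2, 0), (3, 0), (7, 1), (8, 1), (9, 1)]
TEST_SAME = [(2, 0), (3, 0), (4, 0), (6, 1), (8, 1), (9, 1)]
TEST_SHIFTED = [(x + 4, y) for x, y in TEST_SAME]

threshold = fit_threshold(TRAIN)
print(accuracy(threshold, TEST_SAME))
print(accuracy(threshold, TEST_SHIFTED))
```

The same threshold is perfect on test data that resembles the train set and no better than chance once the test distribution shifts.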

17 Train vs. Test: Might Fail to Generalize

18 Acquiring a good train set
Have a huge train set
– Train data might be available on the web
– Use humans to collect data
– Collect results (or aggregations thereof) of user actions
Unsupervised methods: require only raw data, no need for labels!

19 Machine Learning Strategies
Discriminative Approach
– Feature selection: find the features that carry the most information for separation
Generative Approach
– Model the data using a generative process
– Estimate the parameters of the model

20 Supervised vs. Unsupervised
Supervised Machine Learning
– Classification (learning)
– Collection of large representative train set might not be simple
Unsupervised Machine Learning
– Clustering
  The number of clusters may be known or unknown
– Usually plenty of train data is available
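
Clustering with a known number of clusters can be illustrated with k-means on one-dimensional data. The slides do not name an algorithm, so k-means is an assumption, and the points and starting centers below are made up:

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd's algorithm in 1D: assign each point to its nearest center, then re-average."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: abs(p - centers[k]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty
        centers = [
            sum(c) / len(c) if c else centers[k] for k, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical unlabeled data with two visible groups
POINTS = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
CENTERS, CLUSTERS = kmeans_1d(POINTS, centers=[0.0, 5.0])
print(CENTERS)
print(CLUSTERS)
```

No labels are used anywhere: the two groups are recovered from the raw values alone.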

21 Discriminative Learning
Data representation and feature selection: What is relevant for classification?
– Gender classification: hair, ears, makeup, beard, moustache, etc.
Linear separation
– SVM, Fisher LDA, Perceptron and more
– Different criteria for separation: what would generalize well?
– Non-linear separation
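
Of the linear separators listed, the perceptron is the shortest to write down. A minimal sketch on made-up two-dimensional points with labels in {-1, +1}; the function names and data are hypothetical:

```python
def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """samples: list of ((x1, x2), y) with y in {-1, +1}. Returns weights w and bias b."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in samples:
            # A point is misclassified when y * (w.x + b) is not positive
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:
                w[0] += learning_rate * y * x1
                w[1] += learning_rate * y * x2
                b += learning_rate * y
                mistakes += 1
        if mistakes == 0:  # a full pass with no updates: the data are separated
            break
    return w, b

def predict_side(w, b, point):
    """Return +1 or -1 depending on which side of the line the point falls."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1

# Hypothetical linearly separable data
SAMPLES = [((2, 3), 1), ((3, 3), 1), ((-1, -2), -1), ((-2, -1), -1)]
W, B = train_perceptron(SAMPLES)
print(W, B)
```

The perceptron stops at the first separating line it finds; criteria such as the SVM margin differ in which separating line they prefer, which is the generalization question raised on the slide.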

22 Linear Separation

23 Nonlinear Separation (Kernel Trick)
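
The idea behind the kernel trick is that a kernel function returns the inner product in a higher-dimensional feature space without ever building that space. The sketch below checks this for one common choice, the degree-2 polynomial kernel in two dimensions; the slide does not say which kernel was shown, so this choice is an assumption:

```python
import math

def poly_kernel(x, y):
    """Degree-2 polynomial kernel k(x, y) = (x . y)^2 for 2D inputs."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def phi(x):
    """Explicit feature map whose inner product equals poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

x, y = (1, 2), (3, 4)
print(poly_kernel(x, y))      # computed in the original 2D space
print(dot(phi(x), phi(y)))    # same value, computed in the 3D feature space
```

A linear separator in the feature space of `phi` corresponds to a quadratic boundary in the original space, and a kernelized learner only ever needs `poly_kernel`.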

24 Generative Approach
Model the observations using a generative process
The generative process induces a distribution over the observations
Learn a set of parameters
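
As a sketch of these three steps, assume each class generates a single feature from its own Gaussian. The Gaussian model, the data and the names below are all assumptions for illustration:

```python
import math
from statistics import mean, pstdev

def fit_gaussians(samples_by_class):
    """Estimate (mean, std, class prior) for each class from its samples."""
    total = sum(len(values) for values in samples_by_class.values())
    return {
        label: (mean(values), pstdev(values), len(values) / total)
        for label, values in samples_by_class.items()
    }

def log_gaussian(x, mu, sigma):
    """Log density of a normal distribution with mean mu and std sigma."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def most_likely_class(x, params):
    """Pick the class with the highest log prior plus log likelihood."""
    return max(
        params,
        key=lambda c: math.log(params[c][2]) + log_gaussian(x, params[c][0], params[c][1]),
    )

# Hypothetical one-dimensional observations for two classes
DATA = {"a": [1.0, 2.0, 3.0], "b": [6.0, 7.0, 8.0]}
PARAMS = fit_gaussians(DATA)
print(most_likely_class(2.5, PARAMS))
print(most_likely_class(6.5, PARAMS))
```

The learned parameters describe how each class produces data, and classification falls out by asking which class makes a new observation most probable.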

25 Statistical Approach: Real-Life Example
You’re stuck in traffic. Which lane is faster?
The complicated approach:
– Consider the traffic, trucks, merging lanes, etc.
The statistical (Bayesian) approach:
– Which lane is usually faster? (prior)
– What are you seeing? (evidence)
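
Combining the prior with the evidence is an application of Bayes' rule. In the sketch below every number is invented: the prior says the left lane is usually faster, and the evidence is a truck spotted in the left lane.

```python
def posterior(prior, likelihood):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())
    return {h: p / evidence for h, p in unnormalized.items()}

# Hypothetical numbers (not from the slides)
PRIOR = {"left": 0.7, "right": 0.3}        # P(lane is the faster one)
LIKELIHOOD = {"left": 0.2, "right": 0.6}   # P(truck seen in left lane | that lane is faster)
POSTERIOR = posterior(PRIOR, LIKELIHOOD)
print(POSTERIOR)
```

With these made-up values the evidence outweighs the prior, and the right lane becomes the better bet.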

26 Summary
Machine Learning: Learn a concept from examples
For good generalization, train data has to faithfully represent test data
Many potential applications
Already in use and works remarkably well

