
1 Machine Learning Overview Tamara Berg CS 560 Artificial Intelligence Many slides throughout the course adapted from Svetlana Lazebnik, Dan Klein, Stuart Russell, Andrew Moore, Percy Liang, Luke Zettlemoyer, Rob Pless, Killian Weinberger, Deva Ramanan

2 Announcements HW3 due Oct 29, 11:59pm

3 Machine learning Image source: https://www.coursera.org/course/ml

4 Machine learning Definition –Getting a computer to do well on a task without explicitly programming it –Improving performance on a task based on experience

5 Big Data!

6 What is machine learning? Computer programs that can learn from data Two key components –Representation: how should we represent the data? –Generalization: the system should generalize from its past experience (observed data items) to perform well on unseen data items.

7 Types of ML algorithms Unsupervised –Algorithms operate on unlabeled examples Supervised –Algorithms operate on labeled examples Semi/Partially-supervised –Algorithms combine both labeled and unlabeled examples

9 Clustering –The assignment of objects into groups (aka clusters) so that objects in the same cluster are more similar to each other than to objects in different clusters. –Clustering is a common technique for statistical data analysis, used in many fields, including machine learning, data mining, pattern recognition, image analysis and bioinformatics.

10 Common measures of similarity between data points: Euclidean distance, angle between data vectors, etc.

12 K-means clustering Want to minimize the sum of squared Euclidean distances between points x_i and their nearest cluster centers m_k
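Spelled out, the objective described on this slide is (a standard reconstruction in the slide's notation, with K clusters):

```latex
D = \sum_{k=1}^{K} \; \sum_{x_i \in \text{cluster } k} \left\lVert x_i - m_k \right\rVert^2
```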

24 K-means x’s indicate the initialization of the 3 cluster centers. Iterate until convergence: 1) Assign each data point to its nearest cluster center 2) Update each cluster center to the mean of its assigned points

27 Flat vs Hierarchical Clustering Flat algorithms –Usually start with a random partitioning of documents into groups –Refine iteratively –Main algorithm: k-means Hierarchical algorithms –Create a hierarchy –Bottom-up: agglomerative –Top-down: divisive

28 Hierarchical clustering strategies Agglomerative clustering Start with each data point in a separate cluster At each iteration, merge two of the “closest” clusters Divisive clustering Start with all data points grouped into a single cluster At each iteration, split the “largest” cluster
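The agglomerative strategy above can be sketched in a few lines of pure Python. This is an illustrative single-linkage variant ("closest" = closest pair of members, one common choice; the slide does not fix a linkage rule):

```python
def agglomerative(points, target_k):
    """Bottom-up clustering as on the slide: start with each point in its
    own cluster, then repeatedly merge the two closest clusters
    (single linkage) until target_k clusters remain."""
    clusters = [[p] for p in points]  # every point starts alone

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of members.
        return min(dist2(p, q) for p in c1 for q in c2)

    while len(clusters) > target_k:
        # Find the closest pair of clusters and merge them.
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Recording the sequence of merges (rather than stopping at target_k) is what yields the full hierarchy of clusterings mentioned on the next slide.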

29 Produces a hierarchy of clusterings

31 Divisive Clustering Top-down (instead of bottom-up as in Agglomerative Clustering) Start with all data points in one big cluster Then recursively split clusters Eventually each data point forms a cluster on its own.

32 Flat or hierarchical clustering? For high efficiency, use flat clustering (e.g. k-means) For deterministic results: hierarchical clustering When a hierarchical structure is desired: hierarchical algorithm Hierarchical clustering can also be applied if K cannot be predetermined (can start without knowing K)

33 Clustering in Action – example from computer vision

34 Recall: Bag of Words Representation  Represent document as a “bag of words”
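The "bag of words" representation recalled here discards word order and keeps only counts; a minimal sketch (illustrative, not from the lecture):

```python
from collections import Counter

def bag_of_words(document):
    """Represent a document as an unordered multiset (bag) of words:
    word order is thrown away, only per-word counts are kept."""
    words = document.lower().split()
    return Counter(words)
```

Because order is discarded, "the cat chased the dog" and "the dog chased the cat" map to the same bag.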

35 Bag of features for images  Represent images as a “bag of words”

36 Bags of features for image classification 1. Extract features

37 Bags of features for image classification 2. Learn “visual vocabulary”

38 Bags of features for image classification 1. Extract features 2. Learn “visual vocabulary” 3. Represent images by frequencies of “visual words”
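Step 3 of the pipeline above can be sketched as vector quantization: each local feature is mapped to its nearest "visual word" (a cluster center learned in step 2), and the image becomes a histogram of word counts. An illustrative sketch with hypothetical names:

```python
def image_histogram(features, vocabulary):
    """Step 3: map each local feature vector to its nearest visual word
    (a cluster center from step 2) and count how often each word occurs."""
    hist = [0] * len(vocabulary)
    for f in features:
        nearest = min(range(len(vocabulary)),
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(f, vocabulary[j])))
        hist[nearest] += 1
    return hist
```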

39 1. Feature extraction

40 2. Learning the visual vocabulary

41 Clustering

42 2. Learning the visual vocabulary: clustering the extracted features yields the visual vocabulary (the cluster centers are the visual words)

43 Example visual vocabulary Fei-Fei et al. 2005

44 3. Image representation: a histogram of visual word frequencies

45 Types of ML algorithms Unsupervised –Algorithms operate on unlabeled examples Supervised –Algorithms operate on labeled examples Semi/Partially-supervised –Algorithms combine both labeled and unlabeled examples

50 Example: Sentiment analysis http://gigaom.com/2013/10/03/stanford-researchers-to-open-source-model-they-say-has-nailed-sentiment-analysis/ http://nlp.stanford.edu:8080/sentiment/rntnDemo.html

51 Example: Image classification – input: images; desired output: the correct class label (apple, pear, tomato, cow, dog, horse)

52 http://yann.lecun.com/exdb/mnist/index.html

53 Example: Seismic data – plotting body wave magnitude against surface wave magnitude separates nuclear explosions from earthquakes

55 The basic classification framework y = f(x), where x is the input, f the classification function, and y the output. Learning: given a training set of labeled examples {(x_1,y_1), …, (x_N,y_N)}, estimate the parameters of the prediction function f. Inference: apply f to a never-before-seen test example x and output the predicted value y = f(x).
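The learning/inference split can be made concrete with the simplest possible f, a 1-nearest-neighbor classifier (chosen here purely for illustration; the slide does not commit to any particular classifier, and the function names are my own):

```python
def learn(training_set):
    """'Learning': estimate the parameters of f from labeled pairs.
    For 1-nearest-neighbor, the 'parameters' are just the stored examples."""
    return list(training_set)

def infer(f, x):
    """'Inference': apply f to a never-before-seen input x, output y = f(x).
    Here: return the label of the closest stored training example."""
    nearest_x, nearest_y = min(
        f, key=lambda xy: sum((a - b) ** 2 for a, b in zip(xy[0], x)))
    return nearest_y
```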

56 Naïve Bayes Classification Predict the class that maximizes the posterior under the naïve conditional-independence assumption: y* = argmax_y P(y) ∏_i P(x_i | y)
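That argmax can be sketched for word-count features with add-one (Laplace) smoothing; this is a standard textbook construction, not code from the lecture:

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """Estimate P(y) and P(word | y) from (word_list, label) pairs."""
    class_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)   # label -> word -> count
    vocab = set()
    for words, label in examples:
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab

def predict_nb(model, words):
    """Return argmax_y log P(y) + sum_i log P(w_i | y), with add-one
    smoothing so unseen words get nonzero probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, float("-inf")
    for label, cc in class_counts.items():
        score = math.log(cc / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in words:
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best
```

Working in log space avoids underflow when multiplying many small probabilities.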

57 Example: Image classification. Input: image → Representation → Classifier (e.g. Naïve Bayes, neural net, etc.) → Output: predicted label (“Car”)

58 Example: Training and testing Key challenge: generalization to unseen examples. Training set (labels known); test set (labels unknown)

60 Some classification methods Nearest neighbor (scales to 10^6 examples): Shakhnarovich, Viola, Darrell 2003; Berg, Berg, Malik 2005; … Neural networks: LeCun, Bottou, Bengio, Haffner 1998; Rowley, Baluja, Kanade 1998; … Support Vector Machines and kernels: Guyon, Vapnik; Heisele, Serre, Poggio 2001; … Conditional Random Fields: McCallum, Freitag, Pereira 2000; Kumar, Hebert 2003; …

61 Classification … more soon

62 Types of ML algorithms Unsupervised –Algorithms operate on unlabeled examples Supervised –Algorithms operate on labeled examples Semi/Partially-supervised –Algorithms combine both labeled and unlabeled examples

63 Supervised learning has many successes: recognizing speech, steering a car, classifying documents, classifying proteins, recognizing faces and objects in images, … Slide Credit: Avrim Blum

64 However, for many problems labeled data can be rare or expensive: you need to pay someone to produce it, or it requires special testing, … Unlabeled data is much cheaper. Slide Credit: Avrim Blum

65 Domains where labels are rare or expensive but unlabeled data is plentiful and cheap: speech, images, medical outcomes, customer modeling, protein sequences, web pages. Slide Credit: Avrim Blum

66 [Figure from Jerry Zhu] Slide Credit: Avrim Blum

67 Can we make use of cheap unlabeled data? Slide Credit: Avrim Blum

68 Semi-Supervised Learning Can we use unlabeled data to augment a small labeled sample and improve learning? Unlabeled data is missing the most important information (the labels), but it may still contain useful regularities that we can exploit. Slide Credit: Avrim Blum
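One standard family of answers to the question above is self-training: fit on the labeled data, pseudo-label the unlabeled points the model is confident about, and refit. This is a hedged illustration only; the slides do not say which semi-supervised method the course covers, and the callbacks (fit, predict_with_conf) are hypothetical caller-supplied functions:

```python
def self_train(labeled, unlabeled, fit, predict_with_conf,
               threshold=0.9, rounds=5):
    """Self-training sketch: repeatedly adopt high-confidence
    pseudo-labels from the unlabeled pool and refit the model.
    labeled: list of (x, y); unlabeled: list of x."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = fit(labeled)
    for _ in range(rounds):
        confident, rest = [], []
        for x in pool:
            y, conf = predict_with_conf(model, x)
            (confident if conf >= threshold else rest).append((x, y))
        if not confident:
            break                      # nothing left we trust; stop
        labeled += confident           # adopt confident pseudo-labels
        pool = [x for x, _ in rest]
        model = fit(labeled)
    return model
```

The risk, matching the "But…" on the slide, is that confident-but-wrong pseudo-labels get baked into later rounds of training.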

