1
From W1-S16
2
Node failure
The probability that at least one node fails is: f = 1 - (1 - p)^n.
When n = 1, f = p.
Suppose p = 0.0001 but n = 10000; then f = 1 - (1 - 0.0001)^10000 ≈ 0.63. [why/how?]
Because (1 - 1/n)^n → e^(-1) as n grows, so when p ≈ 1/n we get f → 1 - 1/e ≈ 0.63.
This is one of the most important formulas to know (in general).
From W2-S9
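A quick numeric check of the formula (a minimal Python sketch; the function name is ours):

```python
import math

def prob_any_failure(p: float, n: int) -> float:
    """Probability that at least one of n nodes fails,
    assuming each node fails independently with probability p."""
    return 1 - (1 - p) ** n

print(prob_any_failure(0.0001, 1))       # ~0.0001, i.e. f = p when n = 1
print(prob_any_failure(0.0001, 10000))   # ~0.632
print(1 - math.exp(-1))                  # ~0.632: (1 - 1/n)^n -> 1/e, so f -> 1 - 1/e
```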
3
Example
For example, suppose the hash function maps {to, Java, road} to one node. Then:
– (to,1) remains (to,1)
– (Java,1); (Java,1); (Java,1) → (Java, [1,1,1])
– (road,1); (road,1) → (road, [1,1])
Now the REDUCE function converts (Java, [1,1,1]) → (Java, 3), etc.
Remember, this is a very simple example… the challenge is to take complex tasks and express them as Map and Reduce!
From W2-S15
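A minimal sketch of this word count in plain Python, simulating the framework's shuffle step by grouping values by key (names like map_fn are ours, not a real MapReduce API):

```python
from collections import defaultdict

def map_fn(document):
    """MAP: emit (word, 1) for every word in the document."""
    for word in document.split():
        yield (word, 1)

def reduce_fn(word, counts):
    """REDUCE: sum the list of 1s for each word."""
    return (word, sum(counts))

docs = ["to Java road", "Java Java road"]
groups = defaultdict(list)                 # shuffle: group values by key
for doc in docs:
    for word, one in map_fn(doc):
        groups[word].append(one)           # e.g. Java -> [1, 1, 1]

print([reduce_fn(w, c) for w, c in groups.items()])
# [('to', 1), ('Java', 3), ('road', 2)]
```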
4
Similarity Example [2]
Notice, it requires some ingenuity to come up with key-value pairs. This is key to using map-reduce effectively; one illustration follows below.
From W2-S19
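The slide's own example is not reproduced here, but as one hypothetical illustration of choosing key-value pairs cleverly: to count how often two items co-occur (a basic similarity signal), make the item pair itself the key:

```python
from collections import defaultdict
from itertools import combinations

def map_fn(basket):
    """MAP: emit ((item_a, item_b), 1) for each co-occurring pair;
    sorting makes the pair key canonical."""
    for a, b in combinations(sorted(set(basket)), 2):
        yield ((a, b), 1)

baskets = [["milk", "bread"], ["milk", "bread", "eggs"], ["milk", "eggs"]]
counts = defaultdict(int)
for basket in baskets:
    for pair, one in map_fn(basket):
        counts[pair] += one                # REDUCE: sum per pair key

print(dict(counts))
# {('bread', 'milk'): 2, ('bread', 'eggs'): 1, ('eggs', 'milk'): 2}
```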
5
K-means algorithm
Let C = initial k cluster centroids (often selected randomly).
Mark C as unstable.
While C is unstable:
– Assign all data points to their nearest centroid in C.
– Compute the centroids of the points assigned to each element of C.
– Update C as the set of new centroids.
– Mark C as stable or unstable by comparing with the previous set of centroids.
End While
Complexity: O(nkdI), where n = number of points, k = number of clusters, d = dimension, I = number of iterations.
Take away: the complexity is linear in n. A Python transcription follows below.
From W3-S14
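A direct transcription of the pseudocode (a sketch, assuming points are tuples of numbers; not optimized):

```python
import random

def kmeans(points, k, max_iters=100):
    """Plain k-means; each iteration costs O(n*k*d), for I iterations total."""
    C = random.sample(points, k)                     # initial centroids
    clusters = []
    for _ in range(max_iters):
        # Assign all data points to their nearest centroid in C.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, C[j])))
            clusters[j].append(p)
        # Compute the centroids of the points assigned to each element of C;
        # keep the old centroid if a cluster is empty.
        new_C = [tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else C[j]
                 for j, cl in enumerate(clusters)]
        if new_C == C:                               # C is stable: stop
            break
        C = new_C
    return C, clusters
```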
6
Example: 2 Clusters
[Figure: four points A(-1,2), B(1,2), C(-1,-2), D(1,-2) plotted around the origin (0,0).]
K-means problem: the optimal solution is centroids (0,2) and (0,-2), with clusters {A,B} and {C,D}.
K-means algorithm: suppose the initial centroids are (-1,0) and (1,0); then {A,C} and {B,D} end up as the two clusters.
From W3-S16
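Running the assignment/update steps by hand from the bad initialization confirms the slide's point (a small self-contained check):

```python
points = {"A": (-1, 2), "B": (1, 2), "C": (-1, -2), "D": (1, -2)}
C = [(-1.0, 0.0), (1.0, 0.0)]                # the unlucky initial centroids

def nearest(p, centroids):
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))

for step in range(3):
    clusters = [[], []]
    for name, p in points.items():
        clusters[nearest(p, C)].append(name)
    C = [tuple(sum(points[n][i] for n in cl) / len(cl) for i in range(2))
         for cl in clusters]
    print(step, clusters, C)
# Every iteration yields {A, C} and {B, D} with centroids (-1, 0) and (1, 0):
# k-means is stuck in a local optimum, not the global solution {A, B}, {C, D}.
```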
7
Bayes Rule
P(H | D) = P(D | H) · P(H) / P(D)
P(H) is the prior; P(H | D) is the posterior.
From W4-S21
8
Example: Iris Flower
F = Flower; SL = Sepal Length; SW = Sepal Width; PL = Petal Length; PW = Petal Width
[Training data table lost in extraction; the query instance is SL = Large, SW = Small, PL = Medium, PW = Small.]
Compute the posterior for each flower class and choose the maximum.
From W4-S25
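The training table did not survive extraction, but the computation it illustrates is the usual naive Bayes one: multiply the class prior by the per-attribute conditional probabilities and pick the largest product. A sketch with made-up rows (all data below is hypothetical, not the slide's):

```python
from collections import Counter

def naive_bayes_scores(rows, query):
    """rows: list of (features_dict, label); query: features_dict.
    Returns P(F) * prod_attr P(attr = value | F) for every class F."""
    class_counts = Counter(label for _, label in rows)
    scores = {}
    for label, n in class_counts.items():
        score = n / len(rows)                            # prior P(F)
        for attr, value in query.items():
            matches = sum(1 for feats, l in rows
                          if l == label and feats[attr] == value)
            score *= matches / n                         # P(attr = value | F)
        scores[label] = score
    return scores

# Hypothetical training rows in the slide's format (SL, SW, PL, PW -> species).
rows = [({"SL": "Large", "SW": "Small", "PL": "Medium", "PW": "Small"}, "versicolor"),
        ({"SL": "Small", "SW": "Small", "PL": "Small",  "PW": "Small"}, "setosa"),
        ({"SL": "Large", "SW": "Small", "PL": "Large",  "PW": "Large"}, "virginica"),
        ({"SL": "Large", "SW": "Small", "PL": "Medium", "PW": "Small"}, "versicolor")]
query = {"SL": "Large", "SW": "Small", "PL": "Medium", "PW": "Small"}
scores = naive_bayes_scores(rows, query)
print(max(scores, key=scores.get))       # choose the maximum -> 'versicolor'
```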
9
Confusion Matrix

                      Actual Label (1)      Actual Label (-1)
Predicted Label (1)   True Positive (N1)    False Positive (N2)
Predicted Label (-1)  False Negative (N3)   True Negative (N4)

Label 1 is called Positive; label -1 is called Negative. Let the number of test samples be N = N1 + N2 + N3 + N4.
True Positive Rate (TPR) = N1 / (N1 + N3)
True Negative Rate (TNR) = N4 / (N4 + N2)
False Positive Rate (FPR) = N2 / (N2 + N4)
False Negative Rate (FNR) = N3 / (N1 + N3)
Accuracy = (N1 + N4) / (N1 + N2 + N3 + N4)
Precision = N1 / (N1 + N2)
Recall = N1 / (N1 + N3)
From W5-S7
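All of these follow mechanically from the four cells, as in this sketch (the counts here are made up):

```python
def rates(n1, n2, n3, n4):
    """n1 = TP, n2 = FP, n3 = FN, n4 = TN (the slide's N1..N4)."""
    return {
        "TPR (Recall)": n1 / (n1 + n3),
        "TNR":          n4 / (n4 + n2),
        "FPR":          n2 / (n2 + n4),
        "FNR":          n3 / (n1 + n3),
        "Accuracy":     (n1 + n4) / (n1 + n2 + n3 + n4),
        "Precision":    n1 / (n1 + n2),
    }

print(rates(n1=40, n2=10, n3=5, n4=45))
# TPR ~0.889, TNR ~0.818, FPR ~0.182, FNR ~0.111, Accuracy 0.85, Precision 0.8
```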
10
ROC (Receiver Operating Characteristic) Curves
Generally a learning algorithm A will return a real number, but what we want is a label in {1, -1}. We can apply a threshold T.

Score from A:  0.7  0.6  0.5  0.2  0.1  0.09  0.08  0.02  0.01
True label:     1    1   -1   -1    1   -1    -1     1    -1
T = 0.1:        1    1    1    1    1   -1    -1    -1    -1    → TPR = 3/4, FPR = 2/5
T = 0.2:        1    1    1    1   -1   -1    -1    -1    -1    → TPR = 2/4, FPR = 2/5
From W5-S9
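A sketch of how one point on the ROC curve is computed (the helper name is ours; the labels are those reconstructed in the table above from the stated rates):

```python
def tpr_fpr(scores, labels, T):
    """Predict 1 when score >= T; labels are in {1, -1}."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= T and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= T and y == -1)
    return tp / labels.count(1), fp / labels.count(-1)

scores = [0.7, 0.6, 0.5, 0.2, 0.1, 0.09, 0.08, 0.02, 0.01]
labels = [1, 1, -1, -1, 1, -1, -1, 1, -1]
print(tpr_fpr(scores, labels, T=0.1))    # (0.75, 0.4): TPR = 3/4, FPR = 2/5
print(tpr_fpr(scores, labels, T=0.2))    # (0.5, 0.4):  TPR = 2/4, FPR = 2/5
# Sweeping T across all scores traces out the full ROC curve.
```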
11
Random Variable
A random variable X can take values in a set which is:
– Discrete and finite: toss a coin and let X = 1 if it's a head and X = 0 if it's a tail. X is a random variable.
– Discrete and infinite (countable): let X be the number of accidents in Sydney in a day; then X = 0, 1, 2, …
– Infinite (uncountable): let X be the height of a Sydney-sider; X = 150, 150.11, 150.112, …
From W5-S13
12
From W7-S2 These slides are from Tan, Steinbach and Kumar
13
From W7-S7
14
From W7-S8
15
From W9-S9
16
From W9-S12
17
From W9-S21
18
From W9-S26
19
The Key Idea
Decompose the User x Movie rating matrix into:
User x Movie = (User x Genre) x (Genre x Movie)
– The number of genres is typically small.
In other words, R ≈ UV. Find U and V such that ||R - UV|| is minimized…
– Almost like k-means clustering… why?
A numpy sketch follows below.
From W11-S9
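A minimal numpy sketch of minimizing ||R - UV|| on the observed entries by alternating gradient steps (the learning rate and step count are ad hoc choices for this toy matrix). The alternation between improving U with V fixed and vice versa is what gives it a k-means-like assign/update flavor:

```python
import numpy as np

def factorize(R, k, steps=5000, lr=0.01):
    """Find U (users x k) and V (k x movies) minimizing ||R - UV||
    over the observed entries; NaN marks a missing rating."""
    mask = ~np.isnan(R)
    R0 = np.nan_to_num(R)
    rng = np.random.default_rng(0)
    U = 0.1 * rng.standard_normal((R.shape[0], k))
    V = 0.1 * rng.standard_normal((k, R.shape[1]))
    for _ in range(steps):
        E = mask * (R0 - U @ V)          # error on observed entries only
        U += lr * E @ V.T                # gradient step on U (V fixed)
        E = mask * (R0 - U @ V)
        V += lr * U.T @ E                # gradient step on V (U fixed)
    return U, V

R = np.array([[5.0, 4.0, np.nan],
              [4.0, np.nan, 1.0],
              [1.0, 2.0, 5.0]])
U, V = factorize(R, k=2)
print(np.round(U @ V, 1))   # observed ratings are fit; NaN cells get predictions
```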
20
UV Computation…
This example is from Rajaraman, Leskovec and Ullman: see the textbook.
From W11-S15