# From W1-S16. Node Failure


From W1-S16

Node failure. Assume each node fails independently with probability p. The probability that at least one of n nodes fails is f = 1 − (1 − p)^n. When n = 1, f = p. Suppose p = 0.0001 but n = 10000; then f = 1 − (1 − 0.0001)^10000 ≈ 0.63. [Why/how? Because (1 − p)^n ≈ e^(−np) = e^(−1) ≈ 0.37.] This is one of the most important formulas to know (in general). From W2-S9
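The numbers above can be checked directly; a minimal sketch, using the independence assumption stated on the slide:

```python
import math

# f = 1 - (1 - p)^n: probability that at least one of n nodes fails,
# assuming each node fails independently with probability p.
def prob_any_failure(p, n):
    return 1.0 - (1.0 - p) ** n

f = prob_any_failure(0.0001, 10000)
print(round(f, 4))                 # roughly 0.6321
# Why: (1 - p)^n ~ e^(-np) = e^(-1), so f ~ 1 - 1/e.
print(round(1 - math.exp(-1), 4))  # roughly 0.6321
```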

Example. Suppose the hash function maps the keys {to, Java, road} to one node. Then: (to, 1) remains (to, 1); (Java, 1); (Java, 1); (Java, 1) → (Java, [1, 1, 1]); (road, 1); (road, 1) → (road, [1, 1]). Now the REDUCE function converts (Java, [1, 1, 1]) → (Java, 3), etc. Remember, this is a very simple example… the challenge is to take complex tasks and express them as Map and Reduce! From W2-S15
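The word-count pipeline above can be sketched in plain Python; the function names (`map_phase`, `shuffle`, `reduce_phase`) are illustrative, not part of any MapReduce framework:

```python
from collections import defaultdict

# Toy word count in the MapReduce style: MAP emits (word, 1) pairs,
# the shuffle step groups values by key, and REDUCE sums each group.
def map_phase(text):
    return [(word, 1) for word in text.split()]

def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups                      # e.g. {"Java": [1, 1, 1], ...}

def reduce_phase(groups):
    return {key: sum(values) for key, values in groups.items()}

pairs = map_phase("to Java road Java road Java")
counts = reduce_phase(shuffle(pairs))
print(counts)   # {'to': 1, 'Java': 3, 'road': 2}
```

In a real cluster the shuffle is done by hashing keys to nodes, as the slide describes; here it is a single in-memory dictionary.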

Similarity Example [2]. Notice that it requires some ingenuity to come up with key-value pairs. This is key to using map-reduce effectively. From W2-S19

K-means algorithm.

- Let C = initial k cluster centroids (often selected randomly).
- Mark C as unstable.
- While C is unstable:
  - Assign all data points to their nearest centroid in C.
  - Compute the centroids of the points assigned to each element of C.
  - Update C as the set of new centroids.
  - Mark C as stable or unstable by comparing with the previous set of centroids.
- End While

Complexity: O(nkdI), where n = number of points, k = number of clusters, d = dimension, I = number of iterations. Take-away: the complexity is linear in n. From W3-S14
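The loop above can be written almost line-for-line in Python; this is a minimal 2-D sketch that assumes no cluster ever becomes empty:

```python
# Squared Euclidean distance between 2-D points.
def dist2(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def kmeans(points, centroids):
    """points, centroids: lists of (x, y) tuples; returns (centroids, clusters)."""
    while True:
        # Assign every point to its nearest centroid in C.
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))
            clusters[i].append(p)
        # Compute the centroid of each cluster (assumes no cluster is empty).
        new = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
               for c in clusters]
        # Stop when C is stable (unchanged from the previous iteration).
        if new == centroids:
            return new, clusters
        centroids = new

centroids, clusters = kmeans([(-1, 2), (1, 2), (-1, -2), (1, -2)], [(0, 2), (0, -2)])
print(centroids)   # [(0.0, 2.0), (0.0, -2.0)]
```

Each iteration touches all n points, k centroids, and d coordinates, which is where the O(nkdI) bound comes from.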

Example: 2 clusters. Points: A(-1, 2), B(1, 2), C(-1, -2), D(1, -2). K-means problem: the optimal solution is centroids (0, 2) and (0, -2), with clusters {A, B} and {C, D}. K-means algorithm: suppose the initial centroids are (-1, 0) and (1, 0); then {A, C} and {B, D} end up as the two clusters (a suboptimal local optimum). From W3-S16
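One assignment step is enough to see the initialization sensitivity on these four points, since both initializations are already fixed points of the algorithm:

```python
# The four points from the slide.
points = {"A": (-1, 2), "B": (1, 2), "C": (-1, -2), "D": (1, -2)}

def assign(centroids):
    """Assign each named point to its nearest centroid (squared distance)."""
    d2 = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    clusters = {c: [] for c in centroids}
    for name, p in points.items():
        clusters[min(centroids, key=lambda c: d2(p, c))].append(name)
    return clusters

good = assign([(0, 2), (0, -2)])    # the optimal clustering {A,B}, {C,D}
bad = assign([(-1, 0), (1, 0)])     # the suboptimal clustering {A,C}, {B,D}
print(good)
print(bad)
```

In the second case the centroids of {A, C} and {B, D} are again (-1, 0) and (1, 0), so k-means declares C stable and stops without ever finding the better solution.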

Bayes Rule: P(H | D) = P(D | H) P(H) / P(D), where P(H) is the prior and P(H | D) is the posterior. From W4-S21
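The rule can be checked numerically; the prior and likelihood values below are made up for illustration (they are not from the slide):

```python
# Hypothetical numbers: prior P(H) = 0.01, likelihoods
# P(D|H) = 0.9 and P(D|not H) = 0.05.
p_h = 0.01
p_d_given_h = 0.9
p_d_given_not_h = 0.05

# Total probability: P(D) = P(D|H)P(H) + P(D|not H)P(not H).
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)

# Bayes rule: posterior P(H|D) = P(D|H)P(H) / P(D).
posterior = p_d_given_h * p_h / p_d
print(round(posterior, 4))   # roughly 0.1538
```

Note how a rare hypothesis (prior 1%) stays fairly unlikely (about 15%) even after strong evidence, because the denominator is dominated by the false-positive term.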

Example: Iris Flower. F = Flower; SL = Sepal Length; SW = Sepal Width; PL = Petal Length; PW = Petal Width. Query row from the data: (SL, SW, PL, PW) = (Large, Small, Medium, Small), F = ? Compute the posterior for each candidate flower class and choose the maximum. From W4-S25
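A naive-Bayes version of this classification can be sketched as follows; the training rows below are hypothetical stand-ins, since the slide's data table did not survive, and the class names are just the usual Iris species:

```python
# HYPOTHETICAL training data: each row is (F, SL, SW, PL, PW).
data = [
    ("setosa",     "Small",  "Large", "Small",  "Small"),
    ("setosa",     "Small",  "Large", "Small",  "Small"),
    ("versicolor", "Medium", "Small", "Medium", "Medium"),
    ("versicolor", "Large",  "Small", "Medium", "Small"),
    ("virginica",  "Large",  "Small", "Large",  "Large"),
]

def score(flower, features):
    """Naive-Bayes score: prior P(F) times the product of P(feature | F)."""
    rows = [r for r in data if r[0] == flower]
    p = len(rows) / len(data)                      # prior P(F)
    for i, value in enumerate(features, start=1):  # conditional independence
        p *= sum(1 for r in rows if r[i] == value) / len(rows)
    return p

query = ("Large", "Small", "Medium", "Small")      # (SL, SW, PL, PW)
best = max({r[0] for r in data}, key=lambda f: score(f, query))
print(best)   # versicolor
```

"Choose the maximum" on the slide is the final `max` over class scores; P(data) is the same for every class, so it can be dropped from the comparison.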

Confusion Matrix

| | Actual Label (1) | Actual Label (-1) |
| --- | --- | --- |
| Predicted Label (1) | True Positives (N1) | False Positives (N2) |
| Predicted Label (-1) | False Negatives (N3) | True Negatives (N4) |

Label 1 is called Positive; label -1 is called Negative. Let the number of test samples be N = N1 + N2 + N3 + N4.

- True Positive Rate (TPR) = N1/(N1+N3)
- True Negative Rate (TNR) = N4/(N4+N2)
- False Positive Rate (FPR) = N2/(N2+N4)
- False Negative Rate (FNR) = N3/(N1+N3)
- Accuracy = (N1+N4)/(N1+N2+N3+N4)
- Precision = N1/(N1+N2)
- Recall = N1/(N1+N3)

From W5-S7
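The definitions above translate directly into code; the counts used below are made up for illustration:

```python
# Metrics from confusion-matrix counts: N1=TP, N2=FP, N3=FN, N4=TN.
def metrics(n1, n2, n3, n4):
    return {
        "TPR (recall)": n1 / (n1 + n3),
        "TNR": n4 / (n4 + n2),
        "FPR": n2 / (n2 + n4),
        "FNR": n3 / (n1 + n3),
        "accuracy": (n1 + n4) / (n1 + n2 + n3 + n4),
        "precision": n1 / (n1 + n2),
    }

m = metrics(40, 10, 20, 30)   # hypothetical counts, N = 100
print(m["accuracy"], m["precision"], m["TPR (recall)"])
```

Note that TPR and FNR sum to 1, as do TNR and FPR, so a confusion matrix carries exactly as much information as (TPR, FPR) plus the class sizes.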

ROC (Receiver Operating Characteristic) Curves. Generally a learning algorithm A returns a real number, but what we want is a label {1 or -1}. We can apply a threshold T: predict 1 if A(x) ≥ T, and -1 otherwise. For the scores A = 0.7, 0.6, 0.5, 0.2, 0.1, 0.09, 0.08, 0.02, 0.01 (4 of the 9 true labels are positive, 5 negative): at T = 0.1, five samples are predicted positive, giving TPR = 3/4 and FPR = 2/5; at T = 0.2, four samples are predicted positive, giving TPR = 2/4 and FPR = 2/5. Sweeping T over all scores traces out the ROC curve. From W5-S9
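The thresholding step can be sketched as below. The scores are from the slide, but the slide's true-label row was garbled, so the `labels` vector is one assumed assignment consistent with the slide's counts (4 positives, 5 negatives, TPR = 3/4 and FPR = 2/5 at T = 0.1):

```python
scores = [0.7, 0.6, 0.5, 0.2, 0.1, 0.09, 0.08, 0.02, 0.01]
labels = [1, 1, -1, -1, 1, 1, -1, -1, -1]   # assumed, not from the slide

def tpr_fpr(scores, labels, t):
    """Apply threshold t (predict 1 iff score >= t) and return (TPR, FPR)."""
    preds = [1 if s >= t else -1 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == -1 for p, y in zip(preds, labels))
    return tp / labels.count(1), fp / labels.count(-1)

print(tpr_fpr(scores, labels, 0.1))   # (0.75, 0.4)
print(tpr_fpr(scores, labels, 0.2))   # (0.5, 0.4)
```

Evaluating `tpr_fpr` at every distinct score gives the (FPR, TPR) points of the ROC curve.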

Random Variable. A random variable X can take values in a set which is:

- Discrete and finite. Toss a coin and let X = 1 if it's a head and X = 0 if it's a tail; X is a random variable.
- Discrete and infinite (countable). Let X be the number of accidents in Sydney in a day; then X = 0, 1, 2, …
- Infinite (uncountable). Let X be the height of a Sydney-sider; X can be 150, 150.11, 150.112, …

From W5-S13

From W7-S2. These slides are from Tan, Steinbach and Kumar.

From W7-S7

From W7-S8

From W9-S9

From W9-S12

From W9-S21

From W9-S26

The Key Idea. Decompose the User × Movie rating matrix into: User × Movie = (User × Genre) × (Genre × Movie), where the number of genres is typically small. In other words, R ≈ UV. Find U and V such that ||R − UV|| is minimized… Almost like k-means clustering… why? From W11-S9
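Minimizing ||R − UV|| can be sketched with plain gradient descent; this toy uses a small rank-1 "ratings" matrix and one latent "genre" (k = 1), and the hyperparameters (`steps`, `lr`) are illustrative choices, not from the slides:

```python
import random

def uv_factorize(R, k, steps=5000, lr=0.01, seed=0):
    """Approximate R (n x m) as U (n x k) times V (k x m) by
    gradient descent on the squared error ||R - UV||^2."""
    rng = random.Random(seed)
    n, m = len(R), len(R[0])
    U = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(k)]
    for _ in range(steps):
        for i in range(n):
            for j in range(m):
                err = R[i][j] - sum(U[i][t] * V[t][j] for t in range(k))
                for t in range(k):
                    u = U[i][t]                 # keep old value for V's update
                    U[i][t] += lr * err * V[t][j]
                    V[t][j] += lr * err * u
    return U, V

R = [[1, 2, 3], [2, 4, 6], [3, 6, 9]]          # toy rank-1 matrix
U, V = uv_factorize(R, k=1)
approx = [[sum(U[i][t] * V[t][j] for t in range(1)) for j in range(3)]
          for i in range(3)]
print([[round(x, 2) for x in row] for row in approx])   # close to R
```

The k-means analogy: in both cases we alternate between fitting assignments/factors and re-minimizing a squared-error objective, and both can get stuck in local optima.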

UV Computation…. From W11-S15. This example is from Rajaraman, Leskovec and Ullman; see the textbook.
