Presentation is loading. Please wait.

Presentation is loading. Please wait.

StatQuest: t-SNE Clearly Expalined!!!!

Similar presentations


Presentation on theme: "StatQuest: t-SNE Clearly Expalined!!!!"— Presentation transcript:

1 StatQuest: t-SNE Clearly Expalined!!!!
Copyright 2017 Joshua Starmer,

2 Copyright 2017 Joshua Starmer, https://statquest.org

3 Copyright 2017 Joshua Starmer, https://statquest.org
Here’s a basic 2-D scatter plot. Copyright 2017 Joshua Starmer,

4 Copyright 2017 Joshua Starmer, https://statquest.org
Here’s a basic 2-D scatter plot. Let’s do a walk through of how t-SNE would transform this graph… Copyright 2017 Joshua Starmer,

5 Copyright 2017 Joshua Starmer, https://statquest.org
Here’s a basic 2-D scatter plot. Let’s do a walk through of how t-SNE would transform this graph… …into a flat, 1-D plot on a number line. Copyright 2017 Joshua Starmer,

6 Copyright 2017 Joshua Starmer, https://statquest.org
NOTE: If we just projected the data onto one of the axes, we’d just get a big mess that doesn’t preserve the original clustering. Copyright 2017 Joshua Starmer,

7 Copyright 2017 Joshua Starmer, https://statquest.org
Instead of two distinct clusters, we just see a mishmash. Copyright 2017 Joshua Starmer,

8 Copyright 2017 Joshua Starmer, https://statquest.org
Same here... Copyright 2017 Joshua Starmer,

9 Copyright 2017 Joshua Starmer, https://statquest.org
What t-SNE does is find a way to project data into a low dimensional space (in this case, the 1-D number line) so that the clustering in the high dimensional space (in this case, the 2-D scatter plot) is preserved. Copyright 2017 Joshua Starmer,

10 So let’s step through the basic ideas of how t-SNE does this.
Copyright 2017 Joshua Starmer,

11 Copyright 2017 Joshua Starmer, https://statquest.org
We’ll start start with the original scatter plot… Copyright 2017 Joshua Starmer,

12 Copyright 2017 Joshua Starmer, https://statquest.org
We’ll start start with the original scatter plot… … then we’ll put the points on the number line in a random order. Copyright 2017 Joshua Starmer,

13 Copyright 2017 Joshua Starmer, https://statquest.org
From here on out, t-SNE moves these points, a little bit at a time, until it has clustered them. Copyright 2017 Joshua Starmer,

14 Copyright 2017 Joshua Starmer, https://statquest.org
Let’s figure out where to move this first point… Copyright 2017 Joshua Starmer,

15 Copyright 2017 Joshua Starmer, https://statquest.org
Let’s figure out where to move this first point… Should it move a little to the left or a little to the right? Copyright 2017 Joshua Starmer,

16 Copyright 2017 Joshua Starmer, https://statquest.org
Because it is part of this cluster… Copyright 2017 Joshua Starmer,

17 Copyright 2017 Joshua Starmer, https://statquest.org
Because it is part of this cluster… …it wants to move closer to these points. Copyright 2017 Joshua Starmer,

18 Copyright 2017 Joshua Starmer, https://statquest.org
But at the same time, these points… Copyright 2017 Joshua Starmer,

19 Copyright 2017 Joshua Starmer, https://statquest.org
But at the same time, these points… …are far away in the scatter plot… Copyright 2017 Joshua Starmer,

20 Copyright 2017 Joshua Starmer, https://statquest.org
But at the same time, these points… …are far away in the scatter plot… …so they push back. Copyright 2017 Joshua Starmer,

21 Copyright 2017 Joshua Starmer, https://statquest.org
These points attract… Copyright 2017 Joshua Starmer,

22 Copyright 2017 Joshua Starmer, https://statquest.org
These points attract… …while these points repel. Copyright 2017 Joshua Starmer,

23 Copyright 2017 Joshua Starmer, https://statquest.org
In this case, the attraction is strongest, so the point moves a little to the right. Copyright 2017 Joshua Starmer,

24 Copyright 2017 Joshua Starmer, https://statquest.org
In this case, the attraction is strongest, so the point moves a little to the right. Copyright 2017 Joshua Starmer,

25 Copyright 2017 Joshua Starmer, https://statquest.org
BAM! Copyright 2017 Joshua Starmer,

26 Copyright 2017 Joshua Starmer, https://statquest.org
Now let’s move this point a little bit… Copyright 2017 Joshua Starmer,

27 Copyright 2017 Joshua Starmer, https://statquest.org
These points attract… Copyright 2017 Joshua Starmer,

28 Copyright 2017 Joshua Starmer, https://statquest.org
These points attract… …and this point repels a little bit. Copyright 2017 Joshua Starmer,

29 Copyright 2017 Joshua Starmer, https://statquest.org
So it moves a little to closer to the other orange points. Copyright 2017 Joshua Starmer,

30 Copyright 2017 Joshua Starmer, https://statquest.org
Double BAM! Copyright 2017 Joshua Starmer,

31 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

32 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

33 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

34 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

35 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

36 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

37 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

38 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

39 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

40 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

41 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

42 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

43 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

44 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

45 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

46 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

47 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

48 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

49 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

50 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

51 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

52 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

53 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

54 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

55 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

56 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

57 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

58 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

59 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

60 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

61 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

62 Copyright 2017 Joshua Starmer, https://statquest.org
At each step, a point on the line is attracted to points it is near in the scatter plot, and repelled by points it is far from… Copyright 2017 Joshua Starmer,

63 Copyright 2017 Joshua Starmer, https://statquest.org
Triple BAM!!!!! Copyright 2017 Joshua Starmer,

64 Copyright 2017 Joshua Starmer, https://statquest.org
Now that we’ve seen the what t-SNE tries to do, let’s dive into the nitty-gritty details of how it does what it does. Copyright 2017 Joshua Starmer,

65 Copyright 2017 Joshua Starmer, https://statquest.org
Step 1: Determine the “similarity” of all the points in the scatter plot. Copyright 2017 Joshua Starmer,

66 Copyright 2017 Joshua Starmer, https://statquest.org
Step 1: Determine the “similarity” of all the points in the scatter plot. For this example, let’s focus on determining the similarities between this point and all of the other points. Copyright 2017 Joshua Starmer,

67 Copyright 2017 Joshua Starmer, https://statquest.org
First, measure the distance between two points… Copyright 2017 Joshua Starmer,

68 Copyright 2017 Joshua Starmer, https://statquest.org
First, measure the distance between two points… Then plot that distance on a normal curve that is centered on the point of interest… Copyright 2017 Joshua Starmer,

69 Copyright 2017 Joshua Starmer, https://statquest.org
First, measure the distance between two points… Then plot that distance on a normal curve that is centered on the point of interest… …lastly, draw a line from the point to the curve. The length of that line is the “unscaled similarity”. Copyright 2017 Joshua Starmer,

70 Copyright 2017 Joshua Starmer, https://statquest.org
First, measure the distance between two points… Then plot that distance on a normal curve that is centered on the point of interest… …lastly, draw a line from the point to the curve. The length of that line is the “unscaled similarity”. (I made that terminology up, but it will make sense in just a bit!) Copyright 2017 Joshua Starmer,

71 Copyright 2017 Joshua Starmer, https://statquest.org
Now we calculate the “unscaled similarity” for this pair of points. Copyright 2017 Joshua Starmer,

72 Copyright 2017 Joshua Starmer, https://statquest.org
Now we calculate the “unscaled similarity” for this pair of points. Copyright 2017 Joshua Starmer,

73 Copyright 2017 Joshua Starmer, https://statquest.org
Now we calculate the “unscaled similarity” for this pair of points. Copyright 2017 Joshua Starmer,

74 Copyright 2017 Joshua Starmer, https://statquest.org
Now we calculate the “unscaled similarity” for this pair of points. Copyright 2017 Joshua Starmer,

75 Copyright 2017 Joshua Starmer, https://statquest.org
Now we calculate the “unscaled similarity” for this pair of points. Copyright 2017 Joshua Starmer,

76 Copyright 2017 Joshua Starmer, https://statquest.org
Now we calculate the “unscaled similarity” for this pair of points. Copyright 2017 Joshua Starmer,

77 Copyright 2017 Joshua Starmer, https://statquest.org
Now we calculate the “unscaled similarity” for this pair of points. Etc. etc… Copyright 2017 Joshua Starmer,

78 Copyright 2017 Joshua Starmer, https://statquest.org
Using a normal distribution means that distant points have very low similarity values…. Copyright 2017 Joshua Starmer,

79 Copyright 2017 Joshua Starmer, https://statquest.org
… and close points have high similarity values. Copyright 2017 Joshua Starmer,

80 Copyright 2017 Joshua Starmer, https://statquest.org
Ultimately, we measure the distances between all of the points and the point of interest… Copyright 2017 Joshua Starmer,

81 Copyright 2017 Joshua Starmer, https://statquest.org
Ultimately, we measure the distances between all of the points and the point of interest… Plot them on the normal curve… Copyright 2017 Joshua Starmer,

82 Copyright 2017 Joshua Starmer, https://statquest.org
Ultimately, we measure the distances between all of the points and the point of interest… Plot them on the normal curve… …and then measure the distances from the points to the curve to get the unscaled similarity scores with respect to the point of interest. Copyright 2017 Joshua Starmer,

83 Copyright 2017 Joshua Starmer, https://statquest.org
The next step is to scale the unscaled similarities so that they add up to 1. Copyright 2017 Joshua Starmer,

84 Umm… Why do the similarity scores need to add up to 1?
The next step is to scale the unscaled similarities so that they add up to 1. Umm… Why do the similarity scores need to add up to 1? Copyright 2017 Joshua Starmer,

85 Copyright 2017 Joshua Starmer, https://statquest.org
It has to do with something I didn’t tell you earlier… Copyright 2017 Joshua Starmer,

86 Copyright 2017 Joshua Starmer, https://statquest.org
It has to do with something I didn’t tell you earlier… …and to illustrate the concept, I need to add a cluster that is half as dense as the others. Copyright 2017 Joshua Starmer,

87 Copyright 2017 Joshua Starmer, https://statquest.org
The width of the normal curve depends on the density of data near the point of interest. Copyright 2017 Joshua Starmer,

88 Copyright 2017 Joshua Starmer, https://statquest.org
The width of the normal curve depends on the density of data near the point of interest. Less dense regions have wider curves. Copyright 2017 Joshua Starmer,

89 Copyright 2017 Joshua Starmer, https://statquest.org
…so if these points… have half the density as these points… Copyright 2017 Joshua Starmer,

90 Copyright 2017 Joshua Starmer, https://statquest.org
…so if these points… have half the density as these points… Copyright 2017 Joshua Starmer,

91 Copyright 2017 Joshua Starmer, https://statquest.org
…so if these points… have half the density as these points… …and this curve… is half as wide as this curve… Copyright 2017 Joshua Starmer,

92 Copyright 2017 Joshua Starmer, https://statquest.org
…so if these points… have half the density as these points… …and this curve… is half as wide as this curve… Copyright 2017 Joshua Starmer,

93 Copyright 2017 Joshua Starmer, https://statquest.org
…so if these points… have half the density as these points… …and this curve… is half as wide as this curve… …then scaling the similarity scores will make them the same for both clusters. Copyright 2017 Joshua Starmer,

94 Copyright 2017 Joshua Starmer, https://statquest.org
Here’s an example… Copyright 2017 Joshua Starmer,

95 Copyright 2017 Joshua Starmer, https://statquest.org
Here’s an example… This curve has a standard deviation = 1. Copyright 2017 Joshua Starmer,

96 This curve has a standard deviation = 1.
Here’s an example… = 0.24 This curve has a standard deviation = 1. The “unscaled” similarity values = 0.05 Copyright 2017 Joshua Starmer,

97 This curve has a standard deviation = 1.
Here’s an example… = 0.24 This curve has a standard deviation = 1. The “unscaled” similarity values = 0.05 This curve has a standard deviation = 2. Copyright 2017 Joshua Starmer,

98 This curve has a standard deviation = 1.
Here’s an example… = 0.24 This curve has a standard deviation = 1. The “unscaled” similarity values = 0.05 These points are twice as far from the middle. This curve has a standard deviation = 2. Copyright 2017 Joshua Starmer,

99 This curve has a standard deviation = 1.
Here’s an example… = 0.24 This curve has a standard deviation = 1. The “unscaled” similarity values = 0.05 These points are twice as far from the middle. The “unscaled” similarity values are half of the other ones. This curve has a standard deviation = 2. = 0.12 = 0.024 Copyright 2017 Joshua Starmer,

100 Copyright 2017 Joshua Starmer, https://statquest.org
To scale the similarity scores so they sum to 1: = 0.24 Score Sum of all scores = Scaled Score = 0.05 = 0.12 = 0.024 Copyright 2017 Joshua Starmer,

101 Copyright 2017 Joshua Starmer, https://statquest.org
To scale the similarity scores so they sum to 1: 0.24 = 0.82 0.05 = 0.18 = 0.24 Score Sum of all scores = Scaled Score = 0.05 = 0.12 = 0.024 Copyright 2017 Joshua Starmer,

102 Copyright 2017 Joshua Starmer, https://statquest.org
To scale the similarity scores so they sum to 1: 0.24 = 0.82 0.05 = 0.18 = 0.24 Score Sum of all scores = Scaled Score = 0.05 0.12 = 0.82 0.024 = 0.18 = 0.12 = 0.024 Copyright 2017 Joshua Starmer,

103 To scale the similarity scores so they sum to 1: 0.24
= 0.82 0.05 = 0.18 = 0.24 Score Sum of all scores = Scaled Score = 0.05 These are the same as these! 0.12 = 0.82 0.024 = 0.18 = 0.12 = 0.024 Copyright 2017 Joshua Starmer,

104 Copyright 2017 Joshua Starmer, https://statquest.org
That implies that the scaled similarity scores for this relatively tight cluster… 0.24 = 0.82 0.05 = 0.18 0.12 = 0.82 0.024 = 0.18 Copyright 2017 Joshua Starmer,

105 Copyright 2017 Joshua Starmer, https://statquest.org
That implies that the scaled similarity scores for this relatively tight cluster… 0.24 = 0.82 0.05 = 0.18 …are the same for this relatively loose cluster! 0.12 = 0.82 0.024 = 0.18 Copyright 2017 Joshua Starmer,

106 The reality is a little more complicated, but only slightly.
Copyright 2017 Joshua Starmer,

107 The reality is a little more complicated, but only slightly.
t-SNE has a “perplexity” parameter equal to the expected density, and that comes into play, but these clusters are still more “similar” than you might expect. Copyright 2017 Joshua Starmer,

108 Copyright 2017 Joshua Starmer, https://statquest.org
Now back to the original scatter plot… Copyright 2017 Joshua Starmer,

109 Copyright 2017 Joshua Starmer, https://statquest.org
We’ve calculated similarity scores for this point. Copyright 2017 Joshua Starmer,

110 Copyright 2017 Joshua Starmer, https://statquest.org
Now we do it for this point… Copyright 2017 Joshua Starmer,

111 Copyright 2017 Joshua Starmer, https://statquest.org
Now we do it for this point… …and we do it for all the points. Copyright 2017 Joshua Starmer,

112 Copyright 2017 Joshua Starmer, https://statquest.org
One last thing and the scatter plot will be all set with similarity scores!!! Copyright 2017 Joshua Starmer,

113 Copyright 2017 Joshua Starmer, https://statquest.org
Because the width of the distribution is based on the density of the surrounding data points, the similarity score to this node… Copyright 2017 Joshua Starmer,

114 Copyright 2017 Joshua Starmer, https://statquest.org
…might not be the same as the similarity to this node. Copyright 2017 Joshua Starmer,

115 Copyright 2017 Joshua Starmer, https://statquest.org
So t-SNE just averages the two similarity scores from the two directions… Copyright 2017 Joshua Starmer,

116 Copyright 2017 Joshua Starmer, https://statquest.org
So t-SNE just averages the two similarity scores from the two directions… No big deal! Copyright 2017 Joshua Starmer,

117 Ultimately, you end up with a matrix of similarity scores.
= High similarity = Low similarity Ultimately, you end up with a matrix of similarity scores. Copyright 2017 Joshua Starmer,

118 Copyright 2017 Joshua Starmer, https://statquest.org
= High similarity = Low similarity Copyright 2017 Joshua Starmer,

119 Copyright 2017 Joshua Starmer, https://statquest.org
= High similarity = Low similarity Hooray!!! We’re done doing calculating similarity scores for the scatter plot! Copyright 2017 Joshua Starmer,

120 Copyright 2017 Joshua Starmer, https://statquest.org
Now we randomly project the data onto the number line… Copyright 2017 Joshua Starmer,

121 Copyright 2017 Joshua Starmer, https://statquest.org
Now we randomly project the data onto the number line… … and calculate similarity scores for the points on the number line. Copyright 2017 Joshua Starmer,

122 Copyright 2017 Joshua Starmer, https://statquest.org
Just like before, that means picking a point… Copyright 2017 Joshua Starmer,

123 Copyright 2017 Joshua Starmer, https://statquest.org
Just like before, that means picking a point… …measuring a distance… Copyright 2017 Joshua Starmer,

124 Copyright 2017 Joshua Starmer, https://statquest.org
…and lastly, drawing a line from the point to a curve. However, this time we’re using a “t-distribution”. Just like before, that means picking a point… …measuring a distance… Copyright 2017 Joshua Starmer,

125 Copyright 2017 Joshua Starmer, https://statquest.org
A “t-distribution”… Copyright 2017 Joshua Starmer,

126 Copyright 2017 Joshua Starmer, https://statquest.org
A “t-distribution”… …is a lot like a normal distribution Copyright 2017 Joshua Starmer,

127 Copyright 2017 Joshua Starmer, https://statquest.org
A “t-distribution”… …is a lot like a normal distribution… …except the “t” isn’t as tall in the middle… Copyright 2017 Joshua Starmer,

128 Copyright 2017 Joshua Starmer, https://statquest.org
A “t-distribution”… …is a lot like a normal distribution… …except the “t” isn’t as tall in the middle… … and the tails are taller on the ends. Copyright 2017 Joshua Starmer,

129 …is a lot like a normal distribution…
A “t-distribution”… …is a lot like a normal distribution… …except the “t” isn’t as tall in the middle… … and the tails are taller on the ends. The “t-distribution” is the “t” in t-SNE. Copyright 2017 Joshua Starmer,

130 …is a lot like a normal distribution…
A “t-distribution”… …is a lot like a normal distribution… …except the “t” isn’t as tall in the middle… … and the tails are taller on the ends. The “t-distribution” is the “t” in t-SNE. We’ll talk about why the t-distribution is used in a bit… Copyright 2017 Joshua Starmer,

131 Copyright 2017 Joshua Starmer, https://statquest.org
So, using a t-distribution, we calculate “unscaled” similarity scores for all the points and then scale them like before. Copyright 2017 Joshua Starmer,

132 Copyright 2017 Joshua Starmer, https://statquest.org
= High similarity = Low similarity Like before, we end up with a matrix of similarity scores, but this matrix is a mess… Copyright 2017 Joshua Starmer,

133 Copyright 2017 Joshua Starmer, https://statquest.org
= High similarity = Low similarity = High similarity = Low similarity …compared to the original matrix. Like before, we end up with a matrix of similarity scores, but this matrix is a mess… Copyright 2017 Joshua Starmer,

134 Copyright 2017 Joshua Starmer, https://statquest.org
The goal of moving this point is… Copyright 2017 Joshua Starmer,

135 Copyright 2017 Joshua Starmer, https://statquest.org
The goal of moving this point is… we want to make this row… Copyright 2017 Joshua Starmer,

136 Copyright 2017 Joshua Starmer, https://statquest.org
The goal of moving this point is… we want to make this row… look like this row. Copyright 2017 Joshua Starmer,

137 Copyright 2017 Joshua Starmer, https://statquest.org
t-SNE moves the points a little bit at a time, and each step it chooses a direction that makes the matrix on the left more like the matrix on the right. Copyright 2017 Joshua Starmer,

138 Copyright 2017 Joshua Starmer, https://statquest.org
t-SNE moves the points a little bit at a time, and each step it chooses a direction that makes the matrix on the left more like the matrix on the right. Copyright 2017 Joshua Starmer,

139 Copyright 2017 Joshua Starmer, https://statquest.org
t-SNE moves the points a little bit at a time, and each step it chooses a direction that makes the matrix on the left more like the matrix on the right. It uses small steps, because it’s a little bit like a chess game and can’t be solved all at once. Instead, it goes one move at at time. Copyright 2017 Joshua Starmer,

140 Copyright 2017 Joshua Starmer, https://statquest.org
BAM!!! Copyright 2017 Joshua Starmer,

141 Copyright 2017 Joshua Starmer, https://statquest.org
Now to finally tell you why the “t-distribution” is used… Copyright 2017 Joshua Starmer,

142 Copyright 2017 Joshua Starmer, https://statquest.org
…originally, the “SNE” algorithm just used a normal distribution throughout and the clusters clumped up in the middle and were harder to see. Copyright 2017 Joshua Starmer,

143 Copyright 2017 Joshua Starmer, https://statquest.org
The t-distribution forces some space between the points. Copyright 2017 Joshua Starmer,

144 Copyright 2017 Joshua Starmer, https://statquest.org
Triple Bam!!! Copyright 2017 Joshua Starmer,

145 Copyright 2017 Joshua Starmer, https://statquest.org
The End!!! Copyright 2017 Joshua Starmer,


Download ppt "StatQuest: t-SNE Clearly Expalined!!!!"

Similar presentations


Ads by Google