Clustering.


1 Clustering

2 Metric space
A set 𝑋 and a function 𝑑: 𝑋×𝑋 → ℝ_{≥0} such that
𝑑(𝑥,𝑦) = 0 ⇔ 𝑥 = 𝑦
𝑑(𝑥,𝑦) = 𝑑(𝑦,𝑥)
𝑑(𝑥,𝑦) ≤ 𝑑(𝑥,𝑧) + 𝑑(𝑧,𝑦)

3 Examples
𝐿_2: 𝑑(𝑥,𝑦) = √((𝑥_1−𝑦_1)² + (𝑥_2−𝑦_2)²)
𝐿_1: 𝑑(𝑥,𝑦) = |𝑥_1−𝑦_1| + |𝑥_2−𝑦_2|
𝐿_∞: 𝑑(𝑥,𝑦) = max(|𝑥_1−𝑦_1|, |𝑥_2−𝑦_2|)
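The three distances above can be written directly; a minimal sketch in Python (the tuple representation of points and the function names are my choice, not from the slides):

```python
import math

def l2(x, y):
    # Euclidean distance: sqrt((x1-y1)^2 + (x2-y2)^2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def l1(x, y):
    # Manhattan distance: |x1-y1| + |x2-y2|
    return sum(abs(a - b) for a, b in zip(x, y))

def linf(x, y):
    # Chebyshev distance: max(|x1-y1|, |x2-y2|)
    return max(abs(a - b) for a, b in zip(x, y))

print(l2((0, 0), (3, 4)))    # 5.0
print(l1((0, 0), (3, 4)))    # 7
print(linf((0, 0), (3, 4)))  # 4
```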

4 (Finite) metric space
A complete weighted graph whose edge weights satisfy the triangle inequality 𝑑(𝑣,𝑤) ≤ 𝑑(𝑣,𝑢) + 𝑑(𝑢,𝑤)

5 k-centers
Given a set 𝐴 of 𝑛 points of some metric space 𝑋, find a set 𝐶 of 𝑘 points in 𝑋 that minimizes max_{𝑥∈𝐴} 𝑑(𝑥,𝐶). Suppose 𝑘 = 2

8 k-centers (alt. formulation)
Given a set 𝐴 of 𝑛 points of some metric space 𝑋, find a set of 𝑘 congruent disks (centered at points of 𝑋) of minimum radius 𝑟 that cover 𝐴

9 k-centers
NP-hard to approximate to within a factor of 2−𝜖 for any 𝜖 > 0 (simple reduction from dominating set). For the (planar) Euclidean metric it is also hard to approximate to within any factor smaller than 1.822

10 Farthest fit
Pick an arbitrary point 𝑥_1 as the first center. Pick the point farthest away from 𝑥_1 as the second center 𝑥_2

11 Farthest fit
Pick an arbitrary point 𝑥_1 as the first center. For 𝑗 = 2,…,𝑘 pick 𝑥_𝑗 to be the point farthest away from {𝑥_1,…,𝑥_{𝑗−1}}
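The farthest-fit rule above can be sketched in a few lines of Python. This is my own illustrative implementation (function name and the returned radius 𝑟 are assumptions, not from the slides); it also reports 𝑟, the distance from the farthest remaining point to its nearest center, which the analysis below uses:

```python
import math

def farthest_fit(points, k):
    """Pick k centers: an arbitrary first center, then repeatedly
    the point farthest from all centers chosen so far."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    centers = [points[0]]  # arbitrary first center x_1
    # d_to_c[i] = distance from points[i] to its nearest center so far
    d_to_c = [dist(p, centers[0]) for p in points]
    while len(centers) < k:
        j = max(range(len(points)), key=lambda i: d_to_c[i])
        centers.append(points[j])
        d_to_c = [min(d, dist(p, points[j])) for d, p in zip(d_to_c, points)]
    # r = distance from the farthest remaining point to its nearest center
    r = max(d_to_c)
    return centers, r

centers, r = farthest_fit([(0,), (1,), (10,)], 2)
print(centers, r)  # [(0,), (10,)] 1.0
```

Maintaining `d_to_c` incrementally makes each of the 𝑘 rounds cost 𝑂(𝑛) distance evaluations.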

12–17 Example
(animation: farthest fit picks the centers one after another; 𝑟 denotes the distance from the farthest remaining point to its nearest center)

18 What can we say about this?
Theorem: 𝑂𝑃𝑇 ≥ 𝑟/2

19–22 Proof
Theorem: 𝑂𝑃𝑇 ≥ 𝑟/2
(animation: the 𝑘 chosen centers, together with the farthest remaining point, are pairwise at distance ≥ 𝑟)

23 Proof
Theorem: 𝑂𝑃𝑇 ≥ 𝑟/2
We have 𝑘+1 points (the 𝑘 centers plus the farthest remaining point), each pair at distance ≥ 𝑟
In 𝑂𝑃𝑇 at least 2 of these points are assigned to the same center
This center must be at distance ≥ 𝑟/2 from at least one of them

24 k-means
Given a set 𝐴 of 𝑛 points of some metric space 𝑋, find a set 𝐶 of 𝑘 points in 𝑋 that minimizes ∑_{𝑥∈𝐴} 𝑑²(𝑥,𝐶)
Notice: 𝑑²(𝑥,𝑦) is not a metric: for collinear points 𝑥, 𝑦, 𝑧 with 𝑑(𝑥,𝑦) = 𝑑(𝑦,𝑧) = 1 we get 𝑑²(𝑥,𝑧) = 4 > 𝑑²(𝑥,𝑦) + 𝑑²(𝑦,𝑧) = 2
We focus on the case where 𝑑(𝑥,𝑦) is Euclidean

25 1-mean on the line
Where is the point 𝐴 that minimizes the sum of the squared distances?
min_𝐴 (𝑥_1−𝐴)² + (𝑥_2−𝐴)² + … + (𝑥_7−𝐴)²
𝐴 = (𝑥_1 + 𝑥_2 + … + 𝑥_7)/7

26 1-mean in Euclidean space of higher dimension
It is the center of mass (mean)
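The claim that the mean minimizes the sum of squared distances can be checked numerically; a quick sketch (the sample points and the grid of candidates are my own arbitrary choices):

```python
def sq_cost(points, a):
    # sum of squared distances from each point to candidate a
    return sum((x - a) ** 2 for x in points)

points = [1.0, 2.0, 4.0, 9.0]
mean = sum(points) / len(points)  # 4.0

# The mean should cost no more than any nearby candidate.
candidates = [mean + t / 10.0 for t in range(-50, 51)]
assert all(sq_cost(points, mean) <= sq_cost(points, a) for a in candidates)
print(mean, sq_cost(points, mean))  # 4.0 38.0
```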

27 2-means in the plane
Fix the partition ⇒ to minimize the sum of squared distances, each center must be the mean of the points in its cluster

28 Lloyd’s algorithm
The most frequently used clustering algorithm

29 Lloyd’s algorithm
Start with some arbitrary set of 𝑘 centers. Iterate:
Assign each point to its closest center
Recalculate centers: each new center is the mean of the points in its cluster
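The two alternating steps above can be sketched as follows; this is a minimal illustrative version (function name, tuple representation, and the empty-cluster guard are my choices, not from the slides):

```python
def lloyd(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate the assignment step and the
    mean step until the clustering stops changing."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    def mean(cluster):
        d = len(cluster[0])
        return tuple(sum(p[i] for p in cluster) / len(cluster) for i in range(d))

    assignment = None
    for _ in range(max_iter):
        # Assign each point to its closest center.
        new_assignment = [min(range(len(centers)), key=lambda j: dist2(p, centers[j]))
                          for p in points]
        if new_assignment == assignment:  # no changes -> terminate
            break
        assignment = new_assignment
        # Replace each center by the mean of its cluster (if nonempty).
        for j in range(len(centers)):
            cluster = [p for p, a in zip(points, assignment) if a == j]
            if cluster:
                centers[j] = mean(cluster)
    return centers, assignment

centers, assignment = lloyd([(0, 0), (0, 1), (10, 0), (10, 1)],
                            [(0, 0), (10, 0)])
print(centers)  # [(0.0, 0.5), (10.0, 0.5)]
```

Keeping an empty cluster's old center (rather than crashing on a mean of zero points) is one common convention; the slides do not specify how to handle that case.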

30 Example (k=3)

31 Pick initial centers

32 Assign each point to its closest center

33 Replace centers by clusters’ means

34 Assign each point to its closest center

35 Replace centers by clusters’ means

36 Assign each point to its closest center

37 Replace centers by clusters’ means
No changes  terminate

38 Properties Very easy to implement
Sum of squared distances always decreases (like local search)

39 Quality of the local opt?
For 𝑘 = 3 there are instances (figure: clusters separated by gaps 𝑥 and 𝑦) where the ratio of the local optimum’s cost to 𝑂𝑃𝑇 is 𝑦²/𝑥², which can be made as large as we want

40 Running time
At each step we have a partition of the points by closest center. We cannot repeat a partition in 2 different iterations, so the number of iterations is bounded by the number of possible partitions of 𝑛 points into 𝑘 clusters: 𝑘ⁿ. Is this tight? Say for 𝑘 = 2?

41 Using a Voronoi partition argument
For any 𝑘 and 𝑑 we can bound the number of iterations by the number of partitions induced by Voronoi diagrams of 𝑘 centers, which is 𝑛^{𝑂(𝑘𝑑)}

42 k-means++ (Arthur, Vassilvitskii 2007)
Pick 𝑥 uniformly at random and set 𝑇 ← {𝑥}
While |𝑇| < 𝑘:
Pick 𝑥 at random with probability proportional to 𝑐(𝑥,𝑇) = min_{𝑧∈𝑇} ‖𝑥−𝑧‖²
𝑇 ← 𝑇 ∪ {𝑥}
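The 𝐷²-sampling loop above is short to implement; a sketch (function name is mine; `random.choices` with `weights` does the proportional draw):

```python
import random

def kmeans_pp_seed(points, k, rng=random):
    """k-means++ seeding: first center uniform at random, each next
    center drawn with probability proportional to the squared
    distance to the nearest center chosen so far."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    T = [rng.choice(points)]
    while len(T) < k:
        # c(x, T) = min_{z in T} ||x - z||^2 for each point x
        cost = [min(dist2(x, z) for z in T) for x in points]
        # Sample x with probability cost[x] / sum(cost).
        T.append(rng.choices(points, weights=cost, k=1)[0])
    return T

random.seed(1)
print(kmeans_pp_seed([(0, 0), (0, 1), (100, 0), (100, 1)], 2))
```

A point already chosen has cost 0, so it is (almost surely) never drawn again.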

43 Theorem
Let 𝑇 be the centers of k-means++; then 𝐸[𝑐(𝐴,𝑇)] ≤ 𝑂(log 𝑘)·𝑐(𝐴,𝑂𝑃𝑇)
Used in practice as initialization for Lloyd’s algorithm
Credits: the following presentation is based on lecture notes by S. Dasgupta

44 Proof
Notation:
𝐴 — the set of points (input)
(𝑥−𝑦)² ≡ ‖𝑥−𝑦‖²
𝑥𝑦 ≡ dot product
𝑐(𝐹,𝐵) ≡ ∑_{𝑥∈𝐹} 𝑑²(𝑥,𝐵) (we will omit the set of centers 𝐵 when it is clear from context)

45 The first center
Lemma: Assume that the first center 𝑧 of k-means++ is in cluster 𝐶 of 𝑂𝑃𝑇; then 𝐸[𝑐(𝐶,𝑧) ∣ 𝑧∈𝐶] = 2𝑐(𝐶,𝑂𝑃𝑇)

46 A general fact
Lemma: For any set 𝐶 and any 𝑧: 𝑐(𝐶,𝑧) = 𝑐(𝐶,𝜇_𝐶) + |𝐶|·‖𝑧−𝜇_𝐶‖²
𝑐(𝐶,𝑧) = ∑_{𝑥∈𝐶} ‖𝑥−𝑧‖² = ∑_{𝑥∈𝐶} (‖𝑥‖² − 2𝑥𝑧) + |𝐶|·‖𝑧‖²
𝑐(𝐶,𝜇_𝐶) = ∑_{𝑥∈𝐶} ‖𝑥−𝜇_𝐶‖² = ∑_{𝑥∈𝐶} (‖𝑥‖² − 2𝑥𝜇_𝐶) + |𝐶|·‖𝜇_𝐶‖²

47 A general fact
Lemma: For any set 𝐶 and any 𝑧: 𝑐(𝐶,𝑧) = 𝑐(𝐶,𝜇_𝐶) + |𝐶|·‖𝑧−𝜇_𝐶‖²
Subtracting the two expansions and using ∑_{𝑥∈𝐶} 𝑥 = |𝐶|·𝜇_𝐶:
𝑐(𝐶,𝑧) − 𝑐(𝐶,𝜇_𝐶) = 2𝜇_𝐶·∑_{𝑥∈𝐶}𝑥 − 2𝑧·∑_{𝑥∈𝐶}𝑥 + |𝐶|(‖𝑧‖² − ‖𝜇_𝐶‖²) = |𝐶|·‖𝜇_𝐶‖² − 2|𝐶|·𝑧𝜇_𝐶 + |𝐶|·‖𝑧‖² = |𝐶|·‖𝑧−𝜇_𝐶‖²
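The identity on this slide is easy to sanity-check numerically; a quick sketch (the point set 𝐶 and the candidate 𝑧 are arbitrary values I picked):

```python
def sq(v):
    # squared Euclidean norm
    return sum(a * a for a in v)

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

C = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
z = (5.0, -1.0)

mu = tuple(sum(p[i] for p in C) / len(C) for i in range(2))  # mean of C
cost_z = sum(sq(sub(x, z)) for x in C)    # c(C, z)
cost_mu = sum(sq(sub(x, mu)) for x in C)  # c(C, mu)

# c(C, z) = c(C, mu) + |C| * ||z - mu||^2
assert abs(cost_z - (cost_mu + len(C) * sq(sub(z, mu)))) < 1e-9
print(cost_z, cost_mu)  # 68.0 8.0
```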

48 The first center
Lemma: Assume that the first center 𝑧 of k-means++ is in cluster 𝐶 of 𝑂𝑃𝑇; then 𝐸[𝑐(𝐶,𝑧) ∣ 𝑧∈𝐶] = 2𝑐(𝐶,𝑂𝑃𝑇)
Proof: 𝐸[𝑐(𝐶,𝑧) ∣ 𝑧∈𝐶] = ∑_{𝑧∈𝐶} (1/|𝐶|)·𝑐(𝐶,𝑧) = (1/|𝐶|) ∑_{𝑧∈𝐶} (𝑐(𝐶,𝜇_𝐶) + |𝐶|·‖𝑧−𝜇_𝐶‖²) = 𝑐(𝐶,𝜇_𝐶) + ∑_{𝑧∈𝐶} ‖𝑧−𝜇_𝐶‖² = 2𝑐(𝐶,𝑂𝑃𝑇)

49 The following centers
Lemma: Assume that the center 𝑧 added by k-means++ (in some iteration 𝑖) is in cluster 𝐶 of 𝑂𝑃𝑇; then 𝐸[𝑐(𝐶,𝑇∪{𝑧}) ∣ 𝑧∈𝐶] ≤ 8𝑐(𝐶,𝑂𝑃𝑇)

50 The following centers
Proof: 𝐸[𝑐(𝐶,𝑇∪{𝑧}) ∣ 𝑧∈𝐶] = ∑_{𝑧∈𝐶} (𝑐(𝑧,𝑇)/𝑐(𝐶,𝑇)) ∑_{𝑥∈𝐶} min(𝑐(𝑥,𝑇), ‖𝑥−𝑧‖²)

51 The following centers
Proof: 𝐸[𝑐(𝐶,𝑇∪{𝑧}) ∣ 𝑧∈𝐶] = ∑_{𝑧∈𝐶} (𝑐(𝑧,𝑇)/𝑐(𝐶,𝑇)) ∑_{𝑥∈𝐶} min(𝑐(𝑥,𝑇), ‖𝑥−𝑧‖²)
Claim: 𝑐(𝑧,𝑇) ≤ (2/|𝐶|)·𝑐(𝐶,𝑧) + (2/|𝐶|)·𝑐(𝐶,𝑇)

52 The following centers
Proof: 𝐸[𝑐(𝐶,𝑇∪{𝑧}) ∣ 𝑧∈𝐶] ≤ ∑_{𝑧∈𝐶} ((2/|𝐶|)(𝑐(𝐶,𝑧) + 𝑐(𝐶,𝑇))/𝑐(𝐶,𝑇)) ∑_{𝑥∈𝐶} min(𝑐(𝑥,𝑇), ‖𝑥−𝑧‖²)
Claim: 𝑐(𝑧,𝑇) ≤ (2/|𝐶|)·𝑐(𝐶,𝑧) + (2/|𝐶|)·𝑐(𝐶,𝑇)

53 The following centers
Proof: 𝐸[𝑐(𝐶,𝑇∪{𝑧}) ∣ 𝑧∈𝐶] ≤ ∑_{𝑧∈𝐶} ((2/|𝐶|)(𝑐(𝐶,𝑧) + 𝑐(𝐶,𝑇))/𝑐(𝐶,𝑇)) ∑_{𝑥∈𝐶} min(𝑐(𝑥,𝑇), ‖𝑥−𝑧‖²)
≤ ∑_{𝑧∈𝐶} (2/|𝐶|)·(𝑐(𝐶,𝑧)·∑_{𝑥∈𝐶} 𝑐(𝑥,𝑇) + 𝑐(𝐶,𝑇)·∑_{𝑥∈𝐶} ‖𝑥−𝑧‖²)/𝑐(𝐶,𝑇)
(bounding the min by 𝑐(𝑥,𝑇) in the first product and by ‖𝑥−𝑧‖² in the second)

54 The following centers
Proof: 𝐸[𝑐(𝐶,𝑇∪{𝑧}) ∣ 𝑧∈𝐶] ≤ ∑_{𝑧∈𝐶} (2/|𝐶|)·(𝑐(𝐶,𝑧)·∑_{𝑥∈𝐶} 𝑐(𝑥,𝑇) + 𝑐(𝐶,𝑇)·∑_{𝑥∈𝐶} ‖𝑥−𝑧‖²)/𝑐(𝐶,𝑇)
= ∑_{𝑧∈𝐶} (2/|𝐶|)·2𝑐(𝐶,𝑧) = (4/|𝐶|) ∑_{𝑧∈𝐶} 𝑐(𝐶,𝑧) ≤ 8𝑐(𝐶,𝜇_𝐶) = 8𝑐(𝐶,𝑂𝑃𝑇)

55 The following centers
Claim: 𝑐(𝑧,𝑇) ≤ (2/|𝐶|)·𝑐(𝐶,𝑧) + (2/|𝐶|)·𝑐(𝐶,𝑇)
Proof: Let 𝑥∈𝐶, and let 𝑡∈𝑇 be its closest center. Then
𝑐(𝑧,𝑇) ≤ ‖𝑧−𝑡‖² ≤ (‖𝑧−𝑥‖ + ‖𝑥−𝑡‖)² ≤ 2‖𝑧−𝑥‖² + 2‖𝑥−𝑡‖² = 2‖𝑧−𝑥‖² + 2𝑐(𝑥,𝑇)
Now take the average of these inequalities over all 𝑥∈𝐶

56 Uncovered clusters

57 Potential
Ψ_𝑡 = 𝑊_𝑡·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|
𝑊_𝑡 = #(wasted centers at time 𝑡)
𝑈_𝑡 = uncovered clusters at time 𝑡
Notation: 𝑐_𝑡(𝐹) ≡ 𝑐(𝐹,𝑇_𝑡) = ∑_{𝑥∈𝐹} 𝑑²(𝑥,𝑇_𝑡), where 𝑇_𝑡 is the set of the first 𝑡 centers of k-means++

58 Potential: Ψ_0 = 𝑊_0·𝑐_0(𝑈_0)/|𝑈_0|, 𝑊_0 = 0, |𝑈_0| = 6

59 Potential: Ψ_1 = 𝑊_1·𝑐_1(𝑈_1)/|𝑈_1|, 𝑊_1 = 0, |𝑈_1| = 5

60 Potential: Ψ_2 = 𝑊_2·𝑐_2(𝑈_2)/|𝑈_2|, 𝑊_2 = 1, |𝑈_2| = 5

61 Potential: Ψ_3 = 𝑊_3·𝑐_3(𝑈_3)/|𝑈_3|, 𝑊_3 = 1, |𝑈_3| = 4

62 Potential: Ψ_6 = 𝑊_6·𝑐_6(𝑈_6)/|𝑈_6|, 𝑊_6 = 2, |𝑈_6| = 2

63 At the end
𝑊_𝑘 = |𝑈_𝑘| (each uncovered cluster corresponds to a wasted center), so Ψ_𝑘 = 𝑊_𝑘·𝑐_𝑘(𝑈_𝑘)/|𝑈_𝑘| = 𝑐_𝑘(𝑈_𝑘)

64 Potential
Ψ_𝑡 = 𝑊_𝑡·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|
𝑊_𝑡 = #(wasted centers at time 𝑡), 𝑈_𝑡 = uncovered clusters at time 𝑡, 𝐻_𝑡 = covered clusters at time 𝑡
Ψ_0 = 𝑊_0 = 0, Ψ_𝑘 = 𝑐_𝑘(𝑈_𝑘)
𝐸[𝑐_𝑘(𝐴)] = 𝐸[𝑐_𝑘(𝐻_𝑘)] + 𝐸[𝑐_𝑘(𝑈_𝑘)] = 𝐸[𝑐_𝑘(𝐻_𝑘)] + 𝐸[Ψ_𝑘] ≤ 8𝑐(𝐻_𝑘,𝑂𝑃𝑇) + 𝐸[Ψ_𝑘]

65 Potential
Ψ_𝑡 = 𝑊_𝑡·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|
Since Ψ_0 = 0, the sum telescopes: 𝐸[Ψ_𝑘] = ∑_{𝑡=0}^{𝑘−1} 𝐸[Ψ_{𝑡+1} − Ψ_𝑡]

66 We bound the increase in Ψ_𝑡

67 We bound the increase in Ψ_𝑡
𝐸[Ψ_{𝑡+1} − Ψ_𝑡] = ? (conditioned on the current state, i.e. the previous random choices)

68 Case 1: we hit an uncovered cluster
𝐸[Ψ_{𝑡+1} − Ψ_𝑡 ∣ 𝑧∈𝑈_𝑡] = ?, where 𝑧 is the center chosen at iteration 𝑡+1

70–75 We hit an uncovered cluster
𝐸[Ψ_{𝑡+1} − Ψ_𝑡 ∣ 𝑧∈𝑈_𝑡] = ?
Ψ_𝑡 = 𝑊_𝑡·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|
Ψ_{𝑡+1} = 𝑊_{𝑡+1}·𝑐_{𝑡+1}(𝑈_{𝑡+1})/|𝑈_{𝑡+1}| = 𝑊_𝑡·𝑐_{𝑡+1}(𝑈_{𝑡+1})/(|𝑈_𝑡|−1) ≤ 𝑊_𝑡·(𝑐_𝑡(𝑈_𝑡) − 𝑐_𝑡(𝐶_𝑧))/(|𝑈_𝑡|−1)
(no center is wasted, so 𝑊_{𝑡+1} = 𝑊_𝑡; the cluster 𝐶_𝑧 of 𝑧 leaves the uncovered set, so |𝑈_{𝑡+1}| = |𝑈_𝑡|−1 and 𝑐_{𝑡+1}(𝑈_{𝑡+1}) ≤ 𝑐_𝑡(𝑈_𝑡) − 𝑐_𝑡(𝐶_𝑧))

76 We hit an uncovered cluster
Ψ_{𝑡+1} = 𝑊_𝑡·𝑐_{𝑡+1}(𝑈_{𝑡+1})/(|𝑈_𝑡|−1) ≤ 𝑊_𝑡·(𝑐_𝑡(𝑈_𝑡) − 𝑐_𝑡(𝐶_𝑧))/(|𝑈_𝑡|−1)
Claim: 𝐸[𝑐_𝑡(𝐶_𝑧)] ≥ 𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|

77 From the claim we get
𝐸[Ψ_{𝑡+1} − Ψ_𝑡 ∣ 𝑧∈𝑈_𝑡] ≤ 𝑊_𝑡·(𝑐_𝑡(𝑈_𝑡) − 𝐸[𝑐_𝑡(𝐶_𝑧)])/(|𝑈_𝑡|−1) − 𝑊_𝑡·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡| ≤ 𝑊_𝑡·(𝑐_𝑡(𝑈_𝑡) − 𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|)/(|𝑈_𝑡|−1) − 𝑊_𝑡·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡| = 0

78 Proof of the claim
Claim: 𝐸[𝑐_𝑡(𝐶_𝑧)] ≥ 𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|
Proof: 𝐸[𝑐_𝑡(𝐶_𝑧)] = ∑_{𝑗=1}^{|𝑈_𝑡|} (𝑐_𝑡(𝐶_𝑗)/𝑐_𝑡(𝑈_𝑡))·𝑐_𝑡(𝐶_𝑗) ≥ 𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|
by Cauchy–Schwarz: 𝑐_𝑡(𝑈_𝑡)² = ⟨(1,…,1), (𝑐_𝑡(𝐶_1),…,𝑐_𝑡(𝐶_{|𝑈_𝑡|}))⟩² ≤ |𝑈_𝑡|·∑_{𝑗=1}^{|𝑈_𝑡|} 𝑐_𝑡(𝐶_𝑗)²

79 Case 2: we hit a covered cluster
𝐸[Ψ_{𝑡+1} − Ψ_𝑡 ∣ 𝑧∉𝑈_𝑡] = ?, where 𝑧 is the center chosen at iteration 𝑡+1

81–87 We hit a covered cluster
Ψ_{𝑡+1} − Ψ_𝑡 = 𝑊_{𝑡+1}·𝑐_{𝑡+1}(𝑈_{𝑡+1})/|𝑈_{𝑡+1}| − 𝑊_𝑡·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|
= (𝑊_𝑡+1)·𝑐_{𝑡+1}(𝑈_𝑡)/|𝑈_𝑡| − 𝑊_𝑡·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|
(the new center is wasted, so 𝑊_{𝑡+1} = 𝑊_𝑡+1, and the set of uncovered clusters does not change: 𝑈_{𝑡+1} = 𝑈_𝑡)
≤ (𝑊_𝑡+1)·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡| − 𝑊_𝑡·𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡| = 𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|

88 We bound the increase in Ψ_𝑡
𝐸[Ψ_{𝑡+1} − Ψ_𝑡] = 𝐸[Ψ_{𝑡+1} − Ψ_𝑡 ∣ 𝑧∈𝑈_𝑡]·Pr[𝑧∈𝑈_𝑡] + 𝐸[Ψ_{𝑡+1} − Ψ_𝑡 ∣ 𝑧∉𝑈_𝑡]·Pr[𝑧∉𝑈_𝑡] ≤ 0 + (𝑐_𝑡(𝑈_𝑡)/|𝑈_𝑡|)·(𝑐_𝑡(𝐻_𝑡)/𝑐_𝑡(𝐴)) ≤ 𝑐_𝑡(𝐻_𝑡)/|𝑈_𝑡|

89 We bound the increase in Ψ_𝑡
𝐸[Ψ_{𝑡+1} − Ψ_𝑡] ≤ 𝑐_𝑡(𝐻_𝑡)/|𝑈_𝑡|
Recall: 𝐸[𝑐_𝑘(𝐻_𝑘)] + 𝐸[Ψ_𝑘] ≤ 8𝑐(𝐻_𝑘,𝑂𝑃𝑇) + 𝐸[Ψ_𝑘] = 8𝑐(𝐻_𝑘,𝑂𝑃𝑇) + ∑_{𝑡=0}^{𝑘−1} 𝐸[Ψ_{𝑡+1} − Ψ_𝑡]
≤ 8𝑐(𝐻_𝑘,𝑂𝑃𝑇) + 𝑐_0(𝐻_0)/|𝑈_0| + 𝑐_1(𝐻_1)/|𝑈_1| + … + 𝑐_{𝑘−1}(𝐻_{𝑘−1})/|𝑈_{𝑘−1}|
≤ 8𝑐(𝐻_𝑘,𝑂𝑃𝑇) + 𝑐_0(𝐻_0)/𝑘 + 𝑐_1(𝐻_1)/(𝑘−1) + … + 𝑐_{𝑘−1}(𝐻_{𝑘−1})/1
(since |𝑈_𝑡| ≥ 𝑘−𝑡)

90 We bound the increase in Ψ_𝑡
𝐸[Ψ_{𝑡+1} − Ψ_𝑡] ≤ 𝑐_𝑡(𝐻_𝑡)/|𝑈_𝑡|
Recall: 𝐸[𝑐_𝑘(𝐻_𝑘)] + 𝐸[Ψ_𝑘] ≤ 8𝑐(𝐻_𝑘,𝑂𝑃𝑇) + ∑_{𝑡=0}^{𝑘−1} 𝐸[Ψ_{𝑡+1} − Ψ_𝑡]
≤ 8𝑐(𝐻_𝑘,𝑂𝑃𝑇) + 8𝑐(𝐻_0,𝑂𝑃𝑇)/𝑘 + 8𝑐(𝐻_1,𝑂𝑃𝑇)/(𝑘−1) + … + 8𝑐(𝐻_{𝑘−1},𝑂𝑃𝑇)/1
(taking expectations, 𝐸[𝑐_𝑡(𝐻_𝑡)] ≤ 8𝑐(𝐻_𝑡,𝑂𝑃𝑇))
≤ 𝑂(log 𝑘)·𝑐(𝐴,𝑂𝑃𝑇)

91 Clustering a stream
Data arrives as a stream. We want to compute 𝑘 centers using small memory

92 The cover tree (Beygelzimer et al. 2006)
Assume (for simplicity) that all distances are ≤ 1

93 Invariants
Each node is associated with a point 𝑥_𝑖, and each point is associated with some node. If a node is associated with 𝑥_𝑖, then one of its children is associated with 𝑥_𝑖
All nodes at depth 𝑗 are at distance at least 1/2^𝑗 from each other
Each node at depth 𝑗+1 is within distance 1/2^𝑗 of its parent

94–96 Can be constructed online
To add 𝑥, find the largest 𝑗 such that 𝑥 is within 1/2^𝑗 of some node 𝑝 at level 𝑗. Make 𝑥 a child of 𝑝 (at level 𝑗+1)
Prove that this maintains the invariants…
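The online insertion rule can be sketched as follows. This is a deliberately simplified illustration (class and attribute names are mine): it stores each level as a flat list and omits the self-children of the full Beygelzimer et al. structure, so it is not the real cover tree, only the "find the largest 𝑗, attach below 𝑝" step under the assumption that all distances are ≤ 1:

```python
class SimpleCoverTree:
    """Simplified cover-tree sketch: level j holds nodes meant to be
    pairwise >= 1/2^j apart; a new point becomes a child of a
    level-j node within 1/2^j of it. Assumes all distances <= 1."""

    def __init__(self, dist):
        self.dist = dist
        self.levels = []   # levels[j] = list of points at depth j
        self.parent = {}   # child point -> parent point

    def insert(self, x):
        if not self.levels:
            self.levels.append([x])  # first point becomes the root
            return
        # Find the largest j with some level-j node p within 1/2^j of x.
        best = None
        for j, nodes in enumerate(self.levels):
            near = [p for p in nodes if self.dist(x, p) <= 1.0 / 2 ** j]
            if near:
                best = (j, near[0])
        j, p = best  # exists: the root at level 0 covers everything
        if len(self.levels) <= j + 1:
            self.levels.append([])
        self.levels[j + 1].append(x)  # x becomes a child of p at depth j+1
        self.parent[x] = p

t = SimpleCoverTree(lambda a, b: abs(a - b))
for x in [0.0, 0.9, 0.5]:
    t.insert(x)
print(t.levels)  # [[0.0], [0.9], [0.5]]
```

For the k-center use on the next slide, one would read off the deepest level holding at most 𝑘 nodes.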

97 How do we use this for k-center?
Use the deepest level with ≤ 𝑘 nodes; call this set of nodes 𝑇_𝑘
Thm: 𝑐(𝑇_𝑘) ≤ 8·𝑂𝑃𝑇

98 How do we use this for k-center?
Thm: 𝑐(𝑇_𝑘) ≤ 8·𝑂𝑃𝑇
Proof: Say 𝑇_𝑘 is level 𝑗. Any point is at distance ≤ 1/2^𝑗 + 1/2^{𝑗+1} + … ≤ 1/2^{𝑗−1} from some node at level 𝑗. At level 𝑗+1 we have ≥ 𝑘+1 nodes with pairwise distances ≥ 1/2^{𝑗+1}, so 𝑂𝑃𝑇 ≥ 1/2^{𝑗+2}. Hence 𝑐(𝑇_𝑘) ≤ 1/2^{𝑗−1} = 8·(1/2^{𝑗+2}) ≤ 8·𝑂𝑃𝑇

