
1 Kakutani’s interval splitting scheme. Willem R. van Zwet, University of Leiden. Bahadur lectures, Chicago 2005.

2 Kakutani's interval splitting scheme
Random variables X_1, X_2, …:
X_1 has a uniform distribution on (0,1);
given X_1, X_2, …, X_{k-1}, the conditional distribution of X_k is uniform on the longest of the k subintervals created by 0, 1, X_1, X_2, …, X_{k-1}.
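
The scheme is straightforward to simulate. A minimal sketch (the function name `kakutani` and all code details are mine, not from the lecture):

```python
import random

def kakutani(n, rng=None):
    """Generate n points by Kakutani's scheme: each new point is
    drawn uniformly from the currently longest subinterval of (0,1)."""
    rng = rng or random.Random(0)
    points = []
    intervals = [(0.0, 1.0)]                   # current subintervals
    for _ in range(n):
        # locate the longest subinterval
        i = max(range(len(intervals)),
                key=lambda j: intervals[j][1] - intervals[j][0])
        a, b = intervals[i]
        x = rng.uniform(a, b)                  # uniform on the longest piece
        points.append(x)
        intervals[i:i + 1] = [(a, x), (x, b)]  # the new point splits it in two
    return points
```

Each step replaces the longest interval by the two pieces the new point cuts it into, so after n steps there are n + 1 subintervals.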

3 0———x_1——————————1
0———x_1———————x_2——1
0———x_1——x_3————x_2——1
0———x_1——x_3——x_4—x_2——1
Kakutani (1975): As n → ∞, do these points become evenly (i.e. uniformly) distributed in (0,1)?

4 Empirical distribution function of X_1, X_2, …, X_n:
F_n(x) = n^{-1} Σ_{1≤i≤n} 1_{(0,x]}(X_i).
Uniform d.f. on (0,1): F(x) = P(X_1 ≤ x) = x, x ∈ (0,1).
Formal statement of Kakutani's question:
(*) lim_{n→∞} sup_{0<x<1} |F_n(x) - x| = 0 with probability 1?
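
The sup in (*) is attained at a jump of F_n, so it can be computed exactly from the sorted sample. A sketch (`sup_discrepancy` is my name, not the lecture's):

```python
import random

def sup_discrepancy(xs):
    """sup over x in (0,1) of |F_n(x) - x|, computed exactly:
    the supremum occurs just before or at a jump of F_n."""
    xs = sorted(xs)
    n = len(xs)
    return max(max(abs((i + 1) / n - x),   # value at the jump
                   abs(i / n - x))         # value just before it
               for i, x in enumerate(xs))

# For i.i.d. uniforms this is small for large n (Glivenko-Cantelli):
rng = random.Random(1)
sample = [rng.random() for _ in range(10000)]
print(sup_discrepancy(sample))
```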

5 We know: if X_1, X_2, …, X_n are independent and each is uniformly distributed on (0,1), then (*) is true (Glivenko-Cantelli). So (*) is "obviously" true in this case too!! However, the joint distribution of even the first five points is already utterly hopeless!!

6 Stopping rule: for 0<t<1, N_t = first n for which all subintervals have length ≤ t; for t ≥ 1, N_t = 0.
The stopped sequence X_1, X_2, …, X_{N_t} has the property that any subinterval will receive another random point before we stop iff it is longer than t.
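
The stopping rule is also easy to implement; a sketch (names mine):

```python
import random

def stopped_count(t, rng=None):
    """N_t: how many Kakutani points have been placed when every
    subinterval of (0,1) first has length <= t (N_t = 0 for t >= 1)."""
    rng = rng or random.Random(0)
    intervals = [(0.0, 1.0)]
    n = 0
    while True:
        i = max(range(len(intervals)),
                key=lambda j: intervals[j][1] - intervals[j][0])
        a, b = intervals[i]
        if b - a <= t:
            return n                      # longest piece short enough: stop
        x = rng.uniform(a, b)             # otherwise split the longest piece
        intervals[i:i + 1] = [(a, x), (x, b)]
        n += 1
```

Since the N_t points leave N_t + 1 subintervals, each of length ≤ t at the stopping time, necessarily (N_t + 1)t ≥ 1, i.e. N_t ≥ 1/t - 1.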

7 May change order and blow up: given that X_1 = x
0———x—————————1
L(N_t | X_1 = x) = L(N_{t/x} + N*_{t/(1-x)} + 1), 0<t<1,
where N_{t/x} and N*_{t/(1-x)} are independent copies.
L(Z) denotes the distribution (law) of Z.

8 L(N_t | X_1 = x) = L(N_{t/x} + N*_{t/(1-x)} + 1), 0<t<1 (independent copies)
⇒ μ(t) = E N_t = ∫_{(0,1)} {μ(t/x) + μ(t/(1-x)) + 1} dx
⇒ μ(t) = E N_t = (2/t) - 1, 0<t<1.
Similarly σ²(t) = E(N_t - μ(t))² = c/t, 0<t≤1/2.

9 μ(t) = E N_t = (2/t) - 1, 0<t<1; σ²(t) = E(N_t - μ(t))² = c/t, 0<t≤1/2.
⇒ E{N_t / (2/t)} = 1 - t/2 → 1 as t → 0, σ²(N_t / (2/t)) = ct/4 → 0 as t → 0.
⇒ lim_{t→0} N_t / (2/t) = 1 w.p. 1.
We have built a clock! As t → 0, the longest interval (length t) tells the time: n ~ 2/t.
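
The clock can be watched ticking numerically; a sketch reusing the splitting simulation (`n_t` is my name for N_t):

```python
import random

def n_t(t, rng):
    """Number of points placed when all subintervals first have length <= t."""
    intervals = [(0.0, 1.0)]
    n = 0
    while True:
        i = max(range(len(intervals)),
                key=lambda j: intervals[j][1] - intervals[j][0])
        a, b = intervals[i]
        if b - a <= t:
            return n
        x = rng.uniform(a, b)
        intervals[i:i + 1] = [(a, x), (x, b)]
        n += 1

rng = random.Random(2)
for t in (0.1, 0.01, 0.001):
    print(t, n_t(t, rng) * t / 2)   # the ratio N_t / (2/t); approaches 1
```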

10 N_t ~ 2/t as t → 0 w.p. 1.
Define N_t(x) = Σ_{i=1}^{N_t} 1_{(0,x]}(X_i), x ∈ (0,1).
N_t(x) ~ N_{t/x} ~ 2x/t as t → 0 w.p. 1.
N_t(x):   0—.—.—x—.——.—.——.—.——1
N_{t/x}: 0—.—-—x—.——.—.——.—.——1
⇒ F_{N_t}(x) = N_t(x) / N_t → x as t → 0 w.p. 1.

11 F_{N_t}(x) = N_t(x) / N_t → x as t → 0 w.p. 1, and since N_t → ∞ as t → 0,
F_n(x) → x as n → ∞ w.p. 1
⇒ sup_{0<x<1} |F_n(x) - x| → 0 w.p. 1.
Kakutani was right (vZ, Ann. Prob. 1978).

12 We want to show that F_n(x) → x faster than in the i.i.d. uniform case, e.g. by considering the stochastic processes
B_n(x) = n^{1/2} (F_n(x) - x), 0 ≤ x ≤ 1.
If X_1, X_2, …, X_n are independent and uniformly distributed on (0,1), then B_n →_D B^0 as n → ∞, where →_D refers to convergence in distribution (E f(B_n) → E f(B^0) for bounded continuous functions f) and B^0 denotes the Brownian bridge.
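
The normalization n^{1/2} is the right one in the i.i.d. case: for fixed x, B_n(x) is a centered, scaled binomial with variance exactly x(1-x), the Brownian bridge variance. A quick Monte Carlo check (a sketch; the sample sizes and seed are arbitrary choices of mine):

```python
import random, statistics

rng = random.Random(5)
n, reps, x = 400, 4000, 0.5
vals = []
for _ in range(reps):
    fn = sum(rng.random() <= x for _ in range(n)) / n  # F_n(x), i.i.d. uniforms
    vals.append(n ** 0.5 * (fn - x))                   # B_n(x)
print(statistics.pvariance(vals))  # close to x*(1-x) = 0.25
```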

13 Refresher course 1
W: Wiener process on [0,1], i.e. W = {W(t): 0 ≤ t ≤ 1} with
W(0) = 0;
W(t) normal (Gaussian) with mean zero and variance E W²(t) = t;
W has independent increments.
B^0: Brownian bridge on [0,1], i.e. B^0 = {B^0(t): 0 ≤ t ≤ 1} is distributed as W conditioned on W(1) = 0.
Fact: {W(t) - t W(1): 0 ≤ t ≤ 1} is distributed as B^0.
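
The Fact gives a convenient way to simulate B^0: build W as a scaled random walk and subtract t·W(1). A sketch (function name mine):

```python
import random

def brownian_bridge(m, rng):
    """Approximate B^0 on a grid of m+1 points via B0(t) = W(t) - t*W(1),
    with W a random walk whose Gaussian increments have variance 1/m."""
    w, walk = 0.0, [0.0]
    for _ in range(m):
        w += rng.gauss(0.0, (1.0 / m) ** 0.5)
        walk.append(w)
    w1 = walk[-1]                                      # W(1)
    return [wt - (i / m) * w1 for i, wt in enumerate(walk)]

b = brownian_bridge(1000, random.Random(3))
print(b[0], b[-1])   # both endpoints are pinned at 0
```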

14 So: if X_1, X_2, …, X_n are independent and uniformly distributed on (0,1), then B_n →_D B^0 as n → ∞.
If X_1, X_2, …, X_n are generated by Kakutani's scheme, then (Pyke & vZ, Ann. Prob. 2004)
B_n →_D a·B^0 as n → ∞, with a = ½ σ(N_{1/2}) = (4 log 2 - 5/2)^{1/2} = 0.5221…
Half a Brownian bridge! Converges twice as fast!
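
The factor a ≈ 0.52 is visible in simulation: Kakutani samples hug the uniform d.f. roughly twice as closely as i.i.d. samples of the same size. A rough check (a sketch; all names, sizes and seeds are mine):

```python
import random

def kakutani(n, rng):
    """n points of Kakutani's scheme."""
    intervals, pts = [(0.0, 1.0)], []
    for _ in range(n):
        i = max(range(len(intervals)),
                key=lambda j: intervals[j][1] - intervals[j][0])
        a, b = intervals[i]
        x = rng.uniform(a, b)
        pts.append(x)
        intervals[i:i + 1] = [(a, x), (x, b)]
    return pts

def sup_disc(xs):
    """sup_x |F_n(x) - x|, attained at a jump of F_n."""
    xs, n = sorted(xs), len(xs)
    return max(max(abs((i + 1) / n - x), abs(i / n - x))
               for i, x in enumerate(xs))

rng = random.Random(4)
n, reps = 500, 20
kak = sum(sup_disc(kakutani(n, rng)) for _ in range(reps)) / reps
iid = sum(sup_disc([rng.random() for _ in range(n)]) for _ in range(reps)) / reps
print(kak, iid)   # the Kakutani discrepancy is markedly smaller
```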

15 Refresher course 2
Y: random variable with finite k-th moment μ_k = E Y^k = ∫ Y^k dP < ∞ and characteristic function
ψ(t) = E e^{itY} = 1 + Σ_{1≤j≤k} μ_j (it)^j / j! + o(t^k).
Then log ψ(t) = Σ_{1≤j≤k} κ_j (it)^j / j! + o(t^k).
κ_j: j-th cumulant.
κ_1 = μ_1 = E Y; κ_2 = σ² = E(Y - μ_1)²;
κ_3 = E(Y - μ_1)³; κ_4 = E(Y - μ_1)⁴ - 3σ⁴; etc.
κ_j = 0 for all j ≥ 3 iff Y is normal.

16 If Y_1 and Y_2 are independent, then characteristic functions multiply and hence cumulants add up:
κ_j(Y_1 + Y_2) = κ_j(Y_1) + κ_j(Y_2).
Let Y_1, Y_2, … be i.i.d. with mean μ = E Y_1 = 0 and all moments finite. Define S_n = n^{-1/2} (Y_1 + Y_2 + … + Y_n). Then for j ≥ 3,
κ_j(S_n) = n^{-j/2} κ_j(Σ Y_i) = n^{1-j/2} κ_j(Y_1) → 0.
⇒ S_n is asymptotically normal by a standard moment convergence argument.
A poor man's CLT, but sometimes very powerful.
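
This scaling can be checked exactly for a concrete Y. For a centered Exp(1) variable the cumulants are known in closed form (κ_1 = 0 and κ_j = (j-1)! for j ≥ 2), so κ_j(S_n) can be computed without any simulation. A sketch (function names mine):

```python
import math

def kappa_exp_centered(j):
    """Exact cumulants of Y = E - 1 with E ~ Exp(1):
    kappa_1 = 0, kappa_j = (j-1)! for j >= 2 (centering only shifts kappa_1)."""
    return 0.0 if j == 1 else float(math.factorial(j - 1))

def kappa_S(j, n):
    """j-th cumulant of S_n = n^{-1/2}(Y_1 + ... + Y_n): cumulants add over
    the i.i.d. sum and scale by c^j under Y -> cY, giving n^{1-j/2} kappa_j(Y)."""
    return n ** (1 - j / 2) * kappa_exp_centered(j)

for n in (1, 100, 10000):
    # the variance stays 1; all higher cumulants shrink to 0 (the normal limit)
    print(n, kappa_S(2, n), kappa_S(3, n), kappa_S(4, n))
```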

17 We know E N_t = (2/t) - 1 for 0<t<1, and σ²(N_t) = c/t for 0<t≤1/2. Similarly κ_j(N_t) = c_j/t, 0<t≤1/j, j = 3, 4, ….
Define I_s = N_{1/s} + 1, i.e. the number of intervals at the first time when all intervals are ≤ 1/s. Then
κ_j(I_s) = c_j s, s > j, j = 1, 2, …, with c_1 = 2, c_2 = c.

18 κ_j(I_s) = c_j s, s > j, j = 1, 2, ….
For growing s, I_s behaves more and more like an independent-increments process!!
Define W_t(x) = (t/c)^{1/2} (N_t(x) - 2x/t), 0 ≤ x ≤ 1. Then for s = 1/t and s → ∞ (i.e. t → 0),
W_t(x) ≈_D (t/c)^{1/2} (N_{t/x} - 2x/t) ≈_D (cs)^{-1/2} (I_{xs} - 2xs) →_D W(x)
because of the cumulant argument. (The proof of tightness is very unpleasant!)

19 Now W_t(x) - x W_t(1) = (t/c)^{1/2} {N_t(x) - x N_t(1)}
≈ 2(ct)^{-1/2} {N_t(x) - x N_t(1)} / N_t = 2(ct)^{-1/2} (F_{N_t}(x) - x).
Hence for t = 2/n and M = N_{2/n} ≈ n,
n^{1/2} (F_M(x) - x) →_D (c/2)^{1/2} B^0(x) = a B^0(x),
but the randomness of M is a major obstacle!

20 Now M = N_{2/n} ≈ n, but it is really nasty to show that
n^{1/2} sup_x |F_M(x) - F_n(x)| →_P 0.
But then we have
n^{1/2} (F_n(x) - x) →_D a·B^0(x), with a = (c/2)^{1/2}.

