Kakutani’s interval splitting scheme
Willem R. van Zwet, University of Leiden
Bahadur lectures, Chicago 2005

Kakutani's interval splitting scheme
Random variables X_1, X_2, …:
X_1 has a uniform distribution on (0,1);
given X_1, X_2, …, X_{k-1}, the conditional distribution of X_k is uniform on the longest of the k subintervals created by 0, 1, X_1, X_2, …, X_{k-1}.
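As a concrete illustration (not from the lecture; a minimal sketch with a function name of my own choosing), the scheme is straightforward to simulate:

```python
import numpy as np

def kakutani_points(n, rng=None):
    """First n points X_1, ..., X_n of Kakutani's interval splitting scheme:
    each new point is uniform on the currently longest subinterval of (0,1)."""
    rng = np.random.default_rng() if rng is None else rng
    endpoints = [0.0, 1.0]          # sorted endpoints of the current subintervals
    points = []
    for _ in range(n):
        gaps = np.diff(endpoints)   # current subinterval lengths
        k = int(np.argmax(gaps))    # index of the longest subinterval
        x = rng.uniform(endpoints[k], endpoints[k + 1])
        endpoints.insert(k + 1, x)  # keep the endpoint list sorted
        points.append(x)
    return np.array(points)
```

For example, kakutani_points(1000) returns the first 1000 points in the order in which they were placed.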

0———x_1——————————1
0———x_1———————x_2——1
0———x_1——x_3————x_2——1
0———x_1——x_3——x_4—x_2——1
Kakutani (1975): As n → ∞, do these points become evenly (i.e. uniformly) distributed in (0,1)?

Empirical distribution function of X_1, X_2, …, X_n:
F_n(x) = n^{-1} Σ_{1 ≤ i ≤ n} 1_{(0,x]}(X_i).
Uniform d.f. on (0,1): F(x) = P(X_1 ≤ x) = x, x ∈ (0,1).
Formal statement of Kakutani's question:
(*) lim_{n→∞} sup_{0<x<1} |F_n(x) − x| = 0 with probability 1?
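Kakutani's question can also be probed numerically: for a sample from (0,1), sup_{0<x<1} |F_n(x) − x| is the Kolmogorov-Smirnov distance to the uniform d.f., which for the ordered sample X_(1) ≤ … ≤ X_(n) equals max_i max(i/n − X_(i), X_(i) − (i−1)/n). A sketch (my own code; sup_distance is a hypothetical helper name):

```python
import numpy as np

def sup_distance(points):
    """sup_{0<x<1} |F_n(x) - x| for a sample from (0,1), i.e. the
    Kolmogorov-Smirnov distance to the uniform distribution function."""
    x = np.sort(np.asarray(points))
    n = x.size
    i = np.arange(1, n + 1)
    # F_n jumps at the order statistics, so the sup is attained at a jump
    return max(np.max(i / n - x), np.max(x - (i - 1) / n))

# i.i.d. uniform sample for comparison; feed kakutani_points(n) from the
# sketch above to check Kakutani's conjecture empirically
print(sup_distance(np.random.default_rng(0).uniform(size=10_000)))
```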

We know: if X_1, X_2, …, X_n are independent and each is uniformly distributed on (0,1), then (*) is true (Glivenko-Cantelli). So (*) is "obviously" true in this case too!! However, the joint distribution of even the first five points is already utterly hopeless to work with!!

Stopping rule:
0<t<1: N_t = first n for which all subintervals have length ≤ t;
t ≥ 1: N_t = 0.
The stopped sequence X_1, X_2, …, X_{N_t} has the property that any subinterval will receive another random point before we stop iff it is longer than t.
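A small sketch of this stopping variable (again my own code, not part of the slides): run the scheme until every subinterval has length at most t and count the points placed.

```python
import numpy as np

def stopped_count(t, rng=None):
    """N_t: number of points placed by Kakutani's scheme at the first time
    all subintervals of (0,1) have length <= t (N_t = 0 for t >= 1)."""
    rng = np.random.default_rng() if rng is None else rng
    endpoints = [0.0, 1.0]
    n = 0
    while max(np.diff(endpoints)) > t:
        gaps = np.diff(endpoints)
        k = int(np.argmax(gaps))
        x = rng.uniform(endpoints[k], endpoints[k + 1])
        endpoints.insert(k + 1, x)
        n += 1
    return n

# E N_t should be close to 2/t - 1 (derived on the next slides)
print(np.mean([stopped_count(0.01) for _ in range(200)]), 2 / 0.01 - 1)
```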

May change order and blow up: given that X_1 = x,
0———x—————————1
L(N_t | X_1 = x) = L(N_{t/x} + N*_{t/(1−x)} + 1), 0<t<1,
where N_{t/x} and N*_{t/(1−x)} are independent copies: the subintervals (0,x) and (x,1), blown up to unit length, each carry a fresh copy of the scheme, run until their own pieces are ≤ t/x and ≤ t/(1−x) respectively.
L(Z) denotes the distribution (law) of Z.

L(N_t | X_1 = x) = L(N_{t/x} + N*_{t/(1−x)} + 1), 0<t<1, with independent copies on the right
⇒ μ(t) = E N_t = ∫_{(0,1)} { μ(t/x) + μ(t/(1−x)) + 1 } dx
⇒ μ(t) = E N_t = (2/t) − 1, 0<t<1.
Similarly σ²(t) = E(N_t − μ(t))² = c/t, 0<t ≤ 1/2.
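A quick sanity check, not on the slides: the claimed solution μ(t) = 2/t − 1 does satisfy the integral equation, using μ(s) = E N_s = 0 for s ≥ 1:

```latex
\[
\int_0^1 \mu\!\left(\tfrac{t}{x}\right)dx
  = \int_t^1 \left(\tfrac{2x}{t} - 1\right) dx
  = \left[\tfrac{x^2}{t} - x\right]_t^1
  = \tfrac{1}{t} - 1,
\]
```

since μ(t/x) = 0 for x ≤ t; by symmetry the μ(t/(1−x)) term contributes the same, and the constant term contributes 1, giving μ(t) = 2(1/t − 1) + 1 = 2/t − 1.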

μ(t) = E N_t = (2/t) − 1, 0<t<1.
σ²(t) = E(N_t − μ(t))² = c/t, 0<t ≤ 1/2.
⇒ E{N_t / (2/t)} = 1 − t/2 → 1 as t → 0,
σ²(N_t / (2/t)) = ct/4 → 0 as t → 0.
⇒ lim_{t→0} N_t / (2/t) = 1 w.p. 1.
We have built a clock! As t → 0, the longest interval (length t) tells the time n ≈ 2/t.
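One line of glue (my addition): by Chebyshev's inequality,

```latex
\[
P\!\left( \left| \frac{N_t}{2/t} - \Bigl(1 - \tfrac{t}{2}\Bigr) \right| > \varepsilon \right)
  \;\le\; \frac{\sigma^2\!\left(N_t/(2/t)\right)}{\varepsilon^2}
  \;=\; \frac{ct}{4\varepsilon^2} \;\to\; 0
  \qquad (t \to 0),
\]
```

which gives convergence in probability; strengthening this to "with probability 1", as claimed above, uses in addition a subsequence argument and the monotonicity of N_t in t.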

N_t ~ 2/t as t → 0 w.p. 1.
Define
N_t(x) = Σ_{i=1}^{N_t} 1_{(0,x]}(X_i), x ∈ (0,1).
N_t(x) ~ N_{t/x} ~ 2x/t as t → 0 w.p. 1.
N_t(x):  0—.—.—x—.——.—.——.—.——1
N_{t/x}: 0—.—.—x—.——.—.——.—.——1
⇒ F_{N_t}(x) = N_t(x) / N_t → x as t → 0 w.p. 1.

F_{N_t}(x) = N_t(x) / N_t → x as t → 0 w.p. 1,
and as N_t → ∞ when t → 0,
F_n(x) → x as n → ∞ w.p. 1
⇒ sup_{0<x<1} |F_n(x) − x| → 0 w.p. 1.
Kakutani was right (vZ, Ann. Prob. 1978).

We want to show that F_n(x) → x faster than in the i.i.d. uniform case, e.g. by considering the stochastic processes
B_n(x) = n^{1/2} (F_n(x) − x), 0 ≤ x ≤ 1.
If X_1, X_2, …, X_n are independent and uniformly distributed on (0,1), then B_n →_D B^0 as n → ∞, where →_D refers to convergence in distribution (i.e. convergence of expectations of bounded continuous functionals) and B^0 denotes the Brownian bridge.

Refresher course 1
W: Wiener process on [0,1], i.e. W = {W(t): 0 ≤ t ≤ 1} with
W(0) = 0;
W(t) has a normal (Gaussian) distribution with mean zero and variance E W²(t) = t;
W has independent increments.
B^0: Brownian bridge on [0,1], i.e. B^0 = {B^0(t): 0 ≤ t ≤ 1} is distributed as W conditioned on W(1) = 0.
Fact: {W(t) − t W(1): 0 ≤ t ≤ 1} is distributed as B^0.
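A one-line check of the stated fact (my addition), via covariances: for 0 ≤ s ≤ t ≤ 1,

```latex
\[
\operatorname{Cov}\bigl(W(s) - sW(1),\; W(t) - tW(1)\bigr)
  = s - st - st + st
  = s(1 - t)
  = \operatorname{Cov}\bigl(B^0(s),\, B^0(t)\bigr),
\]
```

and both processes are centered Gaussian, so equal covariances imply equal distributions.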

So:
If X_1, X_2, …, X_n are independent and uniformly distributed on (0,1), then B_n →_D B^0 as n → ∞.
If X_1, X_2, …, X_n are generated by Kakutani's scheme, then (Pyke & vZ, Ann. Prob. 2004)
B_n →_D a·B^0 as n → ∞, with a = ½ σ(N_{1/2}) = (4 log 2 − 5/2)^{1/2} = …
Half a Brownian bridge! Converges twice as fast!
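A Monte Carlo sketch of the "twice as fast" claim (my own code, with hypothetical function names): estimate the standard deviation of B_n(1/2) = n^{1/2}(F_n(1/2) − 1/2) under both schemes. Since Var B^0(1/2) = 1/4 and a = (4 log 2 − 5/2)^{1/2} ≈ 0.52, the i.i.d. value should be near 1/2 and the Kakutani value near a/2 ≈ 0.26.

```python
import numpy as np

def kakutani_points(n, rng):
    """First n points of Kakutani's interval splitting scheme."""
    endpoints = [0.0, 1.0]
    pts = []
    for _ in range(n):
        gaps = np.diff(endpoints)
        k = int(np.argmax(gaps))
        x = rng.uniform(endpoints[k], endpoints[k + 1])
        endpoints.insert(k + 1, x)
        pts.append(x)
    return np.array(pts)

def b_n_half(points):
    """B_n(1/2) = sqrt(n) * (F_n(1/2) - 1/2)."""
    n = len(points)
    return np.sqrt(n) * (np.mean(points <= 0.5) - 0.5)

rng = np.random.default_rng(0)
n, reps = 500, 400
iid = [b_n_half(rng.uniform(size=n)) for _ in range(reps)]
kak = [b_n_half(kakutani_points(n, rng)) for _ in range(reps)]
print("sd under i.i.d. uniforms:", np.std(iid))   # should be near 1/2
print("sd under Kakutani scheme:", np.std(kak))   # should be near a/2, about 0.26
```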

Refresher course 2
Y: random variable with finite k-th moment μ_k = E Y^k = ∫ Y^k dP < ∞ and characteristic function
ψ(t) = E e^{itY} = 1 + Σ_{1 ≤ j ≤ k} μ_j (it)^j / j! + o(t^k).
Then
log ψ(t) = Σ_{1 ≤ j ≤ k} κ_j (it)^j / j! + o(t^k).
κ_j: j-th cumulant.
κ_1 = μ_1 = E Y;  κ_2 = σ² = E(Y − μ_1)²;
κ_3 = E(Y − μ_1)³;  κ_4 = E(Y − μ_1)⁴ − 3σ⁴; etc.
κ_j = 0 for all j ≥ 3 iff Y is normal.

If Y_1 and Y_2 are independent, then characteristic functions multiply and hence cumulants add up:
κ_j(Y_1 + Y_2) = κ_j(Y_1) + κ_j(Y_2).
(Also κ_j(cY) = c^j κ_j(Y) for a constant c.)
Let Y_1, Y_2, … be i.i.d. with mean μ = E Y_1 = 0 and all moments finite. Define
S_n = n^{−1/2} (Y_1 + Y_2 + … + Y_n).
Then for j ≥ 3,
κ_j(S_n) = n^{−j/2} κ_j(Σ Y_i) = n^{1−j/2} κ_j(Y_1) → 0.
⇒ S_n asymptotically normal by a standard moment convergence argument.
Poor man's CLT, but sometimes very powerful.

We know
E N_t = (2/t) − 1, 0<t<1,
σ²(N_t) = c/t, 0<t ≤ ½.
Similarly
κ_j(N_t) = c_j/t, 0<t ≤ 1/j, j = 3, 4, ….
Define I_s = N_{1/s} + 1, i.e. the number of intervals at the first time when all intervals are ≤ 1/s. Then
κ_j(I_s) = c_j s, s > j, j = 1, 2, …, with c_1 = 2, c_2 = c.

κ_j(I_s) = c_j s, s > j, j = 1, 2, ….
For growing s, I_s behaves more and more like an independent increments process!!
Define
W_t(x) = (t/c)^{1/2} (N_t(x) − 2x/t), 0 ≤ x ≤ 1.
Then for s = 1/t and s → ∞ (i.e. t → 0),
W_t(x) ≈_D (t/c)^{1/2} (N_{t/x} − 2x/t) ≈_D (cs)^{−1/2} (I_{xs} − 2xs) →_D W(x)
because of the cumulant argument.
(The proof of tightness is very unpleasant!)

Now
W_t(x) − x W_t(1) = (t/c)^{1/2} {N_t(x) − x N_t(1)}
≈ 2(ct)^{−1/2} {N_t(x) − x N_t(1)} / N_t = 2(ct)^{−1/2} (F_{N_t}(x) − x).
Hence for t = 2/n and M = N_{2/n} ≈ n,
n^{1/2} (F_M(x) − x) →_D (c/2)^{1/2} B^0(x) = a B^0(x),
but the randomness of M is a major obstacle!
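Spelling out the constant in the display above (my addition): with t = 2/n and M = N_{2/n},

```latex
\[
2(ct)^{-1/2}\bigl(F_{N_t}(x) - x\bigr)
  = 2\left(\frac{2c}{n}\right)^{-1/2}\bigl(F_M(x) - x\bigr)
  = \left(\frac{2n}{c}\right)^{1/2}\bigl(F_M(x) - x\bigr)
  \;\to_D\; B^0(x),
\]
```

so n^{1/2}(F_M(x) − x) →_D (c/2)^{1/2} B^0(x) = a B^0(x).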

Now M = N_{2/n} ≈ n, but it is really nasty to show that
n^{1/2} sup_x |F_M(x) − F_n(x)| →_P 0.
But then we have
n^{1/2} (F_n(x) − x) →_D a·B^0(x), with a = (c/2)^{1/2}.