Section 8.3 Suppose $X_1, X_2, \ldots, X_n$ are a random sample from a distribution defined by the p.d.f. $f(x)$ for $a < x < b$ and corresponding distribution function.


1 Section 8.3 Suppose $X_1, X_2, \ldots, X_n$ are a random sample from a distribution defined by the p.d.f. $f(x)$ for $a < x < b$ and corresponding distribution function $F(x)$. The random variables which order the sample from smallest to largest, $Y_1 < Y_2 < \cdots < Y_n$, are called the order statistics. Suppose $n = 2$. The space of $(X_1, X_2)$ is $\{(x_1, x_2) \mid a < x_1 < b,\ a < x_2 < b\}$. The space of $(Y_1, Y_2)$ is $\{(y_1, y_2) \mid a < y_1 < y_2 < b\}$. For a subset $A$ of the space of $(Y_1, Y_2)$, we have that $P[(Y_1, Y_2) \in A] =$

2 $$\begin{aligned}
&P[\{(X_1, X_2) \in A\} \cup \{(X_2, X_1) \in A\}] \\
&= P[(X_1, X_2) \in A] + P[(X_2, X_1) \in A] \\
&= 2\,P[(X_1, X_2) \in A] \\
&= \iint_A 2 f(x_1) f(x_2)\, dx_1\, dx_2 = \iint_A 2 f(y_1) f(y_2)\, dy_1\, dy_2 .
\end{aligned}$$
Therefore, the joint p.d.f. of $(Y_1, Y_2)$ must be $g(y_1, y_2) = 2 f(y_1) f(y_2)$ if $a < y_1 < y_2 < b$.

3 To find the p.d.f. for $Y_1$, we first find the distribution function
$$\begin{aligned}
G_1(y) = P(Y_1 \le y) &= P[\min(X_1, X_2) \le y] = 1 - P[\min(X_1, X_2) > y] \\
&= 1 - P(X_1 > y \cap X_2 > y) = 1 - P(X_1 > y)\,P(X_2 > y) \\
&= 1 - [1 - P(X_1 \le y)]\,[1 - P(X_2 \le y)] \\
&= 1 - [1 - F(y)]\,[1 - F(y)] = 1 - [1 - F(y)]^2 .
\end{aligned}$$
The p.d.f. for $Y_1$ is
$$g_1(y) = \frac{d}{dy} G_1(y) = -2[1 - F(y)][-f(y)] = 2[1 - F(y)]\, f(y) \quad \text{if } a < y < b.$$

4 To find the p.d.f. for $Y_2$, we first find the distribution function
$$G_2(y) = P(Y_2 \le y) = P[\max(X_1, X_2) \le y] = P(X_1 \le y \cap X_2 \le y) = P(X_1 \le y)\,P(X_2 \le y) = F(y)F(y) = [F(y)]^2 .$$
The p.d.f. for $Y_2$ is
$$g_2(y) = \frac{d}{dy} G_2(y) = 2F(y)\, f(y) \quad \text{if } a < y < b.$$

5 1. Suppose $X_1, X_2$ is a random sample from a distribution defined by the p.d.f. $f(x) = 2x$ if $0 < x < 1$. Let $Y_1, Y_2$ be the order statistics of the sample.
(a) Is $f(x)$ a beta p.d.f., and if yes, for what values of $\alpha$ and $\beta$? $f(x)$ is a beta p.d.f. with $\alpha = 2$ and $\beta = 1$.
(b) Find the distribution function corresponding to the p.d.f.
$$F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ x^2 & \text{if } 0 < x \le 1 \\ 1 & \text{if } 1 < x \end{cases}$$

6 (c) Find the joint p.d.f. of the order statistics $(Y_1, Y_2)$. The joint p.d.f. of $Y_1, Y_2$ is $g(y_1, y_2) = 2(2y_1)(2y_2) = 8 y_1 y_2$ if $0 < y_1 < y_2 < 1$.
(d) Find the p.d.f. of $Y_1$. The p.d.f. of $Y_1$ is $g_1(y) = 2[1 - y^2](2y) = 4y(1 - y^2)$ if $0 < y < 1$.
(e) Find the p.d.f. of $Y_2$. The p.d.f. of $Y_2$ is $g_2(y) = 2[y^2](2y) = 4y^3$ if $0 < y < 1$.
(f) Is either the p.d.f. of $Y_1$ or the p.d.f. of $Y_2$ a beta p.d.f., and if yes, for what values of $\alpha$ and $\beta$? The p.d.f. of $Y_1$ is not a beta p.d.f. The p.d.f. of $Y_2$ is a beta p.d.f. with $\alpha = 4$ and $\beta = 1$.
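The distribution functions behind parts (d) and (e), $G_1(y) = 1 - [1 - F(y)]^2$ and $G_2(y) = [F(y)]^2$ with $F(y) = y^2$, can be sanity-checked by simulation. The following is a minimal sketch, not part of the original slides; the function names, the seed, and the trial count are illustrative assumptions. It draws from $f(x) = 2x$ by inverse-c.d.f. sampling ($X = \sqrt{U}$).

```python
import random


def sample_order_stats(rng):
    """Draw X1, X2 with p.d.f. f(x) = 2x on (0, 1) and return (Y1, Y2)."""
    # Inverse-c.d.f. sampling: F(x) = x^2, so X = sqrt(U) with U ~ U(0, 1).
    x1 = rng.random() ** 0.5
    x2 = rng.random() ** 0.5
    return min(x1, x2), max(x1, x2)


def empirical_cdfs(y, trials=200_000, seed=0):
    """Estimate P(Y1 <= y) and P(Y2 <= y) by Monte Carlo (seed is arbitrary)."""
    rng = random.Random(seed)
    below_min = below_max = 0
    for _ in range(trials):
        y1, y2 = sample_order_stats(rng)
        below_min += y1 <= y
        below_max += y2 <= y
    return below_min / trials, below_max / trials


def exact_cdfs(y):
    """G1(y) = 1 - [1 - F(y)]^2 and G2(y) = [F(y)]^2 with F(y) = y^2."""
    F = y ** 2
    return 1 - (1 - F) ** 2, F ** 2


if __name__ == "__main__":
    print("simulated:", empirical_cdfs(0.5))
    print("exact:    ", exact_cdfs(0.5))
```

The simulated values should agree with the exact ones up to ordinary Monte Carlo error.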

7 Suppose $n = 3$. The space of $(X_1, X_2, X_3)$ is $\{(x_1, x_2, x_3) \mid a < x_1 < b,\ a < x_2 < b,\ a < x_3 < b\}$. The space of $(Y_1, Y_2, Y_3)$ is $\{(y_1, y_2, y_3) \mid a < y_1 < y_2 < y_3 < b\}$. For a subset $A$ of the space of $(Y_1, Y_2, Y_3)$, we have that
$$\begin{aligned}
P[(Y_1, Y_2, Y_3) \in A] &= P[\{(X_1, X_2, X_3) \in A\} \cup \{(X_1, X_3, X_2) \in A\} \cup \{(X_2, X_1, X_3) \in A\} \\
&\qquad \cup \{(X_2, X_3, X_1) \in A\} \cup \{(X_3, X_1, X_2) \in A\} \cup \{(X_3, X_2, X_1) \in A\}] \\
&= P[(X_1, X_2, X_3) \in A] + P[(X_1, X_3, X_2) \in A] + P[(X_2, X_1, X_3) \in A] \\
&\qquad + P[(X_2, X_3, X_1) \in A] + P[(X_3, X_1, X_2) \in A] + P[(X_3, X_2, X_1) \in A] \\
&= 6\,P[(X_1, X_2, X_3) \in A] = \iiint_A 6 f(x_1) f(x_2) f(x_3)\, dx_1\, dx_2\, dx_3 .
\end{aligned}$$

8 $$6\,P[(X_1, X_2, X_3) \in A] = \iiint_A 6 f(x_1) f(x_2) f(x_3)\, dx_1\, dx_2\, dx_3 = \iiint_A 6 f(y_1) f(y_2) f(y_3)\, dy_1\, dy_2\, dy_3 .$$
Therefore, the joint p.d.f. of $(Y_1, Y_2, Y_3)$ must be $g(y_1, y_2, y_3) = 6 f(y_1) f(y_2) f(y_3)$ if $a < y_1 < y_2 < y_3 < b$.

9 To find the p.d.f. for $Y_1$, we first find the distribution function
$$\begin{aligned}
G_1(y) = P(Y_1 \le y) &= P[\min(X_1, X_2, X_3) \le y] = 1 - P[\min(X_1, X_2, X_3) > y] \\
&= 1 - P(X_1 > y \cap X_2 > y \cap X_3 > y) = 1 - P(X_1 > y)\,P(X_2 > y)\,P(X_3 > y) \\
&= 1 - [1 - P(X_1 \le y)]\,[1 - P(X_2 \le y)]\,[1 - P(X_3 \le y)] \\
&= 1 - [1 - F(y)]\,[1 - F(y)]\,[1 - F(y)] = 1 - [1 - F(y)]^3 .
\end{aligned}$$
The p.d.f. for $Y_1$ is
$$g_1(y) = \frac{d}{dy} G_1(y) = -3[1 - F(y)]^2[-f(y)] = 3[1 - F(y)]^2 f(y) \quad \text{if } a < y < b.$$

10 To find the p.d.f. for $Y_2$, we first find the distribution function
$$\begin{aligned}
G_2(y) = P(Y_2 \le y) &= P[\text{at least two of } X_1, X_2, X_3 \text{ are} \le y] \\
&= \binom{3}{2}[F(y)]^2[1 - F(y)]^1 + \binom{3}{3}[F(y)]^3[1 - F(y)]^0 \\
&= 3[F(y)]^2[1 - F(y)] + [F(y)]^3 .
\end{aligned}$$
The p.d.f. for $Y_2$ is
$$g_2(y) = \frac{d}{dy} G_2(y) = 6[F(y)]\,f(y)\,[1 - F(y)] + 3[F(y)]^2[-f(y)] + 3[F(y)]^2 f(y) = 6[F(y)]\,[1 - F(y)]\, f(y) \quad \text{if } a < y < b.$$

11 To find the p.d.f. for $Y_3$, we first find the distribution function
$$G_3(y) = P(Y_3 \le y) = P[\max(X_1, X_2, X_3) \le y] = P(X_1 \le y \cap X_2 \le y \cap X_3 \le y) = P(X_1 \le y)\,P(X_2 \le y)\,P(X_3 \le y) = [F(y)]^3 .$$
The p.d.f. for $Y_3$ is
$$g_3(y) = \frac{d}{dy} G_3(y) = 3[F(y)]^2 f(y) \quad \text{if } a < y < b.$$

12 Suppose $n$ is any integer greater than 1. The space of $(X_1, X_2, \ldots, X_n)$ is $\{(x_1, x_2, \ldots, x_n) \mid a < x_1 < b,\ a < x_2 < b,\ \ldots,\ a < x_n < b\}$. The space of $(Y_1, Y_2, \ldots, Y_n)$ is $\{(y_1, y_2, \ldots, y_n) \mid a < y_1 < y_2 < \cdots < y_n < b\}$. For a subset $A$ of the space of $(Y_1, Y_2, \ldots, Y_n)$, we have that
$$\begin{aligned}
P[(Y_1, Y_2, \ldots, Y_n) \in A] &= P[\{(X_1, X_2, \ldots, X_n) \in A\} \cup \{(X_2, X_1, \ldots, X_n) \in A\} \cup \cdots] \\
&= n!\,P[(X_1, X_2, \ldots, X_n) \in A] \\
&= \int \!\cdots\! \int_A n!\, f(x_1) f(x_2) \cdots f(x_n)\, dx_1\, dx_2 \cdots dx_n \\
&= \int \!\cdots\! \int_A n!\, f(y_1) f(y_2) \cdots f(y_n)\, dy_1\, dy_2 \cdots dy_n .
\end{aligned}$$

13 Therefore, the joint p.d.f. of $(Y_1, Y_2, \ldots, Y_n)$ must be $g(y_1, y_2, \ldots, y_n) = n!\, f(y_1) f(y_2) \cdots f(y_n)$ if $a < y_1 < y_2 < \cdots < y_n < b$. Suppose $r$ is any integer from 1 to $n$. To find the p.d.f. for $Y_r$, we first find the distribution function
$$G_r(y) = P(Y_r \le y) = P[\text{at least } r \text{ of } X_1, X_2, \ldots, X_n \text{ are} \le y] = \sum_{k=r}^{n} \binom{n}{k} [F(y)]^k [1 - F(y)]^{n-k} .$$

14 The p.d.f. for $Y_r$ is
$$\begin{aligned}
g_r(y) = \frac{d}{dy} G_r(y) &= \frac{d}{dy} \sum_{k=r}^{n} \binom{n}{k} [F(y)]^k [1 - F(y)]^{n-k} \\
&= \frac{d}{dy} \sum_{k=r}^{n-1} \binom{n}{k} [F(y)]^k [1 - F(y)]^{n-k} + \frac{d}{dy} [F(y)]^n \\
&= \sum_{k=r}^{n-1} \left\{ \frac{n!}{k!\,(n-k)!}\, k\, [F(y)]^{k-1} f(y)\, [1 - F(y)]^{n-k} + \frac{n!}{k!\,(n-k)!}\, [F(y)]^k (n-k) [1 - F(y)]^{n-k-1} [-f(y)] \right\} + n [F(y)]^{n-1} f(y) .
\end{aligned}$$
Observe that when $k = r$ the second term in the braces is the negative of the first term when $k = r + 1$. This pattern continues until $k = n - 1$, when the second term is the negative of the isolated term $n[F(y)]^{n-1} f(y)$. After these cancellations only the first term with $k = r$ remains.

15 Consequently, the p.d.f. for $Y_r$ is
$$g_r(y) = \frac{n!}{(r-1)!\,(n-r)!}\, [F(y)]^{r-1} [1 - F(y)]^{n-r} f(y) \quad \text{if } a < y < b.$$
Now, go to Exercise #2:

16 2. Suppose the random sample $X_1, X_2, X_3, X_4, X_5$ is from a distribution defined by the p.d.f. $f(x) = 2x$ if $0 < x < 1$. Let $Y_1, Y_2, Y_3, Y_4, Y_5$ be the order statistics of the sample.
(a) Find the joint p.d.f. of the order statistics $(Y_1, Y_2, Y_3, Y_4, Y_5)$. The joint p.d.f. of $Y_1, Y_2, Y_3, Y_4, Y_5$ is $g(y_1, y_2, y_3, y_4, y_5) = 3840\, y_1 y_2 y_3 y_4 y_5$ if $0 < y_1 < y_2 < y_3 < y_4 < y_5 < 1$.
(b) Find the p.d.f. of $Y_1$. The p.d.f. of $Y_1$ is
$$g_1(y) = \frac{5!}{(1-1)!\,(5-1)!}\, [y^2]^{1-1} [1 - y^2]^{5-1} (2y) = 10y(1 - y^2)^4 \quad \text{if } 0 < y < 1.$$

17 (c) Find the p.d.f. of $Y_5$. The p.d.f. of $Y_5$ is
$$g_5(y) = \frac{5!}{(5-1)!\,(5-5)!}\, [y^2]^{5-1} [1 - y^2]^{5-5} (2y) = 10y^9 \quad \text{if } 0 < y < 1.$$
(d) Find the p.d.f. of $Y_3$. The p.d.f. of $Y_3$ is
$$g_3(y) = \frac{5!}{(3-1)!\,(5-3)!}\, [y^2]^{3-1} [1 - y^2]^{5-3} (2y) = 60y^5(1 - y^2)^2 \quad \text{if } 0 < y < 1.$$
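The general formula for $g_r(y)$ translates directly into code, which gives a quick way to check the closed forms in parts (b) through (d). This is a minimal sketch rather than anything from the slides; `order_stat_pdf`, `max_gap`, and the grid of test points are illustrative names and choices.

```python
from math import factorial


def order_stat_pdf(r, n, F, f):
    """Return g_r(y) = n! / ((r-1)! (n-r)!) * F(y)^(r-1) * (1-F(y))^(n-r) * f(y)."""
    coef = factorial(n) // (factorial(r - 1) * factorial(n - r))

    def g(y):
        return coef * F(y) ** (r - 1) * (1 - F(y)) ** (n - r) * f(y)

    return g


def F(x):
    """Distribution function for f(x) = 2x on (0, 1)."""
    return x ** 2


def f(x):
    """The p.d.f. used in Exercise 2."""
    return 2 * x


# Closed forms stated in Exercise 2, parts (b), (d), and (c).
closed_forms = {
    1: lambda y: 10 * y * (1 - y ** 2) ** 4,
    3: lambda y: 60 * y ** 5 * (1 - y ** 2) ** 2,
    5: lambda y: 10 * y ** 9,
}


def max_gap(r, n=5, points=99):
    """Largest absolute difference between the formula and the closed form."""
    g = order_stat_pdf(r, n, F, f)
    grid = [(i + 1) / (points + 1) for i in range(points)]
    return max(abs(g(y) - closed_forms[r](y)) for y in grid)


if __name__ == "__main__":
    for r in (1, 3, 5):
        print(r, max_gap(r))
```

Any gap reported here should be at the level of floating-point rounding only.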

18 2.-continued (e) Find $P(Y_1 \le 1/2)$.
$$P(Y_1 \le 1/2) = \int_0^{1/2} 10y(1 - y^2)^4\, dy = \int_0^{1/2} -5\,(-2y)(1 - y^2)^4\, dy = \left. -5\, \frac{(1 - y^2)^5}{5} \right|_{y=0}^{1/2} = 1 - \left(\frac{3}{4}\right)^5 = \frac{781}{1024}$$
Note that an alternative approach is to start from $P(Y_1 \le 1/2) = P[\min(X_1, X_2, X_3, X_4, X_5) \le 1/2]$.

19 $$\begin{aligned}
P(Y_1 \le 1/2) &= P[\min(X_1, X_2, X_3, X_4, X_5) \le 1/2] = 1 - P[\min(X_1, X_2, X_3, X_4, X_5) > 1/2] \\
&= 1 - P[X_1 > 1/2 \cap \cdots \cap X_5 > 1/2] = 1 - P[X_1 > 1/2] \cdots P[X_5 > 1/2] \\
&= 1 - [1 - 1/4]^5 = 1 - \left(\frac{3}{4}\right)^5 = \frac{781}{1024}
\end{aligned}$$

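20 2.-continued (f) Find $P(Y_5 \le 1/2)$.
$$P(Y_5 \le 1/2) = \int_0^{1/2} 10y^9\, dy = \left. y^{10} \right|_{y=0}^{1/2} = \left(\frac{1}{2}\right)^{10} = \frac{1}{1024}$$
Note that an alternative approach is
$$P(Y_5 \le 1/2) = P[\max(X_1, X_2, X_3, X_4, X_5) \le 1/2] = P[X_1 \le 1/2 \cap \cdots \cap X_5 \le 1/2] = P[X_1 \le 1/2] \cdots P[X_5 \le 1/2] = \left([1/2]^2\right)^5 = \left(\frac{1}{2}\right)^{10} = \frac{1}{1024}$$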

21 2.-continued (g) Find $P(Y_3 \le 1/2)$.
$$P(Y_3 \le 1/2) = \int_0^{1/2} 60y^5(1 - y^2)^2\, dy$$
Since this looks hard to integrate, we shall use an alternative approach:
$$\begin{aligned}
P(Y_3 \le 1/2) &= P[\text{at least three of } X_1, X_2, X_3, X_4, X_5 \text{ are} \le 1/2] \\
&= \binom{5}{3} [1/4]^3 [1 - 1/4]^2 + \binom{5}{4} [1/4]^4 [1 - 1/4]^1 + [1/4]^5 \\
&= 10 \left(\frac{1}{4}\right)^3 \left(\frac{3}{4}\right)^2 + 5 \left(\frac{1}{4}\right)^4 \left(\frac{3}{4}\right)^1 + \left(\frac{1}{4}\right)^5 = \frac{106}{1024} = \frac{53}{512}
\end{aligned}$$
Note that this probability can be read as 0.1035 from Table II in the appendix of the textbook.
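The "at least $r$ of the $X_i$ are $\le y$" argument is a binomial tail sum, so parts (e), (f), and (g) can all be reproduced exactly with rational arithmetic. The sketch below is illustrative only; `order_stat_cdf` is a hypothetical helper name, and it takes the value $F(y)$ rather than $y$ itself.

```python
from fractions import Fraction
from math import comb


def order_stat_cdf(r, n, F_y):
    """G_r(y) = sum_{k=r}^{n} C(n, k) F(y)^k (1 - F(y))^(n-k), computed exactly."""
    F_y = Fraction(F_y)
    return sum(comb(n, k) * F_y ** k * (1 - F_y) ** (n - k) for k in range(r, n + 1))


# For f(x) = 2x on (0, 1), F(1/2) = (1/2)^2 = 1/4.
F_half = Fraction(1, 2) ** 2

if __name__ == "__main__":
    for r in (1, 3, 5):
        value = order_stat_cdf(r, 5, F_half)
        print(f"P(Y_{r} <= 1/2) = {value} ~ {float(value):.4f}")
```

Working with `Fraction` avoids any rounding, so the results can be compared directly with the fractions on the slides.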

22 3. Suppose the random sample $X_1, X_2, \ldots, X_n$ is from a $U(0, 1)$ distribution. Let $Y_1, Y_2, \ldots, Y_n$ be the order statistics of the sample. (Note: Parts of this Exercise are the same as Text Exercise 8.3-6.)
(a) Find the distribution function corresponding to the $U(0, 1)$ distribution.
$$F(x) = \begin{cases} 0 & \text{if } x \le 0 \\ x & \text{if } 0 < x \le 1 \\ 1 & \text{if } 1 < x \end{cases}$$
(b) Find the joint p.d.f. of the order statistics $(Y_1, Y_2, \ldots, Y_n)$. The joint p.d.f. of $Y_1, Y_2, \ldots, Y_n$ is $g(y_1, y_2, \ldots, y_n) = n!$ if $0 < y_1 < y_2 < \cdots < y_n < 1$.

23 (c) Find the p.d.f. of $Y_r$ where $r$ is any integer from 1 to $n$. The p.d.f. of $Y_r$ is
$$g_r(y) = \frac{n!}{(r-1)!\,(n-r)!}\, y^{r-1} (1 - y)^{n-r} \quad \text{if } 0 < y < 1.$$
Realizing that $\Gamma(n + 1) = n!$, $\Gamma(r) = (r - 1)!$, and $\Gamma(n - r + 1) = (n - r)!$, we find that $Y_r$ has a beta distribution with $\alpha = r$ and $\beta = n - r + 1$. This is essentially what Text Exercise 8.3-6(c) says to show.

24 3.-continued (d) Find the mean and variance of $Y_r$ where $r$ is any integer from 1 to $n$.
$$E(Y_r) = \frac{\alpha}{\alpha + \beta} = \frac{r}{n + 1}$$
$$\operatorname{Var}(Y_r) = \frac{\alpha\beta}{(\alpha + \beta + 1)(\alpha + \beta)^2} = \frac{r(n - r + 1)}{(n + 2)(n + 1)^2}$$

25 (e) Find $E(Y_{r+1} - Y_r)$ where $r$ is any integer from 1 to $n - 1$.
$$E(Y_{r+1} - Y_r) = \frac{r + 1}{n + 1} - \frac{r}{n + 1} = \frac{1}{n + 1}$$
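The beta moments for uniform order statistics are easy to tabulate exactly. The following sketch uses the formulas from parts (d) and (e) with $\alpha = r$ and $\beta = n - r + 1$; the helper name `uniform_order_stat_moments` and the choice $n = 3$ are illustrative assumptions.

```python
from fractions import Fraction


def uniform_order_stat_moments(r, n):
    """Mean and variance of Y_r for a U(0, 1) sample, via Beta(r, n - r + 1)."""
    alpha, beta = r, n - r + 1
    mean = Fraction(alpha, alpha + beta)
    var = Fraction(alpha * beta, (alpha + beta + 1) * (alpha + beta) ** 2)
    return mean, var


if __name__ == "__main__":
    n = 3  # arbitrary sample size for illustration
    for r in range(1, n + 1):
        mean, var = uniform_order_stat_moments(r, n)
        print(f"r={r}: E(Y_r)={mean}, Var(Y_r)={var}")
```

Consecutive means differ by $1/(n+1)$, which is the statement of part (e): on average the order statistics split $(0, 1)$ into $n + 1$ equal pieces.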

26 4. (a) Let $Q$ have a $U(0, 1)$ distribution. For constants $b > a$, define the random variable $X = (b - a)Q + a$. Find the distribution function for $X$, find the p.d.f. for $X$, and state what type of distribution $X$ has.
The distribution function for $Q$ is
$$F(q) = P(Q \le q) = \begin{cases} 0 & \text{if } q \le 0 \\ q & \text{if } 0 < q \le 1 \\ 1 & \text{if } 1 < q \end{cases}$$
The space for $X$ is $\{x : a < x < b\}$. The distribution function for $X$ is
$$G(x) = P(X \le x) = P([b - a]Q + a \le x) = P(Q \le [x - a]/[b - a]) = \frac{x - a}{b - a} \quad \text{for } a < x < b.$$
The p.d.f. for $X$ is $g(x) = \dfrac{1}{b - a}$ for $a < x < b$. We see then that $X$ has a $U(a, b)$ distribution.

27 (b) Let $Q_1, Q_2, Q_3$ be a random sample selected from the $U(0, 1)$ distribution, and let $V_1, V_2, V_3$ be the order statistics. Also, let $X_1 = (b - a)Q_1 + a$, $X_2 = (b - a)Q_2 + a$, $X_3 = (b - a)Q_3 + a$, and let $Y_1, Y_2, Y_3$ be the order statistics, which implies $Y_1 = (b - a)V_1 + a$, $Y_2 = (b - a)V_2 + a$, $Y_3 = (b - a)V_3 + a$. State why $X_1, X_2, X_3$ is a random sample, use part (a) to find the type of distribution this random sample is from, and use Class Exercise #3 to find $E(Y_1)$, $\operatorname{Var}(Y_1)$, $E(Y_2)$, $\operatorname{Var}(Y_2)$, $E(Y_3)$, $\operatorname{Var}(Y_3)$, and $E(Y_1 Y_3)$.
Since $Q_1, Q_2, Q_3$ are independent, $X_1, X_2, X_3$ are independent, and this together with part (a) implies that $X_1, X_2, X_3$ is a random sample from a $U(a, b)$ distribution.

28 4.-continued
$$E(Y_1) = E([b - a]V_1 + a) = [b - a]E(V_1) + a = [b - a]\frac{r}{n + 1} + a = [b - a]\frac{1}{3 + 1} + a = \frac{b + 3a}{4}$$
$$\operatorname{Var}(Y_1) = \operatorname{Var}([b - a]V_1 + a) = (b - a)^2 \operatorname{Var}(V_1) = (b - a)^2 \frac{r(n - r + 1)}{(n + 2)(n + 1)^2} = (b - a)^2 \frac{1(3 - 1 + 1)}{(3 + 2)(3 + 1)^2} = \frac{3(b - a)^2}{80}$$
$$E(Y_2) = E([b - a]V_2 + a) = [b - a]E(V_2) + a = [b - a]\frac{r}{n + 1} + a = [b - a]\frac{2}{3 + 1} + a = \frac{b + a}{2}$$

29 $$\operatorname{Var}(Y_2) = \operatorname{Var}([b - a]V_2 + a) = (b - a)^2 \operatorname{Var}(V_2) = (b - a)^2 \frac{r(n - r + 1)}{(n + 2)(n + 1)^2} = (b - a)^2 \frac{2(3 - 2 + 1)}{(3 + 2)(3 + 1)^2} = \frac{(b - a)^2}{20}$$
$$E(Y_3) = E([b - a]V_3 + a) = [b - a]E(V_3) + a = [b - a]\frac{r}{n + 1} + a = [b - a]\frac{3}{3 + 1} + a = \frac{3b + a}{4}$$
$$\operatorname{Var}(Y_3) = \operatorname{Var}([b - a]V_3 + a) = (b - a)^2 \operatorname{Var}(V_3) = (b - a)^2 \frac{r(n - r + 1)}{(n + 2)(n + 1)^2} = (b - a)^2 \frac{3(3 - 3 + 1)}{(3 + 2)(3 + 1)^2} = \frac{3(b - a)^2}{80}$$

30 4.-continued
$$\begin{aligned}
E(Y_1 Y_3) &= E\{([b - a]V_1 + a)([b - a]V_3 + a)\} \\
&= E\{[b - a]^2 V_1 V_3 + a[b - a]V_1 + a[b - a]V_3 + a^2\} \\
&= [b - a]^2 E(V_1 V_3) + a[b - a]E(V_1) + a[b - a]E(V_3) + a^2
\end{aligned}$$
To find $E(V_1 V_3)$, we first recall from part (b) of Class Exercise #3 that the joint p.d.f. of $(V_1, V_2, V_3)$ is $g(v_1, v_2, v_3) = 6$ if $0 < v_1 < v_2 < v_3 < 1$.
$$E(V_1 V_3) = \int_0^1 \int_0^{v_3} \int_0^{v_2} 6 v_1 v_3\, dv_1\, dv_2\, dv_3 = \int_0^1 \int_0^{v_3} \left. 3 v_1^2 v_3 \right|_{v_1 = 0}^{v_2} dv_2\, dv_3 =$$

31 $$\int_0^1 \int_0^{v_3} 3 v_2^2 v_3\, dv_2\, dv_3 = \int_0^1 \left. v_2^3 v_3 \right|_{v_2 = 0}^{v_3} dv_3 = \int_0^1 v_3^4\, dv_3 = \left. \frac{v_3^5}{5} \right|_{v_3 = 0}^{1} = \frac{1}{5}$$
$$E(Y_1 Y_3) = [b - a]^2 E(V_1 V_3) + a[b - a]E(V_1) + a[b - a]E(V_3) + a^2 = [b - a]^2 \cdot \frac{1}{5} + a[b - a] \cdot \frac{1}{4} + a[b - a] \cdot \frac{3}{4} + a^2 = \frac{[b - a]^2}{5} + ab$$
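The closed form $E(Y_1 Y_3) = (b - a)^2/5 + ab$ can be checked by simulating samples of size 3 from $U(a, b)$ through the transformation $X = (b - a)Q + a$. This is a rough Monte Carlo sketch under assumed values ($a = 2$, $b = 6$, an arbitrary seed and trial count), not a derivation.

```python
import random


def simulate_mean_y1y3(a, b, trials=200_000, seed=1):
    """Monte Carlo estimate of E(Y1 * Y3) for a sample of size 3 from U(a, b)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # X = (b - a) Q + a with Q ~ U(0, 1), as in part (a).
        xs = [(b - a) * rng.random() + a for _ in range(3)]
        total += min(xs) * max(xs)  # Y1 is the minimum, Y3 the maximum
    return total / trials


def exact_mean_y1y3(a, b):
    """Closed form from the exercise: (b - a)^2 / 5 + a * b."""
    return (b - a) ** 2 / 5 + a * b


if __name__ == "__main__":
    a, b = 2.0, 6.0  # illustrative endpoints, not from the slides
    print("simulated:", simulate_mean_y1y3(a, b))
    print("exact:    ", exact_mean_y1y3(a, b))
```

The two numbers should agree to within simulation noise; a persistent gap would point to an algebra slip in the expansion.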

32 Consequently, the p.d.f. for $Y_r$ is
$$g_r(y) = \frac{n!}{(r-1)!\,(n-r)!}\, [F(y)]^{r-1} [1 - F(y)]^{n-r} f(y) \quad \text{if } a < y < b.$$
Recall that the (100p)th percentile of the distribution defined by p.d.f. $f(x)$ is a number $\pi_p$ such that
$$\int_{-\infty}^{\pi_p} f(x)\, dx = F(\pi_p) = p,$$
which motivates the following definition:

33 The (100p)th percentile of the sample $X_1, X_2, \ldots, X_n$ is defined to be
$Y_r$ where $r = (n+1)p$, if $(n+1)p$ is an integer;
a weighted average of $Y_r$ and $Y_{r+1}$ where $r = \lfloor (n+1)p \rfloor$, if $(n+1)p$ is not an integer.
Note: This definition is extended to an observed sample of values $x_1, x_2, \ldots, x_n$ where the ordered values in the sample are represented by $y_1, y_2, \ldots, y_n$. The detailed definition of sample order statistics was given in Section 3.2.

34 5. Find the 40th percentile and the 80th percentile for data of Text Example 8.3-5.
1013 1019 1021 1024 1026 1028
1033 1035 1039 1040 1043 1047
The location of the 40th percentile is $(n + 1)p = (13)(0.40) = 5.2$.
40th percentile $= y_5 + (0.2)(y_6 - y_5) = 1026 + (0.2)(1028 - 1026) = 1026.4$
The location of the 80th percentile is $(n + 1)p = (13)(0.80) = 10.4$.
80th percentile $= y_{10} + (0.4)(y_{11} - y_{10}) = 1040 + (0.4)(1043 - 1040) = 1041.2$
The detailed definition of sample order statistics was given in Section 3.2, and an Excel spreadsheet was constructed to find sample order statistics. Recall that the Excel formulas were slightly different.
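The $(n + 1)p$ rule used for these two percentiles can be written as a short function. The sketch below assumes the weighted average is the linear interpolation shown in the computations above; `sample_percentile` is a hypothetical name, and the error raised for out-of-range locations is an added design choice, not something the slides specify.

```python
import math


def sample_percentile(values, p):
    """(100p)th sample percentile using the (n + 1)p location rule."""
    y = sorted(values)            # y[0], ..., y[n-1] are y_1, ..., y_n
    n = len(y)
    location = (n + 1) * p
    r = math.floor(location)
    frac = location - r
    if r < 1 or r > n or (frac > 0 and r == n):
        raise ValueError("location (n + 1)p falls outside the sample")
    if frac == 0:
        return y[r - 1]           # (n + 1)p is an integer: use y_r
    # Otherwise interpolate between y_r and y_{r+1}.
    return y[r - 1] + frac * (y[r] - y[r - 1])


data = [1013, 1019, 1021, 1024, 1026, 1028,
        1033, 1035, 1039, 1040, 1043, 1047]

if __name__ == "__main__":
    print(sample_percentile(data, 0.40))
    print(sample_percentile(data, 0.80))
```

Because the location is computed in floating point, the printed values may differ from 1026.4 and 1041.2 in the last decimal places.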

