Statistical Inference

2 Statistical Inference
Most data come in the form of numbers.
We have seen methods to describe and summarise patterns in data.
Most data are samples (subsets) of the population of interest.
Random variables and their probability distributions describe patterns in populations.

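3 Probability Distribution of a Discrete r.v.
The probabilities may be written as: \( P(X = x_i) = p_i \), for \( i = 1, \dots, k \).
\( P(X = x_i) \) is also referred to as the density function \( f(x_i) \).
The cumulative distribution function (c.d.f.) is defined as \( F(x) = P(X \le x) = \sum_{x_i \le x} f(x_i) \).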
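
As a rough sketch of how the density function and the c.d.f. relate, the R snippet below builds the c.d.f. of a small discrete r.v. by accumulating its probabilities. The values and probabilities are made up for illustration and do not come from the slides.

```r
# Hypothetical discrete r.v.: values and probabilities chosen only for illustration
x <- c(0, 1, 2, 3)
f <- c(0.1, 0.2, 0.3, 0.4)   # f(x) = P(X = x); must sum to 1

# The c.d.f. P(X <= x) is the running total of the probabilities
cdf <- cumsum(f)

data.frame(x = x, f = f, cdf = cdf)
```

The last entry of cdf is always 1, because it adds up every probability.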

4 Discrete Random Variables
Examples of a discrete uniform distribution: 1 coin toss, 1 fair die throw.
  x       1     2     ...   n
  f(x)    1/n   1/n   ...   1/n
We now look at non-uniform distributions.

5 DISCRETE DISTRIBUTIONS
Example - Family of 3 children. Let the random variable (r.v.) X = number of girls.
Possible values:
  X = 3: GGG
  X = 2: GGB, GBG, BGG
  X = 1: BBG, BGB, GBB
  X = 0: BBB
Assume the 8 outcomes are equally likely, so that
  x           0     1     2     3
  P(X = x)    1/8   3/8   3/8   1/8
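
The counting argument for the family of 3 children can be reproduced in R by listing all 8 outcomes and tabulating the number of girls. This is only a sketch; the variable names are illustrative, and it relies on the same equally-likely assumption as the slide.

```r
# Enumerate all 2^3 birth orders; each row is one equally likely outcome
kids <- c("B", "G")
outcomes <- expand.grid(c1 = kids, c2 = kids, c3 = kids,
                        stringsAsFactors = FALSE)

# X = number of girls in each outcome
girls <- rowSums(outcomes == "G")

# Relative frequency of each value of X gives P(X = x)
pmf <- table(girls) / nrow(outcomes)
pmf
```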

6 Example - Bernoulli trials
Each trial is an 'experiment' with exactly 2 possible outcomes, "success" and "failure", with probabilities p and 1-p.
Let X = 1 if success, 0 if failure. The probability distribution is
  x           0      1
  P(X = x)    1-p    p
Results for Bernoulli trials can be simulated using R, e.g. simulate the results of a drug trial in which success (cure) has probability p = 0.3 for each patient, with 100 patients in the trial:
  p <- 0.3
  result <- rbinom(100, size = 1, prob = p)
result is a vector of length 100 that looks like 1, 0, 0, 1, 0, 1, ...

7 Example - Binomial Experiment
Generalisation of Bernoulli trials: X ~ Bin(n,p) means X = # of successes in n independent Bernoulli trials, each with success probability p.
e.g. X = # of heads in 10 tosses of a coin, n = , p =
e.g. X = # of boys in a family of 5 children, n = , p =
e.g. X = # of sixes in 100 rolls of a die, n = , p =
Possible values for X =
Probability distribution for X (q = 1-p): \( P(X = k) = \binom{n}{k} p^k q^{n-k} \), the general term of the binomial expansion of \( (p + q)^n \).
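
To connect the binomial formula with R, the sketch below computes the binomial probabilities for the coin example in two ways: from the binomial coefficient directly and with dbinom(). The slide does not say the coin is fair, so p = 0.5 is an assumption made here for illustration.

```r
n <- 10    # 10 tosses, as in the coin example
p <- 0.5   # ASSUMPTION: the coin is fair (the slide leaves p blank)
k <- 0:n   # candidate numbers of heads

# Direct formula: choose(n, k) * p^k * q^(n - k), with q = 1 - p
by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)

# Built-in binomial probabilities
by_dbinom <- dbinom(k, size = n, prob = p)

all.equal(by_formula, by_dbinom)  # checks that the two agree
sum(by_formula)                   # total probability, (p + q)^n
```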

8 Shape of the Binomial Distribution
The shape of the binomial distribution depends on the values of n and p.
  probdistr <- dbinom(x = 0:n, size = n, prob = p)
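
One way to see the dependence on p is to tabulate the distribution for a few values side by side. The choices n = 10 and p = 0.1, 0.5, 0.9 below are illustrative assumptions, not values from the slides.

```r
n  <- 10                 # ASSUMPTION: illustrative number of trials
ps <- c(0.1, 0.5, 0.9)   # ASSUMPTION: illustrative success probabilities

# One column of probabilities P(X = 0), ..., P(X = n) per value of p
shapes <- sapply(ps, function(p) dbinom(x = 0:n, size = n, prob = p))
dimnames(shapes) <- list(x = 0:n, p = ps)

round(shapes, 3)

# Most likely value of X for each p: small p piles mass near 0, large p near n
modes <- (0:n)[apply(shapes, 2, which.max)]
modes
```

Passing any single column to barplot() draws the corresponding shape: skewed for p far from 0.5 and symmetric at p = 0.5.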

9 Expected Value of a Random Variable
If the probability distribution of a random variable X is
  Values of X      x_1   x_2   ...   x_k
  Probabilities    p_1   p_2   ...   p_k
its expected value is \( E(X) = x_1 p_1 + x_2 p_2 + \dots + x_k p_k = \sum_{i=1}^{k} x_i p_i \).
e.g. Drilling for oil
  Well type    Probability    Pay-off
  Dry          0.5            $0
  Wet          0.4            $400K
  Gusher       0.1            $1500K

10 Expected values of drilling
Let the random variable X be the financial gain = pay-off - drilling cost = pay-off - $200K.
The probability distribution for X (in $K) is
  x           -200   200    1300
  P(X = x)    0.5    0.4    0.1
so the expected value (average) of X is
  E(X) = (-200)(0.5) + (200)(0.4) + (1300)(0.1) = 110, i.e. $110K.
This is directly analogous to the sample mean.
E(X) can be regarded as an idealisation of, or a theoretical value for, the sample mean.
E(X) is often denoted by the Greek letter µ (pronounced "mu").
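
The drilling calculation is a probability-weighted sum, which can be written in R in a couple of lines. The figures are those given on the slides; the variable names are illustrative.

```r
payoff <- c(dry = 0, wet = 400, gusher = 1500)   # pay-off in $K
prob   <- c(dry = 0.5, wet = 0.4, gusher = 0.1)
cost   <- 200                                    # drilling cost in $K

gain <- payoff - cost                # values of X: -200, 200, 1300
expected_gain <- sum(gain * prob)    # E(X) = sum of x * P(X = x)
expected_gain                        # 110, i.e. $110K
```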

11 Variance of a random variable
Recall that variance is a measure of spread. For a sample the variance is \( s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \).
The variance of a r.v. X is \( \sigma^2 = V(X) = E[(X - \mu)^2] \).
\( \sigma^2 \) represents the theoretical limit of the sample variance \( s^2 \) as the sample size n becomes very large.
A simpler formula for var(X) is \( \sigma^2 = V(X) = E(X^2) - (E(X))^2 \).
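
As a check that the two variance formulas agree, the sketch below applies both to the financial gain from the drilling example (values in $K). The slides do not compute this variance; the calculation is added here only as an illustration.

```r
x <- c(-200, 200, 1300)   # financial gain in $K (drilling example)
p <- c(0.5, 0.4, 0.1)

mu <- sum(x * p)                        # E(X) = 110
var_def      <- sum((x - mu)^2 * p)     # E[(X - mu)^2]
var_shortcut <- sum(x^2 * p) - mu^2     # E(X^2) - (E(X))^2

c(definition = var_def, shortcut = var_shortcut)   # both 192900
sqrt(var_def)                           # standard deviation, in $K
```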

12 Population equivalents of sample quantities
  Sample statistic           Population parameter
  mean \( \bar{x} \)         \( \mu = E(X) \)
  variance \( s^2 \)         \( \sigma^2 = V(X) \)

13 Example - E(X) and V(X)
X = # of boys in a family of 5 children, X ~ Bin(5, 0.5).
Then the probability distribution of X is
  x           0      1      2       3       4      5
  P(X = x)    1/32   5/32   10/32   10/32   5/32   1/32
\( \mu = np = 5 \times 0.5 = 2.5 \) and \( \sigma^2 = npq = 5 \times 0.5 \times 0.5 = 1.25 \).
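
The values np and npq can be checked against the definitions of E(X) and V(X) by summing over the probability distribution. A minimal sketch in R, with illustrative variable names:

```r
n <- 5; p <- 0.5; q <- 1 - p
x  <- 0:n
px <- dbinom(x, size = n, prob = p)   # 1/32, 5/32, 10/32, 10/32, 5/32, 1/32

mean_x <- sum(x * px)                 # E(X) from the definition
var_x  <- sum(x^2 * px) - mean_x^2    # V(X) = E(X^2) - (E(X))^2

c(mean = mean_x, np = n * p)          # both 2.5
c(var = var_x, npq = n * p * q)       # both 1.25
```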

14 Transformations of random variables
If X is a r.v., then Y = 3X is also a r.v.
  Values of X      x_1    x_2    ...   x_k
  Probabilities    p_1    p_2    ...   p_k
  Values of Y      3x_1   3x_2   ...   3x_k
In general, Y = f(X) is a r.v.; for a one-to-one function f its p.d.f. is \( f_Y(y) = P(Y = y) = P(X = f^{-1}(y)) = f_X(f^{-1}(y)) \).
If X, Y are r.v.'s then Z = X + Y is also a r.v. If X and Y are independent, the p.d.f. of Z is the convolution \( f_Z(z) = (f_X * f_Y)(z) = \sum_x f_X(x) f_Y(z - x) \).

15 Example - 2 dice are thrown
Let X denote the sum of the results.
Outcomes:
  (1,1) (2,1) (3,1) (4,1) (5,1) (6,1)
  (1,2) (2,2) (3,2) (4,2) (5,2) (6,2)
  (1,3) (2,3) (3,3) (4,3) (5,3) (6,3)
  (1,4) (2,4) (3,4) (4,4) (5,4) (6,4)
  (1,5) (2,5) (3,5) (4,5) (5,5) (6,5)
  (1,6) (2,6) (3,6) (4,6) (5,6) (6,6)
Assume the 36 outcomes are equally likely, so each has probability 1/36.
Possible values of X are 2, 3, ..., 12,
e.g. P(X = 4) = P(1,3) + P(2,2) + P(3,1) = 3/36.
The probability distribution is
  x           2      3      4      ...   10     11     12
  P(X = x)    1/36   2/36   3/36   ...   3/36   2/36   1/36
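
The full distribution of the sum, including the values elided by "...", can be obtained by tabulating the 36 equally likely outcomes. A short R sketch, assuming fair dice as on the slide:

```r
die  <- 1:6
sums <- outer(die, die, "+")   # 6 x 6 grid of totals, one cell per outcome

# Each of the 36 cells has probability 1/36
pmf <- table(sums) / length(sums)
pmf

pmf["4"]                       # P(X = 4) = 3/36
```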

16 More E(X) and V(X)
If Y = aX + b, where X is a r.v. and a and b are known constants, then
  \( E(Y) = a E(X) + b \) and \( V(Y) = a^2 V(X) \) (the additive constant b does not affect the variance).
e.g. X = # boys in 5 children, Y = # girls in 5 children: Y = 5 - X, so a = -1 and b = 5, giving E(Y) = 5 - E(X) and V(Y) = V(X).
Similarly, if T = aX + bY + c, where X and Y are r.v.'s and a, b and c are known constants, then
  \( E(T) = a E(X) + b E(Y) + c \) and \( V(T) = a^2 V(X) + b^2 V(Y) + 2ab \, \mathrm{Cov}(X,Y) \).
In particular, if X and Y are independent then the covariance Cov(X,Y) is zero, so \( V(T) = a^2 V(X) + b^2 V(Y) \).
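
The rules for Y = aX + b can be checked on the boys and girls example, where the number of girls is 5 minus the number of boys. The sketch below assumes X ~ Bin(5, 0.5), as in the earlier family-of-5 example, and computes both sides of each rule exactly.

```r
x  <- 0:5
px <- dbinom(x, size = 5, prob = 0.5)   # X = # boys, X ~ Bin(5, 0.5)

a <- -1; b <- 5                         # Y = # girls = 5 - X = aX + b
y <- a * x + b                          # Y takes value a*x + b with probability P(X = x)

ex <- sum(x * px); vx <- sum(x^2 * px) - ex^2
ey <- sum(y * px); vy <- sum(y^2 * px) - ey^2

c(EY = ey, rule = a * ex + b)           # both 2.5
c(VY = vy, rule = a^2 * vx)             # both 1.25
```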

17 2 dice continued
X = sum of two dice thrown: X = Y + Z, with Y, Z i.i.d. Unif(1:6).
  \( E(Y) = E(Z) = 3.5 \)
  \( V(Y) = V(Z) = E(Y^2) - (E(Y))^2 = 91/6 - 3.5^2 = 35/12 \approx 2.92 \)
  \( E(X) = E(Y) + E(Z) = 7 \)
  \( V(X) = V(Y) + V(Z) = 35/6 \approx 5.83 \)
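
The additivity of means and variances for the two dice can be checked by computing the moments of the total directly from the 36 equally likely outcomes. A minimal R sketch, assuming fair and independent dice:

```r
die <- 1:6

e_y <- mean(die)                     # E(Y) = 3.5 for one fair die
v_y <- mean(die^2) - e_y^2           # V(Y) = E(Y^2) - (E(Y))^2 = 35/12

# All 36 equally likely totals X = Y + Z
totals <- as.vector(outer(die, die, "+"))
e_x <- mean(totals)                  # 7
v_x <- mean(totals^2) - e_x^2        # 35/6

c(E_X = e_x, sum_of_means = 2 * e_y)
c(V_X = v_x, sum_of_variances = 2 * v_y)
```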

18 E(X) and V(X) for Binomial
Let X be Bernoulli, i.e. X ~ Bin(1,p):
  \( E(X) = 1 \cdot p + 0 \cdot (1-p) = p \)
  \( E(X^2) = 1^2 \cdot p + 0^2 \cdot (1-p) = p \)
  \( V(X) = E(X^2) - (E(X))^2 = p - p^2 = p(1-p) = pq \)
Now let X ~ Bin(n,p), so that \( X = X_1 + X_2 + \dots + X_n \) with the \( X_i \) i.i.d. Bernoulli:
  \( E(X) = E(X_1) + E(X_2) + \dots + E(X_n) = np \)
  \( V(X) = V(X_1) + V(X_2) + \dots + V(X_n) = npq \)

19 Difference of r.v.s
A component is made by cutting a piece of metal to length X and then trimming it by amount Y. Both of these processes are somewhat imprecise. The net length is then T = X - Y.
This is of the form T = aX + bY with a = 1 and b = -1, so
  \( E(T) = a E(X) + b E(Y) = 1 \cdot E(X) + (-1) E(Y) = E(X) - E(Y) \)
  \( V(T) = a^2 V(X) + b^2 V(Y) = V(X) + (-1)^2 V(Y) = V(X) + V(Y) \), assuming X and Y are independent,
i.e. var(T) is greater than either var(X) or var(Y), even though T = X - Y, because both X and Y contribute to the variability in T.
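
A quick simulation can illustrate that the variances add even though the lengths subtract. The normal distributions, means and standard deviations below are hypothetical choices made for this sketch; the slides do not specify any distribution for X or Y.

```r
set.seed(1)                            # for reproducibility
n <- 100000

# ASSUMPTION: hypothetical, independent normal cutting and trimming lengths (mm)
x <- rnorm(n, mean = 100, sd = 2)      # cut length X, V(X) = 4
y <- rnorm(n, mean = 5,   sd = 1)      # trimmed amount Y, V(Y) = 1
t_len <- x - y                         # net length T = X - Y

c(mean_T = mean(t_len), EX_minus_EY = 100 - 5)
c(var_T = var(t_len), VX_plus_VY = 4 + 1)   # sample variance should be near 5
```

With a large sample the variance of t_len should come out close to V(X) + V(Y), and larger than either variance alone.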

