CHAPTER
Discrete Models
● General distributions
● Classical: Binomial, Poisson, etc.
Continuous Models
● General distributions
● Classical: Normal, etc.
~ The Normal Distribution ~ (a.k.a. “The Bell Curve”)
Johann Carl Friedrich Gauss
X ~ N(μ, σ), with mean μ and standard deviation σ
● Symmetric, unimodal
● Models many (but not all) natural systems
● Mathematical properties make it useful to work with
SPECIAL CASE: Standard Normal Distribution Z ~ N(0, 1)
Mean 0, standard deviation 1, Total Area = 1 under the curve.
The cumulative distribution function (cdf) is denoted by Φ(z) = P(Z ≤ z). It is tabulated, and computable in R via the command pnorm.
Standard Normal Distribution Z ~ N(0, 1)
Example: Find P(Z ≤ 1.2), the area under the curve to the left of the “z-score” 1.2 (Total Area = 1).
Use the included table.
Lecture Notes Appendix…
Standard Normal Distribution Z ~ N(0, 1)
Example (continued): Find P(Z ≤ 1.2) for the “z-score” 1.2, using the included table or R:
> pnorm(1.2)
[1] 0.8849303
So P(Z ≤ 1.2) ≈ 0.8849, and since Total Area = 1, P(Z > 1.2) = 1 − 0.8849 = 0.1151.
Note: Because this is a continuous distribution, P(Z = 1.2) = 0, so there is no difference between P(Z > 1.2) and P(Z ≥ 1.2), etc.
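The upper-tail area can be obtained in two equivalent ways. The following is a minimal sketch in base R; the variable names are illustrative and do not come from the slides.

```r
# Lower-tail area: P(Z <= 1.2) for the standard normal
p_lower <- pnorm(1.2)

# Upper-tail area by the complement rule: P(Z > 1.2) = 1 - P(Z <= 1.2)
p_upper_complement <- 1 - p_lower

# The same upper-tail area, requested directly from pnorm
p_upper_direct <- pnorm(1.2, lower.tail = FALSE)

print(c(p_lower, p_upper_complement, p_upper_direct))
```

The lower.tail = FALSE form is usually preferable far out in the tail, where subtracting from 1 loses precision.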
Why be concerned about the Standard Normal Distribution Z ~ N(0, 1), when most “bell curves” don’t have mean = 0 and standard deviation = 1?
Because any normal distribution X ~ N(μ, σ) can be transformed to the standard normal distribution via a simple change of variable: Z = (X − μ) / σ.
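The change of variable can be checked numerically. This is a small sketch in base R; the values of mu, sigma, and x are made-up illustrations, not data from the slides.

```r
# Assumed illustrative parameters (not from the slides)
mu <- 10
sigma <- 2
x <- c(6, 8, 10, 12, 14)

# Change of variable: z = (x - mu) / sigma
z <- (x - mu) / sigma

# Left-tail areas computed on the X scale and on the Z scale
area_x <- pnorm(x, mean = mu, sd = sigma)
area_z <- pnorm(z)

# The two sets of areas agree, which is why one table for Z suffices
print(all.equal(area_x, area_z))
```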
Example (POPULATION, Year 2010)
Random Variable X = Age at first birth, X ~ N(25.4, 1.5), i.e., μ = 25.4 and σ = 1.5.
Question: What proportion of the population had their first child before the age of 27.2 years old? That is, P(X < 27.2) = ?
Example (continued): X = Age at first birth, Year 2010, X ~ N(25.4, 1.5). To find P(X < 27.2), the x-score = 27.2 must first be transformed to a corresponding z-score:
z = (x − μ) / σ = (27.2 − 25.4) / 1.5 = 1.2
Example (continued): X = Age at first birth, Year 2010, X ~ N(25.4, 1.5). Since x = 27.2 corresponds to z = 1.2,
P(X < 27.2) = P(Z < 1.2) ≈ 0.8849
so about 88.5% of the population had their first child before the age of 27.2.
Using R:
> pnorm(27.2, 25.4, 1.5)
[1] 0.8849303
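The two routes to this answer (standardize first, or pass μ and σ to pnorm) can be compared side by side. A minimal sketch in base R, using the numbers from the example; the variable names are illustrative.

```r
# Parameters from the example: age at first birth, X ~ N(25.4, 1.5)
mu <- 25.4
sigma <- 1.5
x <- 27.2

# Route 1: convert the x-score to a z-score, then use the standard normal cdf
z <- (x - mu) / sigma
p_via_z <- pnorm(z)

# Route 2: let pnorm do the standardization internally
p_direct <- pnorm(x, mean = mu, sd = sigma)

print(c(z = z, p_via_z = p_via_z, p_direct = p_direct))
```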
Standard Normal Distribution Z ~ N(0, 1)
What symmetric interval about the mean 0 contains 95% of the population values? That is, find the critical values -z.025 and +z.025 with area .95 between them and area .025 in each tail.
Use the included table.
Standard Normal Distribution Z ~ N(0, 1)
What symmetric interval about the mean 0 contains 95% of the population values? The included table gives the “.025 critical values” -z.025 = -1.96 and +z.025 = +1.96.
Use R:
> qnorm(.025)
[1] -1.959964
> qnorm(.975)
[1] 1.959964
For Z ~ N(0, 1), the “.025 critical values” -z.025 and +z.025 bound the symmetric interval about the mean 0 that contains 95% of the population values. The same question can be asked on the original scale:
What symmetric interval about the mean age of 25.4 yrs contains 95% of the population values, when X ~ N(μ, σ) = N(25.4, 1.5)?
> areas = c(.025,.975)
> qnorm(areas, 25.4, 1.5)
[1] 22.46005 28.33995
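The interval on the age scale is just the standard normal critical value scaled by σ and shifted by μ. A short sketch in base R under that reading; the variable names are illustrative.

```r
# Parameters from the example: X ~ N(25.4, 1.5)
mu <- 25.4
sigma <- 1.5

# Standard normal critical value with area .025 in the upper tail
z <- qnorm(.975)

# Interval built by hand: mu - z*sigma to mu + z*sigma
interval_manual <- c(mu - z * sigma, mu + z * sigma)

# Same interval straight from qnorm on the X scale
interval_qnorm <- qnorm(c(.025, .975), mean = mu, sd = sigma)

# Check: the area between the endpoints should be .95
coverage <- diff(pnorm(interval_qnorm, mean = mu, sd = sigma))

print(interval_manual)
print(interval_qnorm)
print(coverage)
```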
Similarly…
Standard Normal Distribution Z ~ N(0, 1)
What symmetric interval about the mean 0 contains 90% of the population values? That is, find the critical values -z.05 and +z.05 with area .90 between them and area .05 in each tail.
Use the included table.
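The cumulative area 0.95 falls midway between the table entries 0.9495 (at z = 1.64) and 0.9505 (at z = 1.65), so average 1.64 and 1.65 to get z.05 ≈ 1.645.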
Standard Normal Distribution Z ~ N(0, 1)
What symmetric interval about the mean 0 contains 90% of the population values? The included table gives the “.05 critical values” -z.05 = -1.645 and +z.05 = +1.645.
Use R:
> qnorm(.05)
[1] -1.644854
> qnorm(.95)
[1] 1.644854
In general…
Standard Normal Distribution Z ~ N(0, 1)
What symmetric interval about the mean 0 contains 100(1 − α)% of the population values? The interval from −z_{α/2} to +z_{α/2}, the “α/2 critical values”: the area between them is 1 − α, with area α/2 in each tail.
For example, α = .05 gives the “.025 critical values”, and α = .10 gives the “.05 critical values”.
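The general rule translates into a one-line function. This is a sketch only: critical_value is a hypothetical helper name introduced here for illustration, not something defined in the slides or in base R.

```r
# Hypothetical helper: upper critical value +z_{alpha/2} for Z ~ N(0, 1),
# i.e. the point with area alpha/2 to its right
critical_value <- function(alpha) {
  qnorm(1 - alpha / 2)
}

# Common choices of alpha: 90%, 95%, and 99% central intervals
alphas <- c(0.10, 0.05, 0.01)
z_crit <- critical_value(alphas)

# Area between -z and +z should recover 1 - alpha
central_area <- pnorm(z_crit) - pnorm(-z_crit)

print(data.frame(alpha = alphas, z = round(z_crit, 3), area = central_area))
```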
Suppose a certain outcome exists in a population, with constant probability π.
P(Success) = π, P(Failure) = 1 − π
We will select a random sample of n individuals, so that the binary “Success vs. Failure” outcome of any individual is independent of the binary outcome of any other individual, i.e., n Bernoulli trials (e.g., coin tosses).
Discrete random variable X = # Successes in sample (0, 1, 2, 3, …, n)
Then X is said to follow a Binomial distribution, written X ~ Bin(n, π), with “probability function”
f(x) = C(n, x) π^x (1 − π)^(n − x), x = 0, 1, 2, …, n,
where C(n, x) = n! / [x! (n − x)!] is the number of ways to choose which x of the n trials are Successes.
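The binomial probability function can be evaluated directly from its formula and compared with R's built-in dbinom. A minimal sketch in base R; the choice n = 100 and probability 0.2 is an assumption made to match the dbinom call used elsewhere in these slides.

```r
# Assumed illustrative parameters: n trials, success probability p
n <- 100
p <- 0.2
x <- 0:n

# Probability function from the formula: C(n, x) * p^x * (1 - p)^(n - x)
f_manual <- choose(n, x) * p^x * (1 - p)^(n - x)

# Built-in binomial probability function
f_builtin <- dbinom(x, size = n, prob = p)

# The two agree, and the probabilities over x = 0, ..., n sum to 1
print(all.equal(f_manual, f_builtin))
print(sum(f_manual))
```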
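Example: For X ~ Bin(100, .2), the probability P(X = 10) is the Area of the single bar at x = 10, approximately 0.0034.
> dbinom(10, 100,.2)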
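Example: For X ~ Bin(100, .2), the cumulative probability P(X ≤ 10) is the total Area of the bars at x = 0, 1, …, 10, approximately 0.0057.
> pbinom(10, 100,.2)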
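The relationship between the two commands is that pbinom accumulates the dbinom values up to and including x. A short sketch in base R; the variable names are illustrative.

```r
# P(X = 10) for X ~ Bin(100, 0.2): a single bar
p_exactly_10 <- dbinom(10, size = 100, prob = 0.2)

# P(X <= 10): cumulative probability from the built-in cdf
p_at_most_10 <- pbinom(10, size = 100, prob = 0.2)

# The same cumulative probability, summing the bars at x = 0, 1, ..., 10
p_summed <- sum(dbinom(0:10, size = 100, prob = 0.2))

print(c(p_exactly_10, p_at_most_10, p_summed))
```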
Therefore, if X ~ Bin(n, π) with nπ ≥ 15 and n(1 − π) ≥ 15, then X is approximately normally distributed:
X ≈ N(nπ, √(nπ(1 − π)))
That is, the “Sampling Distribution” of the sample proportion X/n is approximately N(π, √(π(1 − π)/n)).
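The quality of the normal approximation can be checked against the exact binomial cdf. This is a sketch under stated assumptions: n = 100 and probability 0.2 are illustrative values that satisfy both conditions, the cutoff 25 is arbitrary, and the half-unit continuity correction is a standard refinement that the slides do not mention.

```r
# Assumed illustrative parameters; both conditions hold (20 >= 15, 80 >= 15)
n <- 100
p <- 0.2
stopifnot(n * p >= 15, n * (1 - p) >= 15)

# Mean and standard deviation of the approximating normal distribution
mu <- n * p
sigma <- sqrt(n * p * (1 - p))

# Exact binomial probability P(X <= 25)
p_exact <- pbinom(25, size = n, prob = p)

# Normal approximation, with a half-unit continuity correction (an addition
# here, not from the slides) to account for X being discrete
p_approx <- pnorm(25 + 0.5, mean = mu, sd = sigma)

print(c(exact = p_exact, approx = p_approx, abs_error = abs(p_exact - p_approx)))
```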
● Normal distribution
● Log-Normal ~ X is not normally distributed (e.g., skewed), but Y = “logarithm of X” is normally distributed
● Student’s t-distribution ~ Similar to normal distribution, more flexible
● F-distribution ~ Used when comparing multiple group means
● Chi-squared distribution ~ Used extensively in categorical data analysis
● Others for specialized applications ~ Gamma, Beta, Weibull…
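The log-normal entry in the list above can be illustrated with the same normal cdf used throughout. A minimal sketch in base R; the parameter values are made up for illustration, and meanlog and sdlog describe Y = log(X), not X itself.

```r
# Assumed illustrative parameters for Y = log(X) ~ N(meanlog, sdlog)
meanlog <- 1
sdlog <- 0.5
x <- c(1, 2, 5, 10)

# P(X <= x) from the built-in log-normal cdf
p_lognormal <- plnorm(x, meanlog = meanlog, sdlog = sdlog)

# The same probabilities from the normal cdf applied to log(x)
p_via_log <- pnorm(log(x), mean = meanlog, sd = sdlog)

print(all.equal(p_lognormal, p_via_log))
```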