Presentation on theme: "Chapter 5: Sampling Distributions, Other Distributions, Continuous Random Variables http://www.socialresearchmethods.net/kb/sampstat.php."— Presentation transcript:
1 Chapter 5: Sampling Distributions, Other Distributions, Continuous Random Variables
2 5.2: Binomial and Poisson Distributions - Goals Determine when the random variable (count) X can be modeled using the binomial or Poisson distributions. Calculate the probability, mean and standard deviation when X has a binomial or Poisson distribution. Determine when you can use the normal approximation to the binomial and perform calculations using this approximation.
3 Binomial Setting - BINS Binary: There are only two possible outcomes for each trial. Independent: Trials must be independent of each other. Number: The number of trials n of the chance process must be fixed. Success: On each trial, the probability p of success must be the same.
4 Binomial Setting: Example Do the following use the Binomial Setting? Rolling a fair 4-sided die five times and observing whether the number showing is a 1 or not. In a drug trial, 20 patients with the same condition are given a drug and some are given a placebo to see if the drug is effective or not. In quality control we want to see if a particular product is ‘not acceptable’. We take 20 random samples from an assembly line that uses different machines to produce the product.
5 Binomial Distribution The count X of successes in a binomial setting has the binomial distribution with parameters n and p, where n is the number of trials of the chance process and p is the probability of a success on any one trial. The possible values of X are the whole numbers from 0 to n. X ~ B(n, p)
6 Examples of Binomial Distribution In a clinical trial, a patient’s condition may improve or not. We study the number of patients who improved. Was a sales transaction considered pleasant? The binomial distribution describes the number of pleasant transactions. In quality control we assess the number of defective items in a lot of goods.
7 Binomial Probabilities If X has the binomial distribution with n trials and probability p of success on each trial, the possible values of X are 0, 1, 2, …, n. If k is any one of these values, $P(X = k) = \binom{n}{k} p^{k} (1-p)^{n-k}$, where $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$.
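The formula above translates directly into code. The following is a minimal Python sketch, not part of the original slides; the function name `binomial_pmf` is a hypothetical choice made for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p). The name binomial_pmf is illustrative only."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Sanity check: the probabilities over k = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(k, 10, 0.25) for k in range(11))
print(total)
```

`math.comb` computes the binomial coefficient exactly with integers, which avoids overflow from evaluating the factorials separately.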
8 Example: Binomial Distribution Suppose 20% of all copies of a particular textbook fail a certain binding strength test. Let's check a batch of 15 such textbooks. Is this a binomial distribution? What is the chance that we get no defective textbooks? What is the chance that we get less than 3 defective textbooks? What is the chance that we get more than 2 defective textbooks?
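One way to work this example numerically is sketched below. It assumes the 15 copies behave like independent trials that each fail with probability 0.20, so that X ~ B(15, 0.2); the variable names are illustrative.

```python
from math import comb

n, p = 15, 0.20  # batch size and failure probability from the slide


def pmf(k: int) -> float:
    """P(X = k) for X ~ B(n, p) with the n and p defined above."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_none = pmf(0)                                 # P(X = 0)
p_less_than_3 = sum(pmf(k) for k in range(3))   # P(X < 3) = P(X <= 2)
p_more_than_2 = 1 - p_less_than_3               # P(X > 2), complement rule

print(f"P(X = 0) = {p_none:.4f}")
print(f"P(X < 3) = {p_less_than_3:.4f}")
print(f"P(X > 2) = {p_more_than_2:.4f}")
```

Under these assumptions the three probabilities come out to roughly 0.035, 0.398 and 0.602. The last two are complements because "more than 2" is the same event as "not less than 3".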
10 Histograms of Binomial Distributions [Figure: histograms of binomial distributions with n = 10 and p = 0.5, p = 0.25, p = 0.75]
11 Binomial Distribution: Mean and Standard Deviation If X ~ B(n, p) then $E(X) = \mu_X = np$ and $\sigma_X = \sqrt{np(1-p)}$.
12 Example: Binomial Distribution (cont) Suppose 20% of all copies of a particular textbook fail a certain binding strength test. Let's check a batch of 15 such textbooks. What are the mean and standard deviation of the number of textbooks that will fail the binding test?
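A short sketch of this calculation, assuming X ~ B(15, 0.2) as in the earlier textbook example and using the mean and standard deviation formulas for the binomial distribution:

```python
from math import sqrt

n, p = 15, 0.20  # values from the slide

mean = n * p                  # E(X) = np
sd = sqrt(n * p * (1 - p))    # sigma_X = sqrt(np(1 - p))

print(f"mean = {mean:.2f}, sd = {sd:.3f}")
```

This gives a mean of 3 failing textbooks with a standard deviation of about 1.55.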
13 Difficulties with the Normal Approximation to the Binomial Skewness of the Binomial Distribution. The Binomial Distribution is discrete.
15 Continuity Correction – Extra Actual Value ≈ Approximate Value: P(X = a) ≈ P(a – 0.5 < X < a + 0.5); P(a < X) ≈ P(a + 0.5 < X); P(a ≤ X) ≈ P(a – 0.5 < X); P(X < b) ≈ P(X < b – 0.5); P(X ≤ b) ≈ P(X < b + 0.5)
16 Example: Normal Approximation to the Binomial The ideal size of a first-year class at a particular college is 150 students. The college, knowing from past experience that on the average only 30 percent of those accepted for admission will actually attend, uses a policy of approving the applications of 450 students. Compute the probability that more than 150 students attend this college.
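The sketch below works this example with the normal approximation and a continuity correction, and compares the result with the exact binomial sum. It assumes the 450 accepted students decide independently, each attending with probability 0.30; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 450, 0.30               # accepted students and attendance probability
mu = n * p                     # mean of X ~ B(n, p)
sigma = sqrt(n * p * (1 - p))  # standard deviation of X

# "More than 150" means X >= 151, so the continuity-corrected cutoff is 150.5.
approx = 1 - NormalDist(mu, sigma).cdf(150.5)

# Exact binomial tail probability, for comparison.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(151, n + 1))

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The approximation comes out near 0.055. Comparing it with the exact sum is a quick way to see how much the skewness and discreteness of the binomial matter for a given n and p.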
17 Poisson Distribution The number of times that an event occurs during a particular time period or in a particular area. Example: The number of people who enter the Union from noon to 1 pm. The number of α-particles emitted from Uranium-238 in 1 minute. The number of DNA fragments found from a sequencing experiment. The number of dead trees in a square mile of forest.
18 Poisson Setting The numbers of successes that occur in any two nonoverlapping units of measure are independent. The probability that a success will occur in a unit of measure is the same for all units of equal size and is proportional to the size of the unit. The probability that more than one event occurs in a unit of measure is negligible for very small-sized units.
20 Example: Poisson Distribution An IT consultant receives an average of 3 calls per hour. Let X be the number of calls the consultant receives. X follows a Poisson distribution. a) What is the chance that the consultant receives exactly one call during the next hour? b) What is the chance that the consultant receives more than one call during the next hour? c) What is the chance that the consultant receives exactly 5 calls during the next two hours?
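A minimal sketch of the three parts follows. It uses the standard Poisson probability formula P(X = k) = e^(−λ) λ^k / k!, which does not appear in the surviving slides, and assumes the rate scales with the length of the interval (3 calls per hour gives λ = 6 for two hours). The function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


rate_per_hour = 3  # average calls per hour, from the slide

# a) exactly one call in the next hour
p_exactly_one = poisson_pmf(1, rate_per_hour)

# b) more than one call in the next hour: 1 - P(X = 0) - P(X = 1)
p_more_than_one = 1 - poisson_pmf(0, rate_per_hour) - poisson_pmf(1, rate_per_hour)

# c) exactly five calls in the next two hours: the mean doubles to 6
p_five_in_two_hours = poisson_pmf(5, 2 * rate_per_hour)

print(f"a) {p_exactly_one:.4f}  b) {p_more_than_one:.4f}  c) {p_five_in_two_hours:.4f}")
```

Under these assumptions the answers are roughly 0.149, 0.801 and 0.161.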
21 Example: Poisson Approximation to Binomial 0.2% of feral cats are infected with feline AIDS (FIV) in a region. What is the chance that there are exactly 10 cats infected with FIV among 1000 cats?
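As a sketch of this approximation, the count of infected cats can be modeled as X ~ B(1000, 0.002) and approximated by a Poisson distribution with mean λ = np = 2. The code below assumes the cats are infected independently and compares the approximation with the exact binomial value; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, k = 1000, 0.002, 10  # cats sampled, infection rate, target count
lam = n * p                # Poisson mean used for the approximation

poisson_approx = exp(-lam) * lam**k / factorial(k)
binomial_exact = comb(n, k) * p**k * (1 - p) ** (n - k)

print(f"Poisson approximation: {poisson_approx:.3e}")
print(f"exact binomial:        {binomial_exact:.3e}")
```

The Poisson value is about 3.8 × 10⁻⁵, so seeing exactly 10 infected cats would be very unusual when only 2 are expected.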
22 5.3: Continuous Random Variables Uniform and Exponential Distributions - Goals Describe the probability distribution of a continuous random variable. Use the distribution of a continuous random variable to calculate probabilities and percentiles (median) of events. Be able to use a probability distribution to find the mean of a continuous random variable. Be able to use a probability distribution to find the variance of a continuous random variable. Calculate the probability, mean and standard deviation when X has a Uniform or Exponential distribution.
23 Continuous Random Variable A continuous random variable X takes all values in an interval of numbers or collection of such intervals. [Figure: density curve y = f(x)]
24 Continuous Random Variable A continuous probability model assigns probabilities as areas under a density curve.
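25 Density Curves – Percentiles The value y is the 100p-th percentile when $p = \int_{-\infty}^{y} f(x)\,dx$. The median of a density curve is the equal-areas point $\tilde{\mu}$: $p = 0.5 = \int_{-\infty}^{\tilde{\mu}} f(x)\,dx$.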
26 Example: Continuous Random Variable The distribution of the grade of a particular road in a particular 2 mile region is a continuous r.v. X with density $f(x) = \begin{cases} x & 0 \le x \le 2 \\ 0 & \text{else} \end{cases}$ Is this a valid density curve? What is the probability that the grade is in the last quarter mile of the region? What is the median of this distribution?
27 Example: Continuous Random Variable We know that the distribution of the grade of a particular road in a particular 2 mile region is a continuous r.v. X with a functional form which is proportional to $x^2$. What is f(x)?
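A sketch of how the constant can be found: write f(x) = c·x² and choose c so that the total area under the density is 1. The code assumes the density is nonzero only for 0 ≤ x ≤ 2, matching the 2 mile region; that support is an assumption, since the slide does not state it.

```python
from fractions import Fraction

lower, upper = 0, 2  # assumed support of the density (the 2 mile region)

# Integral of x^2 from lower to upper is (upper^3 - lower^3) / 3.
integral = Fraction(upper**3 - lower**3, 3)

# The density must integrate to 1, so c * integral = 1.
c = 1 / integral

# Numerical check with a midpoint Riemann sum of c * x^2 over the support.
steps = 100_000
dx = (upper - lower) / steps
area = sum(float(c) * (lower + (i + 0.5) * dx) ** 2 * dx for i in range(steps))

print(f"c = {c}, area under f = {area:.6f}")
```

Under that assumption c = 3/8, so f(x) = (3/8)x² for 0 ≤ x ≤ 2 and 0 elsewhere.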
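28 Formulas for the Mean of a Random Variable Discrete – Mean: $E(X) = \mu_X = \sum_i x_i p_i$. Discrete – Rule 3: $E(g(X)) = \sum_i g(x_i)\, p_i$. Continuous – Mean: $E(X) = \mu_X = \int_{-\infty}^{\infty} x f(x)\,dx$. Continuous – Rule 3: $E(g(X)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$.
29 Variance of a Random Variable $\mathrm{Var}(X) = E[(X - \mu_X)^2] = \sum_i (x_i - \mu_X)^2\, p_i$ (discrete) $= \int_{-\infty}^{\infty} (x - \mu_X)^2 f(x)\,dx$ (continuous) $= E(X^2) - (E(X))^2$, and $\sigma_X = \sqrt{\mathrm{Var}(X)}$.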
30 Example: Continuous Random Variables For the following density function: What is the expected value? Calculate $E(X^2)$. Calculate the standard deviation.
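32 Uniform Distribution The density function of the uniform distribution over the interval [a, b] is $f(x) = \begin{cases} \frac{1}{b-a} & a < x < b \\ 0 & \text{else} \end{cases}$ with $E(X) = \frac{a+b}{2}$ and $\sigma_X = \frac{b-a}{\sqrt{12}}$.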
33 Example: Uniform A packaging line constantly packages 200 cartons per hour. After weighing every package, the distribution of the weights was found to be uniform, with weights ranging from 18.2 lbs. to 20.4 lbs., measured to the nearest tenth. The customer requires less than 20.0 lbs. for ergonomic reasons. What is the probability that a package weighs less than 20 lbs.? What are the mean and the standard deviation of the package weights?
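A minimal sketch of this example, treating the weight as a continuous Uniform(18.2, 20.4) random variable and ignoring the rounding to the nearest tenth (an assumption made for simplicity):

```python
from math import sqrt

a, b = 18.2, 20.4   # lower and upper limits of the weights, in lbs
limit = 20.0        # customer's ergonomic limit, in lbs

# For a uniform density, probability is the fraction of the interval covered.
p_under_limit = (limit - a) / (b - a)

mean = (a + b) / 2          # E(X) = (a + b) / 2
sd = (b - a) / sqrt(12)     # sigma_X = (b - a) / sqrt(12)

print(f"P(X < 20) = {p_under_limit:.4f}, mean = {mean:.1f}, sd = {sd:.4f}")
```

About 82% of packages fall under the limit; the mean weight is 19.3 lbs. with a standard deviation of about 0.635 lbs.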
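34 Exponential Distribution Uses: amount of time until some specific event occurs (the amount of time between successive events). $f(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & \text{else} \end{cases}$ with $E(X) = \frac{1}{\lambda}$ and $\sigma_X = \frac{1}{\lambda}$.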
35 Example: Exponential The life span of some bacteria (in hours) has an exponential distribution with an average life span of 0.5 hours. What is the proportion of bacteria that live at most 1 hour? What is the proportion of bacteria that live more than 1.5 hours? What is the standard deviation of the distribution of these bacteria?
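This example can be sketched as follows. Since the mean of an exponential distribution is 1/λ, an average life span of 0.5 hours gives λ = 2 per hour. Integrating the density gives P(X ≤ x) = 1 − e^(−λx), a standard result that is used here although it is not written out on the slides.

```python
from math import exp

mean_life = 0.5        # average life span in hours, from the slide
lam = 1 / mean_life    # rate parameter, since E(X) = 1 / lambda

p_at_most_1 = 1 - exp(-lam * 1.0)     # P(X <= 1)
p_more_than_1_5 = exp(-lam * 1.5)     # P(X > 1.5) = 1 - P(X <= 1.5)
sd = 1 / lam                          # sigma_X = 1 / lambda

print(f"P(X <= 1) = {p_at_most_1:.4f}")
print(f"P(X > 1.5) = {p_more_than_1_5:.4f}")
print(f"sd = {sd} hours")
```

Roughly 86.5% of the bacteria live at most 1 hour, about 5% live more than 1.5 hours, and the standard deviation equals the mean, 0.5 hours.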
36 Gamma Distribution Generalization of the exponential distribution. Uses: probability theory, theoretical statistics, actuarial science, operations research, engineering.
37 Beta Distribution This distribution is only defined on an interval; the standard beta is on the interval [0, 1]. Uses: modeling proportions, percentages, probabilities. The uniform distribution is a member of this family.
38 Other Continuous Random Variables Weibull: the exponential is a member of this family; uses: lifetimes. Lognormal: the log of the variable has a normal distribution; uses: products of distributions. Cauchy: symmetrical, long straggly tails.
39 5.1: Sampling Distribution of a Sample Mean - Goals Explain the difference between the sampling distribution of x̄ and the population distribution. Determine the mean and standard deviation of x̄ for an SRS of size n from a population with mean μ and standard deviation σ. Use the central limit theorem (CLT) to approximate the shape of the sampling distribution of x̄ and use it to perform probability calculations.
40 Statistical Inference Parameter: number describing a characteristic of the population. Statistic: number describing a characteristic of the sample. [Figure: population and sample]
41 Sampling Distributions The law of large numbers assures us that if we measure enough subjects, the statistic x̄ will eventually get very close to the unknown parameter µ. The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population. The population distribution of a variable is the distribution of values of the variable among all individuals in the population.
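42 Spread as a function of n For an SRS of size n from a population with mean μ and standard deviation σ, the sample mean x̄ has mean μ and standard deviation σ/√n. Therefore, sample means are less variable than individual observations.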
43 Example: mean and SD of Sampling Distribution The time that it takes a randomly selected rat of a certain subspecies to find its way through a maze has a normal distribution with μ = 1.5 min and σ = 0.35 min. Suppose five rats are randomly selected. What is the mean of the average time? What is the standard deviation of the average time?
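A short sketch of this calculation, using the facts that the sample mean of an SRS has mean μ and standard deviation σ/√n; the variable names are illustrative.

```python
from math import sqrt

mu, sigma = 1.5, 0.35   # population mean and SD in minutes, from the slide
n = 5                   # number of rats sampled

mean_of_average = mu                 # the sample mean is centered at mu
sd_of_average = sigma / sqrt(n)      # spread shrinks by a factor of sqrt(n)

print(f"mean = {mean_of_average} min, sd = {sd_of_average:.4f} min")
```

The average time of five rats has mean 1.5 min and standard deviation of about 0.157 min, less than half the 0.35 min spread of a single rat.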
44 Shape of Sampling Distributions If a population X ~ $N(\mu, \sigma)$ then the sampling distribution of X̄ ~ $N(\mu, \sigma/\sqrt{n})$. Draw an SRS of size n from any population with mean μ and finite standard deviation σ. When n is large, the sampling distribution of the sample mean X̄ is approximately normal, $N(\mu, \sigma/\sqrt{n})$.
45 Example – Sampling Distribution: Normal The time that it takes a randomly selected rat of a certain subspecies to find its way through a maze has a normal distribution with μ = 1.5 min and σ = 0.35 min. Suppose five rats are randomly selected. What is the probability that the average time is at most 2.0 minutes? What is the probability that the average time will be within 0.3 minutes of the mean?
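Because the population is normal, the average time of five rats is exactly normal with mean 1.5 and standard deviation 0.35/√5. The sketch below uses Python's `statistics.NormalDist` to evaluate both probabilities; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.5, 0.35, 5   # population mean, population SD, sample size

# Sampling distribution of the average time of n rats.
xbar = NormalDist(mu, sigma / sqrt(n))

p_at_most_2 = xbar.cdf(2.0)                           # P(X-bar <= 2.0)
p_within_0_3 = xbar.cdf(mu + 0.3) - xbar.cdf(mu - 0.3)  # P(1.2 <= X-bar <= 1.8)

print(f"P(average <= 2.0) = {p_at_most_2:.4f}")
print(f"P(within 0.3 of the mean) = {p_within_0_3:.4f}")
```

The first probability is about 0.999 and the second about 0.945.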
46 Shape of Sampling Distributions If a population X ~ $N(\mu, \sigma)$ then the sampling distribution of X̄ ~ $N(\mu, \sigma/\sqrt{n})$. Draw an SRS of size n from any population with mean μ and finite standard deviation σ. When n is large, the sampling distribution of the sample mean X̄ is approximately normal, $N(\mu, \sigma/\sqrt{n})$.
47 A Few More Facts Any linear combination of independent Normal random variables is also Normal. More generally, the distribution of a sum or average of many small random quantities is close to Normal, even when the quantities are not independent, as long as they are not too highly correlated. CLT also applies to discrete random variables.
49 CLT: Example 1 (in class) An electronics company manufactures resistors that have a mean resistance of 100 ohms and a standard deviation of 10 ohms. Assume that the distribution of resistance is normal. a) Find the probability that one resistor will have a resistance less than 95 ohms. (0.3085) b) Find the probability that a random sample of 25 resistors will have an average resistance less than 95 ohms. (0.0062)
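The two answers quoted on the slide can be reproduced with a short sketch. It models one resistor as N(100, 10) and the average of 25 resistors as N(100, 10/√25); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 100, 10   # mean and SD of resistance in ohms, from the slide
n = 25                # sample size for part b

# a) one resistor below 95 ohms
p_single = NormalDist(mu, sigma).cdf(95)

# b) average of 25 resistors below 95 ohms
p_average = NormalDist(mu, sigma / sqrt(n)).cdf(95)

print(f"a) {p_single:.4f}  b) {p_average:.4f}")
```

The same cutoff of 95 ohms is only half a standard deviation below the mean for a single resistor but 2.5 standard deviations below it for the average of 25, which is why the second probability is so much smaller.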
50 CLT: Example 2 (in class) Without checking the city bus web site, a student walks at random times to the Beering Hall bus stop to wait for the Ross Ade bus which is supposed to arrive every 10 minutes. This will be a Uniform distribution with 0 ≤ x ≤ 10. For a Uniform distribution on the interval (a, b), μ = (a + b)/2 and σ = (b − a)/√12. a) If one student walks to the bus stop to catch this bus, what is the probability that the wait time will be more than 6 minutes? (0.4) b) If 40 students walk to the bus stop to catch this bus, what is the probability that the average wait time will be more than 6 minutes? (0.0143)
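A sketch of both parts follows. Part a uses the uniform distribution directly; part b relies on the CLT, treating the average of 40 independent wait times as approximately normal. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

a, b = 0, 10    # wait time is Uniform on (0, 10) minutes
n = 40          # number of students in part b
cutoff = 6      # minutes

# a) one student: fraction of the interval above the cutoff
p_single = (b - cutoff) / (b - a)

# b) average of 40 students: CLT normal approximation
mu = (a + b) / 2              # mean of the uniform distribution
sigma = (b - a) / sqrt(12)    # SD of the uniform distribution
p_average = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(cutoff)

print(f"a) {p_single:.4f}  b) {p_average:.4f}")
```

The result for part b is about 0.0142, which matches the slide's 0.0143 up to the rounding of z to two decimal places in a normal table.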