# Statistical inference Ian Jolliffe University of Aberdeen CLIPS module 3.4b.


Probability and Statistics

- Probability starts with a population, often described by a probability distribution, and predicts what will happen in a sample from that population.
- Statistics starts with a sample of data, and describes the data informatively, or makes inferences about the population from which the sample was drawn.
- This module concentrates on the inferential, rather than descriptive, side of Statistics.

Probability vs. Statistics: an example

Probability: suppose we know that the maximum temperature at a particular station has a Gaussian distribution with mean 27 °C and standard deviation 3 °C. We can use this (population) probability distribution to make predictions about the maximum temperature on one or more days.
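As a hedged illustration of the probability side, the sketch below uses Python's standard-library `statistics.NormalDist` to turn the stated Gaussian distribution (mean 27 °C, standard deviation 3 °C) into predictions. The 30 °C threshold and the 24 to 30 °C range are hypothetical questions chosen only for this example.

```python
from statistics import NormalDist

# Population distribution from the slide: mean 27 °C, standard deviation 3 °C.
max_temp = NormalDist(mu=27.0, sigma=3.0)

# Hypothetical question 1: chance that the maximum exceeds 30 °C on a given day.
p_above_30 = 1.0 - max_temp.cdf(30.0)

# Hypothetical question 2: chance that the maximum lies between 24 °C and 30 °C,
# i.e. within one standard deviation of the mean.
p_within_one_sd = max_temp.cdf(30.0) - max_temp.cdf(24.0)

print(f"P(T > 30)      = {p_above_30:.3f}")
print(f"P(24 < T < 30) = {p_within_one_sd:.3f}")
```

Here the population is taken as known and the calculation runs from population to sample, which is the direction that defines a probability problem.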

Probability vs. Statistics: an example II

Statistics: for a new station we may wish to estimate the long-term (population) mean maximum temperature, and its standard deviation, based on the small amount of relevant data collected so far.

Three types of statistical inference

- Point estimation: given a sample of relevant data, find an estimate of the (population) mean maximum temperature in June at a station, or an estimate of the (population) probability of precipitation in October at a station.
- Interval estimation: given a sample of relevant data, find a range of values within which we have a high degree of confidence that the mean maximum temperature (or the probability of precipitation) lies.
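A minimal sketch of point estimation for the two quantities mentioned above. The data values are hypothetical, invented only to make the example runnable; the estimators (sample mean, sample standard deviation, sample proportion) are the usual ones.

```python
from statistics import mean, stdev

# Hypothetical sample of June daily maximum temperatures (°C) at a station.
june_max = [24.1, 26.3, 25.0, 27.2, 23.8, 25.9, 26.4, 24.7]

# Hypothetical record of whether precipitation occurred on 10 October days.
october_wet = [True, False, False, True, True, False, True, False, False, True]

mean_estimate = mean(june_max)    # point estimate of the population mean
sd_estimate = stdev(june_max)     # sample standard deviation (n - 1 divisor)
p_wet_estimate = sum(october_wet) / len(october_wet)  # proportion of wet days

print(f"estimated mean maximum temperature: {mean_estimate:.2f} °C")
print(f"estimated standard deviation:       {sd_estimate:.2f} °C")
print(f"estimated P(precipitation):         {p_wet_estimate:.2f}")
```

Each result is a single number with no indication of its precision, which is the gap that interval estimation is meant to fill.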

Three types of statistical inference II

Hypothesis testing: given a sample of relevant data, test one or more specific hypotheses about the population from which the sample is drawn. For example, is the mean maximum temperature in June higher at Station A than at Station B? Is the daily probability of precipitation at Station A in October larger or smaller than 0.5?

Representative samples

- In the previous two slides we have used the phrase "sample of relevant data".
- It is crucial that the sample of data you use to make inferences about a population is representative of that population; otherwise the inferences will be biased.
- Designing the best way of taking a sample is a third aspect of Statistics (as well as description and inference). It is important, but it will not be discussed further in this module.

Estimates and errors

- The need for probability and statistics arises because nearly all measurements are subject to random variation.
- One part of that random variation may be measurement error.
- By taking repeated measurements of the same quantity we can quantify the measurement error:
  - build a probability distribution for it, then
  - use the distribution to find a confidence interval for the true value of the measurement.

Confidence intervals

- A point estimate on its own is of little use.
- For example, if I tell you the probability of precipitation tomorrow is 0.4, you will not know how to react. Your reaction will be different if I then say
  - the probability is 0.4 ± 0.01
  - the probability is 0.4 ± 0.3
- We need some indication of the precision of an estimate, which leads naturally into confidence intervals.

Confidence intervals II

- The two statements on the previous slide were of the form estimate ± error.
- They could also be written as intervals:
  - 0.4 ± 0.01 is the interval (0.39, 0.41)
  - 0.4 ± 0.3 is the interval (0.10, 0.70)
- For these intervals to be confidence intervals we need to associate a level of confidence with them; for example, we might say that we are 95% confident that the interval includes the true probability.

Confidence intervals for what?

- Confidence intervals can be found for any parameter or parameters (a parameter is a quantity describing some aspect of a population or probability distribution).
- Examples include
  - probability of success in a binomial experiment
  - mean of a normal distribution
  - mean of a Poisson distribution

…for what II?

- More parameters
  - Differences between normal means
  - Differences between binomial probabilities
  - Ratios of normal variances
  - Parameters describing Weibull, gamma or lognormal distributions

Calculation of confidence intervals

No algebraic details are given, just the general principles behind most formulae for confidence intervals:

- Find an estimate of the parameter of interest.
- The estimate is a function of the data, so is a random variable, and hence has a probability distribution.
- Use this distribution to construct probability statements about the estimate.
- Manipulate the probability statements to turn them into statements about confidence intervals and their coverage.

Calculation of confidence intervals: an outline example

- Suppose we have $n$ independent observations on a Gaussian random variable (for example maximum daily temperature in June). We write $U_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \ldots, n$, where $\mu$ is the mean of the distribution and $\sigma^2$ is its variance.
- The sample mean, or average, of the $n$ observations is an obvious estimator for $\mu$. We denote this average by $\bar{U}$, and it can be shown that $\bar{U} \sim N(\mu, \sigma^2/n)$.
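The result that the sample mean of $n$ independent $N(\mu, \sigma^2)$ observations has mean $\mu$ and variance $\sigma^2/n$ can be checked informally by simulation. The sketch below is only an illustration: the values of `mu`, `sigma` and `n`, the seed and the number of simulated samples are all arbitrary choices, not taken from the slides.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

mu, sigma, n = 27.0, 3.0, 20   # illustrative population values and sample size
n_samples = 10000              # number of simulated samples (arbitrary)

# Draw many samples of size n and record the average of each one.
sample_means = [
    mean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(n_samples)
]

observed_mean = mean(sample_means)
observed_sd = pstdev(sample_means)
theoretical_sd = sigma / sqrt(n)   # square root of sigma^2 / n

print(f"mean of sample means: {observed_mean:.3f} (theory: {mu})")
print(f"sd of sample means:   {observed_sd:.3f} (theory: {theoretical_sd:.3f})")
```

The simulated averages cluster around `mu` with a spread close to `sigma / sqrt(n)`, much narrower than the spread of individual observations.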

Confidence interval calculations II

- Let $Z = \sqrt{n}(\bar{U} - \mu)/\sigma$. Then $Z \sim N(0,1)$.
- We have tables of probabilities for $N(0,1)$ and can use these to make statements such as
  - $P(-1.96 < Z < 1.96) = 0.95$
  - $P(-2.58 < Z < 2.58) = 0.99$
  - $P(-1.65 < Z < 1.65) = 0.90$
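In place of printed tables, the same probabilities can be looked up with Python's standard-library `statistics.NormalDist`. This is a sketch for checking the three statements above; the slides themselves refer only to tables.

```python
from statistics import NormalDist

std_normal = NormalDist()  # N(0, 1): mean 0, standard deviation 1

# Central probability P(-z < Z < z) for the three tabulated values of z.
central_prob = {
    z: std_normal.cdf(z) - std_normal.cdf(-z)
    for z in (1.65, 1.96, 2.58)
}

# Going the other way: the z value that gives an exact central probability.
exact_z = {
    level: std_normal.inv_cdf(0.5 + level / 2)
    for level in (0.90, 0.95, 0.99)
}

for z, prob in central_prob.items():
    print(f"P(-{z} < Z < {z}) = {prob:.4f}")
for level, z in exact_z.items():
    print(f"central {level:.0%} of N(0,1) lies within +/- {z:.4f}")
```

The tabulated values 1.65, 1.96 and 2.58 are rounded versions of the exact quantiles, so the computed probabilities agree with 0.90, 0.95 and 0.99 only to about two or three decimal places.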

[Figure: probabilities for N(0,1), 90% interval]

[Figure: probabilities for N(0,1), 95% interval]

Confidence interval calculations III

- Substituting the expression above for $Z$, in terms of $\bar{U}$, $\mu$, $\sigma$ and $n$, into the probability statements, and manipulating those statements gives
  - $P(\bar{U} - 1.96\sigma/\sqrt{n} < \mu < \bar{U} + 1.96\sigma/\sqrt{n}) = 0.95$
  - $P(\bar{U} - 2.58\sigma/\sqrt{n} < \mu < \bar{U} + 2.58\sigma/\sqrt{n}) = 0.99$
  - $P(\bar{U} - 1.65\sigma/\sqrt{n} < \mu < \bar{U} + 1.65\sigma/\sqrt{n}) = 0.90$
- The intervals defined by these 3 expressions are 95%, 99%, 90% confidence intervals respectively for $\mu$.
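The manipulation is not spelled out on the slide. A sketch of the intermediate steps for the 95% case, writing $Z = \sqrt{n}(\bar{U} - \mu)/\sigma$, is given below; the 99% and 90% cases follow in the same way with 2.58 and 1.65 in place of 1.96.

```latex
\begin{align*}
0.95 &= P\left(-1.96 < \frac{\sqrt{n}\,(\bar{U} - \mu)}{\sigma} < 1.96\right) \\
     &= P\left(-\frac{1.96\,\sigma}{\sqrt{n}} < \bar{U} - \mu < \frac{1.96\,\sigma}{\sqrt{n}}\right)
        && \text{multiply through by } \sigma/\sqrt{n} > 0 \\
     &= P\left(-\frac{1.96\,\sigma}{\sqrt{n}} < \mu - \bar{U} < \frac{1.96\,\sigma}{\sqrt{n}}\right)
        && \text{multiply by } -1 \text{ (the interval is symmetric)} \\
     &= P\left(\bar{U} - \frac{1.96\,\sigma}{\sqrt{n}} < \mu < \bar{U} + \frac{1.96\,\sigma}{\sqrt{n}}\right)
        && \text{add } \bar{U} \text{ throughout}
\end{align*}
```

Throughout, the random quantity is $\bar{U}$, not $\mu$: the last line is a statement about the random end points of the interval.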

Confidence interval for $\mu$: example

Measurements are taken of maximum temperature at a station for a sample of 20 June days. Find a confidence interval for $\mu$, the (population) mean of maximum daily temperatures in June at that station. The data give $\bar{U} = 25.625$, and we assume that $\sigma = 1.5$. Substituting these values in the expressions on the previous slide gives

- 90% interval: (25.07, 26.18)
- 95% interval: (24.97, 26.28)
- 99% interval: (24.76, 26.49)
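The three intervals can be reproduced with a few lines of Python. This sketch uses only the figures quoted above (sample mean 25.625, $\sigma = 1.5$, $n = 20$) and the tabulated multipliers 1.65, 1.96 and 2.58; the variable names are illustrative.

```python
from math import sqrt

u_bar, sigma, n = 25.625, 1.5, 20     # values quoted in the example
standard_error = sigma / sqrt(n)      # standard deviation of the sample mean

# Tabulated N(0,1) multipliers for each confidence level.
critical_values = {0.90: 1.65, 0.95: 1.96, 0.99: 2.58}

intervals = {
    level: (u_bar - z * standard_error, u_bar + z * standard_error)
    for level, z in critical_values.items()
}

for level, (lower, upper) in intervals.items():
    print(f"{level:.0%} interval: ({lower:.2f}, {upper:.2f})")
```

Rounded to two decimal places, the output matches the three intervals listed in the example.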

Comments on confidence interval example

- Note how we pay for a greater degree of confidence with a wider interval.
- We have deliberately chosen the simplest possible example of a confidence interval. Intervals for other parameters (see Slides 11, 12) have similar constructions, but most are a bit more complicated in detail.
- One immediate complication is that we rarely know $\sigma$; it is replaced by an estimate, $s$, the sample standard deviation. This leads to an interval which is based on a so-called t-distribution rather than $N(0,1)$, and which is usually wider.
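To illustrate the effect of replacing $\sigma$ by $s$, the sketch below compares the 95% interval based on $N(0,1)$ with one based on the t-distribution with $n - 1 = 19$ degrees of freedom. The slides do not give the sample standard deviation, so the sketch assumes, purely for illustration, that $s$ happens to equal 1.5. The multiplier 2.093 is the tabulated 97.5% point of the t-distribution with 19 degrees of freedom.

```python
from math import sqrt

u_bar, n = 25.625, 20
s = 1.5            # ASSUMPTION: hypothetical sample standard deviation

z_crit = 1.96      # 97.5% point of N(0,1), used when sigma is known
t_crit = 2.093     # 97.5% point of t with 19 degrees of freedom (from tables)

z_margin = z_crit * s / sqrt(n)
t_margin = t_crit * s / sqrt(n)

z_interval = (u_bar - z_margin, u_bar + z_margin)
t_interval = (u_bar - t_margin, u_bar + t_margin)

print(f"N(0,1) interval: ({z_interval[0]:.2f}, {z_interval[1]:.2f})")
print(f"t interval:      ({t_interval[0]:.2f}, {t_interval[1]:.2f})")
```

With the same standard deviation the t-based interval is wider, reflecting the extra uncertainty from estimating $\sigma$. In practice $s$ differs from $\sigma$, so a particular t interval can occasionally be narrower, which is why the slide says "usually".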

Confidence interval example: more comments

As well as assuming $\sigma$ known, the interval has also assumed normality (it needs to be checked whether this assumption is OK), and independence of observations, which won't be true if the data are recorded on consecutive days in the same month.

Interpretation of confidence intervals

- Interpretation is subtle/tricky and often found difficult or misunderstood.
- A confidence interval is not a probability statement about the chance of a (random) parameter falling in a (fixed) interval.
- It is a statement about the chance that a random interval covers a fixed, but unknown, parameter.
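The "random interval, fixed parameter" reading can be made concrete by simulation. In the sketch below the true mean is fixed (and known only because we are simulating), many samples are drawn, and a 95% interval is computed from each. The values of `mu`, `sigma`, `n`, the seed and the number of repeats are illustrative assumptions.

```python
import random
from math import sqrt

random.seed(1)  # fixed seed so the sketch is repeatable

mu, sigma, n = 25.0, 1.5, 20   # fixed "true" mean, known sigma, sample size
n_repeats = 10000              # number of simulated samples (arbitrary)
half_width = 1.96 * sigma / sqrt(n)

covered = 0
for _ in range(n_repeats):
    # A new sample gives a new sample mean, and so a new (random) interval.
    sample_mean = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    if sample_mean - half_width < mu < sample_mean + half_width:
        covered += 1

coverage = covered / n_repeats
print(f"proportion of intervals covering mu: {coverage:.3f}")
```

The parameter never moves; what varies from sample to sample is the interval, and about 95% of the intervals generated this way cover the fixed value.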

Hypothesis testing: introduction

- For any parameter(s) where a confidence interval can be constructed, it may also be of interest to test a hypothesis.
- For simplicity, we develop the ideas of hypothesis testing for the same scenario as our confidence interval example, namely inference for $\mu$, a single Gaussian mean, when $\sigma$ is known, though hypothesis testing is often more relevant when comparing two or more parameters.

Hypothesis testing: example

- Suppose that the instrument or exposure for measuring temperature has changed. Over a long period with the old instrument/exposure, the mean value of daily June maximum temperature was 25 °C, with standard deviation 1.5 °C.
- The data set described on Slide 19 comprises 20 daily measurements with the new instrument/exposure. Is there any evidence of a change in mean?

The steps in hypothesis testing

1. Formulate a null hypothesis. The null hypothesis is often denoted as $H_0$. In our example we have $H_0: \mu = 25$.
2. Define a test statistic, a quantity which can be computed from the data, and which will tend to take different values when $H_0$ is true/false. $\bar{U}$ is an obvious estimator of $\mu$, which should vary as $\mu$ varies, and so is suitable as a test statistic. $Z$ (see Slide 15) is equivalent to $\bar{U}$, and is more convenient, as its distribution is tabulated. Thus we use $Z$ as our test statistic.

Hypothesis testing steps II

3. Calculate the value of $Z$ for the data. Here $Z = \sqrt{n}(\bar{U} - \mu)/\sigma$, and $Z \sim N(0,1)$ when $H_0$ is true. Assuming $\sigma = 1.5$, and the null value $\mu = 25$, we have $Z = \sqrt{20}\,(25.625 - 25)/1.5 = 1.86$.
4. Calculate the probability of obtaining a value of the test statistic at least as extreme as that observed, if the null hypothesis is true. This probability is called a p-value. Here we calculate $P(Z > 1.86)$ if we believe before seeing the data that the change in mean temperature could only be an increase, or $P(Z > 1.86) + P(Z < -1.86)$ if we believe the change could be in either direction.

Hypothesis testing steps III

- The p-values are calculated from tables of the cumulative distribution of $N(0,1)$. From such tables, $P(Z > 1.86) = 0.03$, and because of the symmetry of $N(0,1)$ about zero, $P(Z < -1.86) = 0.03$, so the two-sided p-value is 0.06.
- Whether we calculate a one- or two-sided p-value depends on the context. If we felt certain, before collecting the data, that the change in instrument/exposure could only increase mean temperature, a one-sided p-value is appropriate. Otherwise we should use the two-sided (two-tailed) value.
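The test statistic and both p-values can be computed directly, as in the sketch below. It uses the sample mean 25.625 from the confidence interval example, the null value 25, $\sigma = 1.5$ and $n = 20$, with `statistics.NormalDist` standing in for the printed tables.

```python
from math import sqrt
from statistics import NormalDist

u_bar, mu_0, sigma, n = 25.625, 25.0, 1.5, 20

# Test statistic: standardised distance of the sample mean from the null value.
z = sqrt(n) * (u_bar - mu_0) / sigma

std_normal = NormalDist()                 # N(0, 1)
p_one_sided = 1.0 - std_normal.cdf(z)     # P(Z > z): increase only
p_two_sided = 2.0 * p_one_sided           # either direction, by symmetry

print(f"Z = {z:.2f}")
print(f"one-sided p-value = {p_one_sided:.2f}")
print(f"two-sided p-value = {p_two_sided:.2f}")
```

To two decimal places this gives $Z = 1.86$ and p-values of 0.03 and 0.06, in agreement with the tabulated figures.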

Confidence intervals and hypothesis testing

- There is often, though not always, an equivalence between confidence intervals and hypothesis testing.
- A null hypothesis is rejected at the 5% (1%) level if and only if the null value of the parameter lies outside a 95% (99%) confidence interval.
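For the Gaussian-mean example with known $\sigma$, this equivalence can be checked numerically. The sketch below is an illustration for the two-sided test only; it uses the figures from the example (sample mean 25.625, null value 25, $\sigma = 1.5$, $n = 20$) and exact normal quantiles rather than the rounded table values.

```python
from math import sqrt
from statistics import NormalDist

u_bar, mu_0, sigma, n = 25.625, 25.0, 1.5, 20
standard_error = sigma / sqrt(n)
std_normal = NormalDist()

# Two-sided p-value for H0: mu = mu_0.
z = abs(u_bar - mu_0) / standard_error
p_two_sided = 2.0 * (1.0 - std_normal.cdf(z))

results = {}
for alpha in (0.05, 0.01):
    # Confidence interval at level 1 - alpha.
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    lower = u_bar - z_crit * standard_error
    upper = u_bar + z_crit * standard_error

    outside_interval = not (lower < mu_0 < upper)
    rejected = p_two_sided < alpha
    results[alpha] = (outside_interval, rejected)
    print(f"alpha={alpha}: null value outside interval={outside_interval}, "
          f"rejected={rejected}")
```

In this example the null value 25 lies inside both the 95% and 99% intervals, and correspondingly the two-sided test does not reject at either the 5% or the 1% level.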

Hypothesis testing: p-values

- We haven't yet said what to do with our calculated p-value.
- A small p-value casts doubt on the plausibility of the null hypothesis, $H_0$, but …
- It is NOT equal to $P(H_0 \mid \text{data})$.
- How small is small? Frequently a single threshold is set, often 0.05, 0.01 or 0.10 (5%, 1%, 10%), and $H_0$ is rejected when the p-value falls below the threshold (rejected at the 5%, 1%, 10% significance level).
- It is more informative to quote a p-value than to conduct the test at a single threshold level.
