 # Inferential Statistics

## Presentation on theme: "Inferential Statistics"— Presentation transcript:

Inferential Statistics
Jin Guo

Inferential Statistics
Definition: the branch of statistics concerned with drawing conclusions about a population from a sample.
Sample: representative, typically random.
Main functions:
Estimating population parameters
Testing statistically based hypotheses

Estimating Population Parameters
Estimating parameters related to central tendency (the mean), variability (the standard deviation), and proportion (P).
Example: Estimating a Population Mean
Consider the means of an infinite number of random samples drawn from a normal distribution. These sample means form a sampling distribution.
Mean: when the estimator is unbiased, the mean of the sampling distribution equals the parameter we are trying to estimate (the mean of the population).
Standard deviation: the standard deviation of the sampling distribution is the standard error of the mean.
In probability theory, the central limit theorem (CLT) states that, given certain conditions, the mean of a sufficiently large number of independent random variables, each with a well-defined mean and well-defined variance, will be approximately normally distributed.
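The behaviour of the sample mean can be checked with a small simulation. The sketch below is only an illustration under assumed settings that are not from the slides: it draws from a Uniform(0, 1) population, uses a sample size of 30 and 5,000 repetitions, and fixes the random seed so the run is repeatable.

```python
import random
import statistics

def sample_means(n, num_samples, rng):
    """Draw num_samples samples of size n from Uniform(0, 1) and return their means."""
    return [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(num_samples)
    ]

rng = random.Random(0)   # fixed seed (assumption) so the run is repeatable
n = 30                   # assumed sample size
means = sample_means(n, 5000, rng)

population_mean = 0.5                        # mean of Uniform(0, 1)
population_sd = (1 / 12) ** 0.5              # standard deviation of Uniform(0, 1)
standard_error = population_sd / n ** 0.5    # theoretical SD of the sample mean

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)
print(f"mean of sample means: {mean_of_means:.4f} (population mean {population_mean})")
print(f"SD of sample means:   {sd_of_means:.4f} (standard error {standard_error:.4f})")
```

The two printed pairs should be close to each other: the sample means centre on the population mean, and their spread is close to the standard error of the mean.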

Estimating Population Parameters
Point and Interval Estimate
Point estimate: a single value of a statistic used to estimate the population parameter.
Interval estimate: two numbers between which a population parameter is said to lie.
A confidence interval gives an estimated range of values that is likely to include an unknown population parameter, the estimated range being calculated from a given set of sample data.
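As a rough illustration of the difference, the sketch below computes a point estimate and a normal-approximation confidence interval for a proportion. The function name and the survey counts (73 "yes" answers out of 100) are hypothetical, and the normal approximation is one common choice among several interval methods.

```python
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (point_estimate, (low, high)) using the normal approximation."""
    p_hat = successes / n                                  # point estimate
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)    # about 1.96 for 95%
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5     # z* times the standard error
    return p_hat, (p_hat - margin, p_hat + margin)

# Hypothetical survey: 73 of 100 respondents answer "yes".
point, (low, high) = proportion_confidence_interval(73, 100)
print(f"point estimate = {point:.2f}, 95% interval = ({low:.3f}, {high:.3f})")
```

With these assumed counts the point estimate is 0.73 and the interval runs from roughly 0.64 to 0.82.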

Testing hypotheses Example:
Assume there are 1,000,000 online students in this course. I claim that 80 percent of them are very satisfied with today’s class.

Testing hypotheses Statistical hypotheses
Null hypothesis: the sample observations result purely from chance.
Alternative hypothesis: the sample observations are influenced by some non-random cause.
Outcome: reject the null hypothesis or fail to reject the null hypothesis.
Decision Errors
Type I error: rejecting a null hypothesis that is true. The probability of committing this error is called the significance level.
Type II error: failing to reject a null hypothesis that is false. The probability of not committing this error is called the power of the test.
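The two error probabilities can be made concrete for a test of a proportion. The sketch below is a minimal illustration under assumptions that are not part of the slide: a two-tailed z-test of a null proportion of 0.80, a sample of 100, and an alternative true proportion of 0.70. It sums exact binomial probabilities over every outcome that would lead to rejection.

```python
from math import comb, sqrt
from statistics import NormalDist

def rejection_probability(true_p, p0=0.80, n=100, alpha=0.05):
    """Probability that a two-tailed z-test of H0: P = p0 rejects
    when the true proportion is true_p (exact binomial sum)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(p0 * (1 - p0) / n)
    total = 0.0
    for k in range(n + 1):                 # every possible number of successes
        z = (k / n - p0) / se
        if abs(z) > z_crit:                # this outcome leads to rejection
            total += comb(n, k) * true_p ** k * (1 - true_p) ** (n - k)
    return total

type_i_rate = rejection_probability(true_p=0.80)   # H0 is true
power = rejection_probability(true_p=0.70)         # H0 is false (assumed alternative)
type_ii_rate = 1 - power
print(f"Type I rate = {type_i_rate:.3f}, power = {power:.3f}, Type II rate = {type_ii_rate:.3f}")
```

Because the number of successes is discrete, the computed Type I rate need not equal the nominal 0.05 exactly. The power depends on which alternative is assumed to be true.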

Testing hypotheses Decision Rules:
P-value: the probability of observing a test statistic at least as extreme as the observed statistic S, assuming the null hypothesis is true. If the P-value is less than the significance level, we reject the null hypothesis.
Region of acceptance: a range of values of the test statistic, defined so that the chance of making a Type I error is equal to the significance level. If the test statistic falls within the region of acceptance, the null hypothesis is not rejected.
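For a z statistic, the two decision rules lead to the same conclusion. The sketch below checks this for a two-tailed test. The helper names and the four test statistics are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def two_tailed_p_value(z):
    """Probability of a z statistic at least as extreme as z, in either direction."""
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))

def region_of_acceptance(alpha):
    """Interval of z values for which the null hypothesis is not rejected."""
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    return -z_crit, z_crit

alpha = 0.05
low, high = region_of_acceptance(alpha)
decisions = []
for z in (0.5, 1.9, 2.3, -3.0):            # hypothetical test statistics
    reject_by_p_value = two_tailed_p_value(z) < alpha
    reject_by_region = not (low <= z <= high)
    decisions.append((z, reject_by_p_value, reject_by_region))
    print(z, round(two_tailed_p_value(z), 4), reject_by_p_value, reject_by_region)
```

In each row the last two columns agree, which is why either rule can be used.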

Testing hypotheses One-Tailed and Two-Tailed Tests
One-Tailed Test
A test of a statistical hypothesis in which the region of rejection is on only one side of the sampling distribution. For example, suppose the null hypothesis states that the mean is less than or equal to 10. The alternative hypothesis would be that the mean is greater than 10. The region of rejection would consist of a range of numbers located on the right side of the sampling distribution; that is, a set of numbers greater than 10.
Two-Tailed Test
A test of a statistical hypothesis in which the region of rejection is on both sides of the sampling distribution. For example, suppose the null hypothesis states that the mean is equal to 10. The alternative hypothesis would be that the mean is less than 10 or greater than 10. The region of rejection would consist of a range of numbers located on both sides of the sampling distribution; that is, partly of numbers less than 10 and partly of numbers greater than 10.
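To see where the regions of rejection actually begin, the sketch below computes cutoffs for the sample mean in the "mean of 10" example. The known population standard deviation (2), the sample size (25), and the 0.05 significance level are assumptions added for illustration, and the function name is hypothetical.

```python
from statistics import NormalDist

def rejection_cutoffs(mu0, sigma, n, alpha=0.05):
    """Cutoffs for the sample mean under a z-test of a hypothesized mean mu0."""
    standard_error = sigma / n ** 0.5
    std_normal = NormalDist()
    # One-tailed (right): all of alpha sits in the upper tail.
    upper_only = mu0 + std_normal.inv_cdf(1 - alpha) * standard_error
    # Two-tailed: alpha is split evenly between the two tails.
    half_width = std_normal.inv_cdf(1 - alpha / 2) * standard_error
    return upper_only, (mu0 - half_width, mu0 + half_width)

# Assumed values: sigma = 2 and n = 25 are not from the slides.
one_tailed_cutoff, (lower_cutoff, upper_cutoff) = rejection_cutoffs(mu0=10, sigma=2, n=25)
print(f"one-tailed: reject when the sample mean exceeds {one_tailed_cutoff:.3f}")
print(f"two-tailed: reject when the sample mean is below {lower_cutoff:.3f} or above {upper_cutoff:.3f}")
```

Under these assumptions the one-tailed region starts a little above 10, not at 10 itself, and the two-tailed region needs a sample mean further from 10 on either side because the significance level is shared between two tails.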

Testing hypotheses Procedure:
State the hypotheses: state a null hypothesis and an alternative hypothesis that are mutually exclusive.
Formulate an analysis plan: specify the significance level and the test method. The test method includes a test statistic (mean score, proportion, difference between means, difference between proportions, z-score, t-score, chi-square, etc.) and a sampling distribution.
Analyze sample data: calculate the test statistic and the P-value.
Interpret the results: compare the P-value to the significance level, and reject the null hypothesis when the P-value is less than the significance level.

Testing hypotheses Test methods :
One-sample tests: a sample is compared to the population from a hypothesis.
Two-sample tests: two samples are compared, typically experimental and control samples from a scientifically controlled experiment.
Paired tests: two samples are compared where members are paired between samples, so the difference between the members becomes the sample.
Chi-squared tests use the same calculations and the same probability distribution for different applications:
Chi-squared tests for variance are used to determine whether a normal population has a specified variance. The null hypothesis is that it does.
Chi-squared tests of independence are used for deciding whether two variables are associated or are independent. The variables are categorical rather than numeric. For example, such a test can be used to decide whether left-handedness is correlated with libertarian politics (or not). The null hypothesis is that the variables are independent. The numbers used in the calculation are the observed and expected frequencies of occurrence (from contingency tables).
Chi-squared goodness of fit tests are used to determine the adequacy of curves fit to data. The null hypothesis is that the curve fit is adequate. It is common to determine curve shapes to minimize the mean square error, so it is appropriate that the goodness-of-fit calculation sums the squared errors.
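The observed-versus-expected calculation behind a chi-squared test of independence can be sketched for a 2x2 contingency table. The counts below are invented for illustration, the function name is hypothetical, and no continuity correction is applied. For one degree of freedom the P-value is taken from the normal distribution, since a chi-squared variable with one degree of freedom is the square of a standard normal variable.

```python
from statistics import NormalDist

def chi_squared_independence_2x2(table):
    """Return (statistic, p_value) for a 2x2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    # One degree of freedom: upper-tail probability via the standard normal.
    p_value = 2 * (1 - NormalDist().cdf(statistic ** 0.5))
    return statistic, p_value

# Hypothetical counts: rows are left-handed / right-handed,
# columns are two political categories.
observed = [[12, 38],
            [48, 102]]
statistic, p_value = chi_squared_independence_2x2(observed)
print(f"chi-squared = {statistic:.3f}, P-value = {p_value:.3f}")
```

With these made-up counts the P-value is well above 0.05, so the null hypothesis of independence would not be rejected.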

Testing hypotheses Test methods :
Z-tests: comparing means under stringent conditions regarding normality and a known standard deviation.
T-tests: comparing means under relaxed conditions (less is assumed).
F-tests (analysis of variance, ANOVA): comparing two variances. They are commonly used when deciding whether groupings of data by category are meaningful.
Chi-squared tests use the same calculations and the same probability distribution for different applications: chi-squared tests for variance, chi-squared tests of independence, and chi-squared goodness of fit tests.

Testing hypotheses

| Purpose | Test method |
| --- | --- |
| Means | One-sample t-test |
| Difference between means | Two-sample t-test |
| Proportions | One-sample z-test |
| Difference between proportions | Two-proportion z-test |
| Regression slope | Linear regression t-test |
| Difference between matched pairs | Matched-pairs t-test |
| Difference between variances | Two-sample F-test |
| Goodness of fit | Chi-square goodness of fit test |
| Homogeneity | Chi-square test for homogeneity |
| Independence | Chi-square test for independence |

One-sample tests are appropriate when a sample is being compared to the population from a hypothesis. The population characteristics are known from theory or are calculated from the population.
Two-sample tests are appropriate for comparing two samples, typically experimental and control samples from a scientifically controlled experiment.
Paired tests are appropriate for comparing two samples where it is impossible to control important variables. Rather than comparing two sets, members are paired between samples so the difference between the members becomes the sample. Typically the mean of the differences is then compared to zero.
Z-tests are appropriate for comparing means under stringent conditions regarding normality and a known standard deviation.
T-tests are appropriate for comparing means under relaxed conditions (less is assumed).
Tests of proportions are analogous to tests of means (the 50% proportion).
F-tests (analysis of variance, ANOVA) are commonly used when deciding whether groupings of data by category are meaningful. If the variance of test scores of the left-handed in a class is much smaller than the variance of the whole class, then it may be useful to study lefties as a group. The null hypothesis is that two variances are the same, so the proposed grouping is not meaningful.
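The matched-pairs idea, where the differences become the sample and their mean is compared to zero, can be sketched as follows. The scores are invented for illustration and the function name is hypothetical. The critical value 2.365 is the usual t-table entry for a two-tailed 0.05 test with 7 degrees of freedom.

```python
from statistics import fmean, stdev

def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for the mean of the paired differences vs. zero."""
    diffs = [a - b for b, a in zip(before, after)]   # the differences become the sample
    n = len(diffs)
    standard_error = stdev(diffs) / n ** 0.5
    return fmean(diffs) / standard_error, n - 1

# Hypothetical scores for the same eight students before and after a lesson.
before = [72, 65, 80, 58, 90, 77, 68, 74]
after = [75, 70, 79, 64, 93, 80, 70, 79]

t_stat, df = paired_t_statistic(before, after)
T_CRIT_DF7 = 2.365   # two-tailed 0.05 critical value for 7 degrees of freedom (t-table)
reject_null = abs(t_stat) > T_CRIT_DF7
print(f"t = {t_stat:.2f}, df = {df}, reject H0: {reject_null}")
```

For these made-up scores the t statistic is above the critical value, so the null hypothesis of a zero mean difference would be rejected at the 0.05 level.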

Back to our example
Assume there are 1,000,000 online students in this course. I claim that 80 percent of them are very satisfied with today’s class. To test this claim, I survey 100 students, using simple random sampling. Among the sampled students, 73 percent say they are very satisfied. Based on these findings, can we reject the hypothesis that 80% of the students are very satisfied? Use a 0.05 level of significance.

Solution State a null hypothesis and an alternative hypothesis.
Null hypothesis: P = 0.80
Alternative hypothesis: P ≠ 0.80
Formulate an analysis plan:
Significance level: 0.05
Test method: one-sample z-test (for testing proportions).

Solution Conditions for the test method:
The sampling method is simple random sampling.
Each sample point can result in just two possible outcomes. We call one of these outcomes a success and the other a failure.
The sample includes at least 10 successes and 10 failures. (Some texts say that 5 successes and 5 failures are enough.)
The population size is at least 10 times as big as the sample size.
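The numeric conditions can be checked directly for the survey in this example (73 successes among 100 sampled students, from a population of 1,000,000). The sketch below uses a hypothetical helper. The first two conditions, simple random sampling and two-outcome responses, are features of the study design and cannot be verified from the counts.

```python
def z_test_conditions_met(successes, n, population_size, min_count=10):
    """Check the count-based conditions for a one-sample z-test of a proportion."""
    failures = n - successes
    return {
        "enough_successes": successes >= min_count,
        "enough_failures": failures >= min_count,
        "population_large_enough": population_size >= 10 * n,
    }

# 73 percent of 100 sampled students are very satisfied.
checks = z_test_conditions_met(successes=73, n=100, population_size=1_000_000)
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Passing `min_count=5` applies the looser rule that some texts use.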

Z-test Test statistic
A z-score (standard score) indicates how many standard deviations an element is from the mean. It can be calculated from the formula
$$z = \frac{X - \mu}{\sigma}$$
where z is the z-score, X is the value of the element, μ is the population mean, and σ is the standard deviation.
Interpret z-scores: a z-score is a value of the normal random variable of a standard normal distribution.
A z-score equal to 0 represents an element equal to the mean.
A z-score less than 0 represents an element less than the mean.
A z-score greater than 0 represents an element greater than the mean.
A z-score equal to 1 represents an element that is 1 standard deviation greater than the mean; a z-score equal to 2, 2 standard deviations greater than the mean; etc.

Solution Analyze sample data:
Using sample data, we calculate the standard deviation of the sampling distribution (σ) and compute the z-score test statistic (z).
$$\sigma = \sqrt{\frac{P(1 - P)}{n}} = \sqrt{\frac{0.8 \times 0.2}{100}} = 0.04$$
$$z = \frac{p - P}{\sigma} = \frac{0.73 - 0.80}{0.04} = -1.75$$
where P is the hypothesized value of the population proportion in the null hypothesis, p is the sample proportion, and n is the sample size.

Solution Analyze sample data and interpret results:
P-value: since this is a two-tailed test, the P-value is the probability that the z-score is less than -1.75 or greater than 1.75. We use the Normal Distribution Calculator to find P(z < -1.75) = 0.04 and P(z > 1.75) = 0.04. Thus, the P-value = 0.04 + 0.04 = 0.08.
Interpret results: since the P-value (0.08) is greater than the significance level (0.05), we cannot reject the null hypothesis.
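The whole solution can be reproduced in a few lines. The sketch below is one possible way to do it: it uses the standard library's normal distribution in place of the Normal Distribution Calculator, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_proportion_z_test(p_hat, p0, n, alpha=0.05):
    """Two-tailed one-sample z-test for a proportion.
    Returns (z, p_value, reject_null)."""
    sigma = sqrt(p0 * (1 - p0) / n)             # SD of the sampling distribution under H0
    z = (p_hat - p0) / sigma                    # test statistic
    p_value = 2 * NormalDist().cdf(-abs(z))     # both tails
    return z, p_value, p_value < alpha

z, p_value, reject_null = one_sample_proportion_z_test(p_hat=0.73, p0=0.80, n=100)
print(f"z = {z:.2f}, P-value = {p_value:.3f}, reject H0: {reject_null}")
```

The computed z is -1.75 and the P-value is about 0.08, so at the 0.05 level the null hypothesis is not rejected, matching the hand calculation.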