Lecture 5 Outline: Thu, Sept 18 Announcement: No office hours on Tuesday, Sept. 23rd after class. Extra office hour: Tuesday, Sept. 23rd from 12-1 p.m.


Outline: Chapter (additional material on sampling units), 2.1.2, 2.2 –Sampling frame and sampling units –Paired t-test Sampling distribution of sample average t-ratio and t-test Confidence intervals

Notes on Box Plots Dotted lines extend to the largest (and smallest) points in data that are within 1.5 IQRs of the third (first) quartile. All other points are marked by dots. The red bracket on the side of the box plot shows the shortest half of data (shortest interval containing half the data). The shortest half is at the center for symmetric distributions, but off-center for non-symmetric ones.
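The "shortest half" can be found by sorting the data and sliding a window over it. The sketch below is only an illustration: the data sets are made up, and taking "half" to mean ceil(n/2) points is an assumption (JMP may use a slightly different convention).

```python
import math

def shortest_half(values):
    """Return (low, high) of the shortest interval containing half the data."""
    xs = sorted(values)
    n = len(xs)
    h = math.ceil(n / 2)  # assumed definition of "half": ceil(n/2) points
    # Slide a window of h consecutive sorted points and keep the narrowest one.
    start = min(range(n - h + 1), key=lambda i: xs[i + h - 1] - xs[i])
    return xs[start], xs[start + h - 1]

# Hypothetical data: one symmetric set, one right-skewed set.
symmetric = [0, 3, 5, 6, 7, 8, 9, 11, 14]
skewed = [1, 2, 3, 4, 5, 10, 20, 40, 80]

print(shortest_half(symmetric))  # centered on the median, 7
print(shortest_half(skewed))     # pushed toward the low end
```

For the symmetric set the bracket straddles the median; for the skewed set it sits where the points are densest.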

Simple Random Sample A simple random sample (of size n) is a subset of a population obtained by a procedure giving all sets of n distinct items in the population an equal chance of being chosen. Need a frame: a numbered list of all subjects. Simple random sample: Generate random number for each subject. Choose subjects with n smallest numbers. Simple random sample in JMP: –Click on Tables, Subset, then put the number n in the box “Sampling Rate or Sample Size.”
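The "random number for each subject, keep the n smallest" recipe can be sketched outside JMP as follows. The frame of 100 numbered subjects and the function name are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw a simple random sample of size n from a frame (list of subjects)."""
    rng = random.Random(seed)
    # Attach an independent random number to every subject in the frame.
    keyed = [(rng.random(), subject) for subject in frame]
    # Keep the subjects holding the n smallest random numbers.
    keyed.sort(key=lambda pair: pair[0])
    return [subject for _, subject in keyed[:n]]

frame = list(range(1, 101))  # hypothetical frame: subjects numbered 1..100
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

Because the random numbers are independent and identically distributed, every set of n distinct subjects is equally likely to end up with the n smallest values.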

Sampling units In conducting a random sample, it is important that we are randomly sampling the units of interest. Otherwise we may create a selection bias. Sampling families –If we want the mean number of children per family, we should either sample by family, or sample by person but downweight kids from large families. –Suppose we want to know the mean level of radiation in a community and have available a frame of housing lots in the community. We need to use variable probability sampling, giving a larger probability of being sampled to larger lots.
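A small made-up population shows why sampling by person needs downweighting. The family sizes below are hypothetical, and the code enumerates every child rather than drawing a sample, so the numbers are the long-run averages each scheme targets.

```python
# Hypothetical population: number of children in each of five families.
family_sizes = [1, 1, 2, 3, 5]

# Target quantity: mean number of children per family.
mean_by_family = sum(family_sizes) / len(family_sizes)

# Sampling by person: each child reports the size of his or her family,
# so a family with k children appears k times.
reports = [size for size in family_sizes for _ in range(size)]
naive_mean = sum(reports) / len(reports)

# Downweight each child by 1 / (family size) to undo the over-representation.
weights = [1 / size for size in reports]
weighted_mean = sum(w * r for w, r in zip(weights, reports)) / sum(weights)

print(mean_by_family, naive_mean, weighted_mean)
```

The unweighted per-child average overstates family size because large families are counted once per child; the 1/size weights recover the per-family mean.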

The clinician’s illusion For several diseases such as schizophrenia, alcoholism and opiate addiction, clinicians think that the long-term prognosis is much worse than researchers do. Part of the disagreement may arise from differences in the populations they sample –Clinicians: “Prevalence” sample – sample from the population currently suffering from the disease, which contains a disproportionate number of people who have had the disease for a long time –Researchers: “Incidence” sample – sample from the population of people who have ever contracted the disease. –Reference: P. Cohen, J. Cohen, Archives of General Psychiatry, 1984.
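A back-of-the-envelope calculation shows how strongly a prevalence sample can over-represent long-lasting cases. All numbers are hypothetical, and the calculation assumes a steady state in which the chance of being ill at a given moment is proportional to how long the illness lasts.

```python
# Hypothetical disease: 90% of new cases last 1 year, 10% last 20 years.
share_short, years_short = 0.9, 1
share_long, years_long = 0.1, 20

# Incidence sample (everyone who ever contracted the disease):
incidence_share_long = share_long

# Prevalence sample (people ill right now): weight each case by its duration.
prevalence_share_long = (share_long * years_long) / (
    share_short * years_short + share_long * years_long
)

print(incidence_share_long, round(prevalence_share_long, 2))
```

Under these made-up numbers, long-term cases are a small minority of everyone who gets the disease but a clear majority of the patients a clinician sees at any one time.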

Case Study Broad Question: Are any physiological indicators associated with schizophrenia? Early studies suggested certain areas of the brain may be different in persons with schizophrenia than in others, but confounding factors clouded the issue. Specific Question: Is the left hippocampus region of the brain smaller in people with schizophrenia? Research design: Sample pairs of monozygotic twins, where one of the twins was schizophrenic and the other was not. Comparing monozygotic twins controls for genetic and socioeconomic differences.

Case Study Cont. The mean difference (unaffected-affected) in volume of left hippocampus region between 15 pairs is 0.199 cm$^3$. Is this larger than could be explained by “chance”? Probability (chance) model: Random sampling (fictitious) from a single population. Scope of inference –Goal is to make inference about the population mean, but inference to a larger population is questionable because we did not take a random sample. –No causal inference can be made. In fact researchers had no theories about whether abnormalities preceded the disease or resulted from it.

Probability Model Goal is to compare two groups (affecteds and unaffecteds) but we have taken a paired sample. We can think of having one population (pairs of twins) and looking at the mean of one variable, the difference in hippocampus volumes in each pair. Probability model: Simple random sample with replacement from population. For a large population, this is essentially equivalent to a simple random sample without replacement.

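Parameters and Statistics Population parameters ($\mu$, $\sigma^2$) –$\mu$ = population mean –$\sigma^2$ = population variance = average size of $(Y - \mu)^2$ in population Hypotheses: $H_0: \mu = 0$ vs. $H_a: \mu \neq 0$ Sample statistics ($\bar{Y}$, $s^2$) –Sample: $Y_1, \ldots, Y_n$ –$\bar{Y}$ = sample mean –$s^2$ = sample variance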

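Sampling distribution of sample mean See Displays 2.3 and 2.4 Standard deviation of $\bar{Y}$: $SD(\bar{Y}) = \sigma/\sqrt{n}$ Standard error of $\bar{Y}$: –$SE(\bar{Y}) = s/\sqrt{n}$ –Estimated standard deviation of the sampling distribution of $\bar{Y}$ –For schizophrenia study, $n = 15$, so $SE(\bar{Y}) = s/\sqrt{15}$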
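The rule that the standard deviation of the sample mean equals the population SD divided by the square root of n can be checked exactly on a tiny example by listing every possible sample. The four-value population below is hypothetical, and sampling is with replacement, as in the probability model above.

```python
import itertools
import math
import statistics

population = [2, 4, 6, 8]  # hypothetical population values
n = 2                      # sample size

# Population standard deviation (divides by N, not N - 1).
sigma = statistics.pstdev(population)

# Every ordered sample of size n drawn with replacement is equally likely,
# so the list of their means is the exact sampling distribution.
sample_means = [
    statistics.fmean(sample)
    for sample in itertools.product(population, repeat=n)
]
sd_of_mean = statistics.pstdev(sample_means)

print(sd_of_mean, sigma / math.sqrt(n))
```

The two printed numbers agree. In practice the population SD is unknown, which is why the standard error substitutes the sample SD for it.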

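Test Statistics Z-ratio –For a general parameter: $z = (\text{estimate} - \text{parameter}) / SD(\text{estimate})$ –For 1-group: $z = (\bar{Y} - \mu) / (\sigma/\sqrt{n})$ t-ratio –For a general parameter: $t = (\text{estimate} - \text{parameter}) / SE(\text{estimate})$ –For 1-group: $t = (\bar{Y} - \mu) / (s/\sqrt{n})$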

Distribution of test statistics Facts from statistical theory: If* the population distribution of Y is normal, then the sampling distribution of –(i) the z-ratio is standard normal –(ii) the t-ratio is Student’s t on n-1 degrees of freedom –* = We will study the “if” part later; for now we will assume it is true See Display 2.5

Testing a hypothesis about $\mu$ Could the difference of $\bar{Y}$ from $\mu_0$ (the hypothesized value for $\mu$, $\mu_0 = 0$ here) be due to chance (in random sampling)? Test statistic: $t = (\bar{Y} - \mu_0) / SE(\bar{Y})$ If $H_0$ is true, then t equals the t-ratio and has the Student’s t-distribution with n-1 degrees of freedom
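The paired t-test reduces to a one-sample computation on the within-pair differences. The sketch below uses made-up volumes for five pairs (not the study data) to show the arithmetic; all variable names are illustrative.

```python
import math
import statistics

# Hypothetical paired measurements (NOT the study data), one entry per pair.
unaffected = [1.8, 1.5, 1.7, 1.6, 2.0]
affected = [1.5, 1.6, 1.4, 1.5, 1.6]

# Work with one variable: the within-pair difference (unaffected - affected).
differences = [u - a for u, a in zip(unaffected, affected)]

n = len(differences)
y_bar = statistics.fmean(differences)   # sample mean of the differences
s = statistics.stdev(differences)       # sample SD (divides by n - 1)
se = s / math.sqrt(n)                   # standard error of the mean

mu_0 = 0.0                              # hypothesized mean difference
t_ratio = (y_bar - mu_0) / se
df = n - 1                              # degrees of freedom for the t-distribution

print(round(y_bar, 3), round(se, 4), round(t_ratio, 3), df)
```

The resulting t-ratio is then compared with the Student's t-distribution on n-1 degrees of freedom.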

P-value The (2-sided) p-value is the proportion of random samples, drawn under the null hypothesis, whose t-ratios are at least as large in absolute value as the observed test statistic (|t|) Schizophrenia example: t = 3.23
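For readers without JMP at hand, the two-sided p-value can be approximated by integrating the Student's t density numerically. This is a rough sketch using only the standard library; the function names and the choice of Simpson's rule are illustrative, and a statistics package would normally do this step.

```python
import math

def t_density(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_obs, df, steps=2000):
    """Approximate P(|T| >= |t_obs|) with Simpson's rule (steps must be even)."""
    b = abs(t_obs)
    h = b / steps
    total = t_density(0, df) + t_density(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3        # P(0 <= T <= |t_obs|)
    return 1 - 2 * central_area         # both tails beyond |t_obs|

# Observed t-ratio from the lecture, with n - 1 = 14 degrees of freedom.
p = two_sided_p_value(3.23, df=14)
print(round(p, 4))
```

The result should land close to the two-sided p-value quoted for the schizophrenia example; halving it gives the one-sided p-value when the observed t falls in the direction of the alternative.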

Schizophrenia Example p-value (2-sided, paired t-test) = .006 So either, –(i) the null hypothesis is incorrect OR –(ii) the null hypothesis is correct and we happened to get a particularly unusual sample (only 6 out of 1000 are as unusual) Strong evidence against $H_0$ One-sided test: –Test statistic: $t = (\bar{Y} - \mu_0) / SE(\bar{Y})$, the same statistic as in the two-sided test –For schizophrenia example, t = 3.23, p-value (1-sided) = .003

Matched pairs t-test in JMP Click Analyze, Matched Pairs, put two columns (e.g., affected and unaffected) into Y, Paired Response. Can also use one-sample t-test. Click Analyze, Distribution, put difference into Y, columns. Then click red triangle under difference and click test mean.

Confidence Interval for $\mu$ A confidence interval is a range of “plausible values” for a statistical parameter (e.g., the population mean) based on the data. It conveys the precision of the sample mean as an estimate of the population mean. A confidence interval typically takes the form: point estimate ± margin of error The margin of error depends on two factors: –Standard error of the estimate –Degree of “confidence” we want.

CI for population mean If the population distribution of Y is normal (* we will study the if part later) 95% CI for mean of single population: $\bar{Y} \pm t_{n-1}(0.975) \cdot SE(\bar{Y})$ For schizophrenia data: (0.067 cm$^3$, 0.331 cm$^3$)
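The pieces of the interval can be recovered from the numbers quoted in the lecture. The sketch below starts from the 95% interval reported for the schizophrenia data, (0.067, 0.331) cm^3 with n = 15, and works backward; the multiplier 2.145 is the standard table value for the 97.5th percentile of Student's t on 14 degrees of freedom, and the variable names are illustrative.

```python
import math

n = 15
t_crit = 2.145                   # 97.5th percentile of t on 14 df (table value)
ci_low, ci_high = 0.067, 0.331   # cm^3, 95% CI quoted in the lecture

y_bar = (ci_low + ci_high) / 2   # point estimate is the midpoint of the interval
margin = (ci_high - ci_low) / 2  # margin of error is the half-width
se = margin / t_crit             # margin = t_crit * SE
s = se * math.sqrt(n)            # SE = s / sqrt(n)
t_ratio = y_bar / se             # test statistic for H0: mu = 0

def confidence_interval(y_bar, se, t_crit):
    """Point estimate plus or minus (multiplier times standard error)."""
    return (y_bar - t_crit * se, y_bar + t_crit * se)

print(round(y_bar, 3), round(se, 4), round(s, 3), round(t_ratio, 2))
print(confidence_interval(y_bar, se, t_crit))
```

The recovered t-ratio agrees with the value of 3.23 used in the test, which is a useful consistency check between the interval and the p-value.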

Interpretation of CIs A 95% confidence interval will contain the true parameter (e.g., the population mean) 95% of the time if repeated random samples are taken. It is impossible to say whether it is successful or not in any particular case, i.e., we know that the CI will usually contain the true mean under random sampling but we do not know for the schizophrenia data if the CI (0.067 cm$^3$, 0.331 cm$^3$) contains the true mean difference.
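The "95% of the time" statement can be illustrated by simulation. The sketch below repeatedly draws samples of size 15 from a normal population and counts how often the interval covers the true mean; the true mean, the population SD, the number of repetitions and the seed are all made-up settings.

```python
import math
import random
import statistics

def coverage(true_mean=0.2, sigma=0.24, n=15, t_crit=2.145, reps=2000, seed=0):
    """Fraction of simulated 95% t-intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Draw one random sample from a normal population (hypothetical values).
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        y_bar = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        # Does this sample's interval cover the true mean?
        if y_bar - t_crit * se <= true_mean <= y_bar + t_crit * se:
            hits += 1
    return hits / reps

rate = coverage()
print(rate)
```

The printed fraction should be near 0.95. Any single interval either covers the true mean or does not, and the data alone cannot tell us which.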

Confidence Intervals in JMP For both methods of doing paired t-test (Analyze, Matched Pairs or Analyze, Distribution), the 95% confidence intervals for the mean are shown on the output.