Lecture 5 Outline – Tues., Jan. 27


Lecture 5 Outline – Tues., Jan. 27
Miscellanea from Lecture 4
Case Study 2.1.2
Chapter 2.2
–Probability model for random sampling (see also chapter 1.4.1)
–Sampling distribution of sample mean
–t-test
–Confidence intervals

Miscellanea from Lecture 4
Definition of medians/quartiles
–The median is the midpoint of a distribution, the number such that half the observations are smaller and half are larger. Computing the median: order all observations. If n is odd, the median is the center observation of the ordered list; if n is even, the median is the mean of the two center observations.
–pth percentile of a distribution: the value such that p percent of the observations fall at or below it. Exact computation in JMP: order the observations from smallest to largest and count up the required percent of observations from the bottom of the list. If the pth-percentile position falls between two observations, JMP takes a weighted average, (1 - p)*lower observation + p*higher observation, where p here is the fractional part of the position.
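As an illustration of the interpolation rule above, here is a minimal sketch in Python; it assumes the weight is the fractional part of the (0-based) position in the ordered list, which may differ slightly from JMP's exact convention:

```python
import math

def percentile(values, p):
    """Linearly interpolated pth percentile (0 <= p <= 100).

    When the position falls between two observations, the result is
    (1 - frac) * lower observation + frac * higher observation,
    where frac is the fractional part of the position.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100.0
    lower = math.floor(pos)
    frac = pos - lower
    if lower + 1 >= len(ordered):
        return ordered[lower]
    return (1 - frac) * ordered[lower] + frac * ordered[lower + 1]

print(percentile([2, 5, 7, 11, 13], 25))  # first quartile -> 5.0
```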

Miscellanea from Lecture 4
Definition of median/quartiles continued:
–First quartile is the 25th percentile. Third quartile is the 75th percentile.
Long-tailed vs. short-tailed distributions: Loosely, a long-tailed distribution has a tail that dies out more slowly than the normal distribution's. A short-tailed distribution has a tail that dies out faster than the normal distribution's. See figure at end of notes.

Case Study 2.1.2
Broad Question: Are any physiological indicators associated with schizophrenia? Early studies suggested certain areas of the brain may be different in persons with schizophrenia than in others, but confounding factors clouded the issue.
Specific Question: Is the left hippocampus region of the brain smaller in people with schizophrenia?
Research design: Sample pairs of monozygotic twins, where one of the twins was schizophrenic and the other was not. Comparing monozygotic twins controls for genetic and socioeconomic differences.

Case Study 2.1.2 Cont.
The mean difference (unaffected - affected) in volume of the left hippocampus region across the 15 pairs is 0.199 cm³. Is this larger than could be explained by "chance"?
Probability (chance) model: Random sampling (fictitious) from a single population.
Scope of inference
–Goal is to make inference about the population mean, but the inference is questionable because we did not take a random sample.
–No causal inference can be made. In fact, researchers had no theories about whether abnormalities preceded the disease or resulted from it.

Probability Model
Goal is to compare two groups (affecteds and unaffecteds), but we have taken a paired sample. We can think of having one population (pairs of twins) and looking at the mean of one variable, the difference in hippocampus volumes in each pair.
Probability model: Simple random sample with replacement from the population. When the population size is more than 50 times the sample size, simple random sampling with replacement and simple random sampling without replacement are essentially equivalent.
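A small simulation (with made-up population values) illustrating the claim that the two sampling schemes are essentially equivalent when the population is much more than 50 times the sample size:

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(loc=0.2, scale=0.25, size=1000)  # hypothetical population of pair differences
n = 15                                                   # sample size, as in the case study

means_with = [rng.choice(population, size=n, replace=True).mean() for _ in range(5000)]
means_without = [rng.choice(population, size=n, replace=False).mean() for _ in range(5000)]

# The two sampling distributions of the sample mean are nearly identical
print(np.std(means_with), np.std(means_without))
```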

Review of Terminology
Population: Collection of all items of interest to the researcher, e.g., heights of members of this class, U.S. adults' incomes, lifetimes of a new brand of tires.
Statistic (random variable): Any quantity that can be calculated from the data.
Probability (sampling) distribution of a statistic: the proportion of times that the statistic will take on each possible value in repeated trials of the data collection process (randomized experiment or random sample).
Population distribution: The probability distribution of a randomly chosen observation from the population.
Parameter: Describes a feature of the population distribution (e.g., mean or standard deviation).

Parameters and Statistics
Population parameters (μ, σ²):
–μ = population mean
–σ² = population variance = average size of the squared deviations (Y − μ)² in the population
Hypotheses: H0: μ = 0 vs. Ha: μ ≠ 0 (here, no mean difference between unaffected and affected twins)
Sample statistics (Ȳ, s²):
–Sample: Y1, …, Yn
–Ȳ = sample mean
–s² = sample variance

Continuous Distributions
A continuous random variable can take values with any number of decimal places. The probability of a continuous r.v. taking on an exact value is 0, but there is a nonzero chance that a continuous r.v. will take on a value in an interval.
A density function defines probabilities for a continuous r.v. The probability that the r.v. takes on a value between 3.9 and 6.2 is the area under the density function between 3.9 and 6.2. The total area under a density function is 1.
Example of a continuous r.v.: height.

Normal probability distribution
The normal probability distributions are a family of density functions for a continuous r.v. that are "bell-shaped." The normal probability distribution has two parameters, the mean and the standard deviation.
The probability that a normal r.v. will be within one s.d. of its mean is about 68%. The probability that a normal r.v. will be within two s.d.'s of its mean is about 95%.
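These two probabilities can be checked numerically with the standard normal CDF; a quick sketch using SciPy:

```python
from scipy.stats import norm

# Probability of falling within 1 and within 2 standard deviations of the mean
print(norm.cdf(1) - norm.cdf(-1))   # about 0.683
print(norm.cdf(2) - norm.cdf(-2))   # about 0.954
```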

Sampling distribution of the sample mean
Consider a random sample of size n from a population with mean μ and variance σ². Key facts about the sampling distribution of the sample mean Ȳ:
–Center: The mean of the sampling distribution of Ȳ is μ.
–Spread: Sample means are closer to the population mean than single values. The sampling distribution has standard deviation σ/√n.
–Shape: If the population distribution is normal, the sampling distribution of the sample mean will be normal. If the population distribution is not normal, the sampling distribution of the sample mean will be nearly normal for n > 30 (Central Limit Theorem).
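A short simulation (with arbitrary numbers) illustrating all three facts: the sample means center at μ, spread out like σ/√n, and look nearly normal even when the population is skewed:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n = 10.0, 4.0, 40     # made-up population mean, SD, and sample size

# Skewed (exponential) population shifted to have mean mu and SD sigma
sample_means = np.array([
    (rng.exponential(scale=sigma, size=n) + (mu - sigma)).mean()
    for _ in range(10000)
])

print(sample_means.mean())                       # close to mu
print(sample_means.std(), sigma / np.sqrt(n))    # both close to sigma / sqrt(n)
```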

Standard errors
The standard error of a statistic is an estimate of the standard deviation of its sampling distribution. It is the best guess of the likely size of the difference between a statistic used to estimate a parameter and the parameter itself.
Associated with every standard error is a measure of the amount of information used to estimate variability, called its degrees of freedom. Degrees of freedom are measured in units of "equivalent numbers of independent observations."
Standard error of the sample mean: SE(Ȳ) = s/√n, with d.f. = n − 1.

Testing a hypothesis about μ
Could the difference of Ȳ from μ0 (the hypothesized value for μ, μ0 = 0 here) be due to chance (in random sampling)?
Test statistic: t = (Ȳ − μ0)/SE(Ȳ). The test statistic will tend to be near 0 when H0 is true and far from 0 when H0 is false.
Assume the population distribution is normal. If H0 is true, then t has the Student's t-distribution with n − 1 degrees of freedom.
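A minimal sketch of the standard error and test statistic; the differences below are made-up placeholders, not the case-study data:

```python
import numpy as np

def one_sample_t(y, mu0=0.0):
    """t-statistic for H0: mu = mu0, t = (ybar - mu0) / SE(ybar)."""
    y = np.asarray(y, dtype=float)
    se = y.std(ddof=1) / np.sqrt(len(y))   # SE(ybar) = s / sqrt(n), d.f. = n - 1
    return (y.mean() - mu0) / se

diffs = [0.30, 0.10, -0.05, 0.20, 0.15]    # placeholder pair differences
print(one_sample_t(diffs))
```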

P-value
The (2-sided) p-value is the proportion of random samples whose t-ratio is at least as large in absolute value as the observed test statistic |t|.
Schizophrenia example: t = 3.23.

Schizophrenia Example
p-value (2-sided, paired t-test) = .006. So either
–(i) the null hypothesis is incorrect, OR
–(ii) the null hypothesis is correct and we happened to get a particularly unusual sample (only 6 out of 1000 samples are as unusual).
Strong evidence against H0.
One-sided test:
–Test statistic: the same t-ratio, but the p-value is computed from only one tail of the t-distribution.
–For the schizophrenia example, t = 3.21, p-value (1-sided) = .003.
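The p-values quoted above can be reproduced from the t-distribution with n − 1 = 14 degrees of freedom; a quick check using SciPy:

```python
from scipy.stats import t

df = 15 - 1
print(2 * t.sf(3.23, df))   # two-sided p-value, about 0.006
print(t.sf(3.21, df))       # one-sided p-value, about 0.003
```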

Matched pairs t-test in JMP
Click Analyze, Matched Pairs; put the two columns (e.g., affected and unaffected) into Y, Paired Response.
Can also use a one-sample t-test: click Analyze, Distribution, put the difference into Y, Columns. Then click the red triangle under the difference and click Test Mean.
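Outside JMP, the same two analyses can be sketched in Python; the volumes below are placeholders, not the actual twin data:

```python
import numpy as np
from scipy import stats

# Placeholder left-hippocampus volumes (cm^3), NOT the study data
unaffected = np.array([1.8, 1.7, 2.0, 1.6, 1.9])
affected   = np.array([1.6, 1.5, 1.9, 1.4, 1.8])

# Matched-pairs t-test, and the equivalent one-sample t-test on the differences
print(stats.ttest_rel(unaffected, affected))
print(stats.ttest_1samp(unaffected - affected, popmean=0.0))
```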

Confidence Intervals
Point estimate: a single number used as the best estimate of a population parameter, e.g., Ȳ for μ.
Interval estimate (confidence interval): a range of values used as an estimate of a population parameter.
Uses of a confidence interval:
–Provides a range of values that is "likely" to contain the true parameter. A confidence interval can be thought of as the range of values for the parameter that are "plausible" given the data.
–Conveys the precision of the point estimate as an estimate of the population parameter.

Confidence interval construction
A confidence interval typically takes the form: point estimate ± margin of error.
The margin of error depends on two factors:
–Standard error of the estimate
–Degree of "confidence" we want
–Margin of error = multiplier for degree of confidence × SE of estimate
–For a 95% confidence interval, the multiplier for degree of confidence is about 2 in most cases.
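The "about 2" multiplier is the appropriate t-quantile; a quick check for a 95% interval with n = 15 (14 degrees of freedom):

```python
from scipy.stats import t

multiplier = t.ppf(0.975, df=15 - 1)
print(multiplier)   # about 2.14, close to 2
```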

CI for population mean
If the population distribution of Y is normal (* we will study the "if" part later), the 95% CI for the mean of a single population is Ȳ ± t(n − 1, 0.975) × SE(Ȳ).
For the schizophrenia data this gives (0.067 cm³, 0.331 cm³).
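A sketch of the computation, using the reported point estimate 0.199 cm³ and an approximate standard error of about 0.0615 cm³ implied by the reported interval width (not taken directly from the raw data):

```python
from scipy.stats import t

n = 15
ybar = 0.199       # cm^3, midpoint of the reported interval
se = 0.0615        # cm^3, approximate SE implied by the interval width
mult = t.ppf(0.975, df=n - 1)

lower, upper = ybar - mult * se, ybar + mult * se
print(round(lower, 3), round(upper, 3))   # roughly (0.067, 0.331)
```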

Interpretation of CIs
A 95% confidence interval will contain the true parameter (e.g., the population mean) 95% of the time if repeated random samples are taken. It is impossible to say whether it is successful or not in any particular case, i.e., we know that the CI will usually contain the true mean under random sampling, but we do not know for the schizophrenia data if the CI (0.067 cm³, 0.331 cm³) contains the true mean difference.
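A small simulation (with invented population values) illustrating the 95% coverage property under repeated random sampling:

```python
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(2)
mu, sigma, n, reps = 0.2, 0.25, 15, 10000   # made-up population mean/SD
mult = t.ppf(0.975, df=n - 1)

hits = 0
for _ in range(reps):
    y = rng.normal(mu, sigma, size=n)
    half_width = mult * y.std(ddof=1) / np.sqrt(n)
    hits += (y.mean() - half_width <= mu <= y.mean() + half_width)

print(hits / reps)   # close to 0.95
```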

Confidence Intervals in JMP
For both methods of doing the paired t-test (Analyze, Matched Pairs or Analyze, Distribution), the 95% confidence intervals for the mean are shown on the output.