P Values. Robin Beaumont, 10/10/2011. With much help from Professor Chris Wild's material, University of Auckland.

Presentation transcript:


Where do they fit in!

Putting it all together

Populations and samples. Population value: ever constant, at least for your study! = parameter. Sample estimate = statistic.

One sample

Size matters – single samples

Size matters – multiple samples

We only have a rippled mirror

Standard deviation - individual level = measure of variability. 'Standard Normal distribution': total area = 1. Area within 1 SD either side of the mean = 68%; within 2 SDs = 95%; between + and - three standard deviations from the mean = 99.7% of area. Therefore only 0.3% of area (scores) are more than 3 standard deviations ('units') away. But this does not take sample size into account. The t distribution does: its shape is defined by a sample-size aspect (~ df). Area! Wait and see.
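The 68% / 95% / 99.7% areas quoted on this slide can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `area_within` is made up for illustration and is not part of the original slides.

```python
import math

def area_within(k):
    """Area under the standard normal curve between -k and +k SDs."""
    # For a standard normal Z, P(-k < Z < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Print the area as a percentage, rounded to one decimal place.
    print(k, "SD:", round(100 * area_within(k), 1), "% of area")
```

The results round to roughly 68%, 95% and 99.7%, matching the slide.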

Sampling level: 'accuracy' of the estimate. SEM = SD/√n. For SD = 5: with n = 5, SEM = 5/√5 ≈ 2.24; with n = 25, SEM = 5/√25 = 1. We can predict the accuracy of your estimate (the mean) from a single sample, just by using the SEM formula. We are talking about means here.

Example (Bradford Hill, 1950, p. 92): mean systolic blood pressure for 566 males around Glasgow = [value missing] mm. Standard deviation = 13.05. Determine the 'precision' of this mean: SEM = 13.05/√566 ≈ 0.5485. "We may conclude that our observed mean may differ from the true mean by as much as ± (.5485 x 4) but not more than that in around 95% of observations." (p. 93, edited)
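The SEM arithmetic on these two slides can be reproduced in a few lines. This is a small sketch, assuming the usual formula SEM = SD/√n; the function name `standard_error_of_mean` is hypothetical.

```python
import math

def standard_error_of_mean(sd, n):
    """Return SEM = SD / sqrt(n) for a single random sample of size n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sd / math.sqrt(n)

# Same SD, two sample sizes: the larger sample gives the smaller SEM.
sem_n5 = standard_error_of_mean(5, 5)
sem_n25 = standard_error_of_mean(5, 25)

# Bradford Hill example: SD = 13.05, n = 566.
sem_hill = standard_error_of_mean(13.05, 566)

print(round(sem_n5, 3), sem_n25, round(sem_hill, 4))
```

The Bradford Hill figure comes out at about 0.5485, the value used in the quotation.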

Sampling summary. The SEM formula allows us to predict the accuracy of our estimate (i.e. the mean value of our sample) from a single sample. It assumes a random sample.

Variation what have we ignored! Onto Probability now

Probabilities are relative frequencies. All outcomes at any one time add up to 1.

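Multiple outcomes at any one time: the probability density function. [Figure: density curve for a total of 48 scores; horizontal axis = scores, vertical axis = probability density; the total area = 1.] p(score < 45) = area A. p(score > 50) = area B. For a set of outcomes, such as a score below 45 or above 50, just add up the individual outcomes (area A + area B).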

Conditional probability. [Tree diagram: the first branch splits into male, with probability P(male), and female; each of these then splits into Disease X and No Disease X, with probabilities such as P(disease X | male).] What happens in the past affects the present. Multiply along each branch of the tree to get the end value: P(disease X AND male) = P(male) x P(disease X | male). Rearranging: P(disease X AND male) / P(male) = P(disease X | male).

Screening example. 0.1% of the population carry a particular faulty gene. A test exists for detecting whether an individual is a carrier of the gene. In people who actually carry the gene, the test provides a positive result with probability 0.9. In people who don't carry the gene, the test provides a positive result with probability [value missing]. Let G = person carries the gene, P = test is positive for the gene, N = test is negative for the gene. The test makes errors in both directions. If someone gets a positive result when tested, find the probability that they actually are a carrier of the gene. We want to find P(G | P). First, P(P) = P(G and P) + P(G' and P), where P(G and P) = P(G) x P(P | G). Note that P(P | G) ≠ P(G | P): ORDER MATTERS.
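The screening calculation can be sketched as below. The probability of a positive result in non-carriers is missing from the slide text, so the value 0.01 used here is purely an assumption for illustration, as are the function and variable names.

```python
def prob_carrier_given_positive(prevalence, sensitivity, false_positive_rate):
    """Return P(G | P) by multiplying along the tree branches.

    prevalence          = P(G)
    sensitivity         = P(P | G)
    false_positive_rate = P(P | G')
    """
    p_g_and_p = prevalence * sensitivity                       # P(G and P)
    p_not_g_and_p = (1 - prevalence) * false_positive_rate     # P(G' and P)
    p_positive = p_g_and_p + p_not_g_and_p                     # P(P)
    return p_g_and_p / p_positive                              # P(G | P)

# ASSUMPTION: 0.01 is a stand-in; the slide's own value is not recoverable.
ASSUMED_FALSE_POSITIVE_RATE = 0.01

posterior = prob_carrier_given_positive(
    prevalence=0.001,            # 0.1% of the population carry the gene
    sensitivity=0.9,             # P(P | G) from the slide
    false_positive_rate=ASSUMED_FALSE_POSITIVE_RATE,
)
print("P(P | G) =", 0.9)
print("P(G | P) =", round(posterior, 4))
```

Under that assumed false-positive rate, P(G | P) is far smaller than P(P | G) = 0.9, which is the sense in which order matters.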

Survival analysis. Each year's survival depends on previous ones, or does it?

Probability summary. All outcomes at any one time add up to 1. Probability histogram: area under the curve = 1, and specific areas = sets of outcomes. Conditional probability: the present is dependent on the past, so ORDER MATTERS.

Putting it all together

Statistics. Summary measures: SEM, average etc. T statistic: there are different types; the simplest is t = (sample mean − hypothesised population mean) / SEM. So t = 0 means 0/anything: the estimated and hypothesised population means are equal. When t = 1, the observed difference is the same as the SEM. When t = 10, the observed difference is much greater than the SEM.

T statistic example. Serum amylase values from a random sample of 15 apparently healthy subjects: mean = 96, SD = 35 units/100 ml. How likely is it that such a sample would be obtained from a population of serum amylase determinations with a mean of 120? (Adapted from Daniel 1991, p. 202.) This looks like a rare occurrence? But rare given what? A population value = the null hypothesis.
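For the serum amylase figures, the simplest (one-sample) t statistic can be worked through as follows. This is a sketch assuming t = (sample mean − hypothesised mean) / SEM with SEM = SD/√n; the function name `one_sample_t` is hypothetical.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, hypothesised_mean):
    """Return the one-sample t statistic: (mean - hypothesised mean) / SEM."""
    sem = sample_sd / math.sqrt(n)          # standard error of the mean
    return (sample_mean - hypothesised_mean) / sem

# Serum amylase example: mean 96, SD 35, n = 15, hypothesised mean 120.
t_amylase = one_sample_t(96, 35, 15, 120)
print(round(t_amylase, 3))

# When the sample mean equals the hypothesised mean, t is 0.
print(one_sample_t(120, 35, 15, 120))
```

The observed difference of 24 units is a little over two and a half SEMs below the hypothesised mean.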

[Figure: t density curve for n = 15 with both tails shaded; the horizontal axis is shown in t units and in the original units.] Serum amylase values from a random sample of 15 apparently healthy subjects: mean = 96, SD = 35 units/100 ml. How likely is it that such a sample would be obtained from a population of serum amylase determinations with a mean of 120? (Adapted from Daniel 1991, p. 202.) What does the shaded area mean? Given that the sample was obtained from a population with a mean of 120, a sample with a t statistic (n = 15) of t = (96 − 120)/(35/√15) ≈ −2.66, or +2.66, or one more extreme, will occur 1.8% of the time: just under two samples per hundred on average. In the original units: given that the sample was obtained from a population with a mean of 120, a sample of 15 producing a mean of 96 (120 − x, where x = 24) or 144 (120 + x, where x = 24), or one more extreme, will occur 1.8% of the time. But is this not a P value? p = 2 · P(t(n−1) < t | H0 is true) = 2 · [area to the left of t under a t distribution with df = n − 1].

P value and probability for the t statistic. p value = 2 x P(t(n−1) values more extreme than the observed t(n−1) | H0 is true) = 2 · [area to the left of t under a t distribution with df = n − 1]. A p value is a special type of probability: it covers multiple outcomes, and it is conditional upon the specified parameter value.
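The two-tailed area can be approximated without any statistics library by integrating the t density numerically. The sketch below uses Simpson's rule and the standard formula for the t density; the function names and the number of integration steps are choices made here for illustration, not something taken from the slides.

```python
import math

def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t, df, steps=2000):
    """Area in both tails beyond |t|, given that the null hypothesis is true."""
    if steps % 2:
        raise ValueError("Simpson's rule needs an even number of steps")
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    # Simpson's rule for the area between 0 and |t|.
    total = t_density(0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = 2 * (h / 3) * total      # area between -|t| and +|t|
    return 1 - central_area                 # what is left is the two tails

# Serum amylase example: n = 15, so df = 14.
t_observed = (96 - 120) / (35 / math.sqrt(15))
p = two_sided_p_value(t_observed, df=14)
print(round(p, 3))
```

For the serum amylase data this gives a value just under 0.02, in line with the "1.8% of the time" quoted on the slide.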

Putting it all together Do we need it!

Rules. [Figure: the same t density curve for n = 15 with both tails shaded.] Set a level of acceptability = critical value (CV)! Say one in twenty (1/20 = 0.05), or 1/100, or 1/1000, or ... If our result has a P value of less than our level of acceptability, reject the parameter value. Say 1 in 20 (i.e. CV = 0.05): given that the sample was obtained from a population with a mean (parameter value) of 120, a sample with a t statistic (n = 15) of about −2.66 or +2.66, or one more extreme, will occur 1.8% of the time. This is less than one in twenty, therefore we dismiss the possibility that our sample came from a population with a mean of 120. What do we replace it with?
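The rule on this slide reduces to a single comparison. A minimal sketch, taking the p value as roughly 0.018 (the 1.8% quoted on the slide); the function name is hypothetical.

```python
def reject_parameter_value(p_value, level=0.05):
    """Apply the rule: reject when the p value is below the chosen level."""
    return p_value < level

p_value = 0.018   # the 1.8% from the serum amylase example

print(reject_parameter_value(p_value, level=1 / 20))    # True
print(reject_parameter_value(p_value, level=1 / 100))   # False
```

The same result is rejected at one in twenty but not at one in a hundred, so the conclusion depends on the level chosen beforehand.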

Fisher: we only know, and only consider, the model we have, i.e. the parameter we have used in our model. When we reject it, we accept that any value but that one can replace it. Neyman and Pearson (+ Gosset): we must have an alternative specified value for the parameter.

If there is an alternative, what is it? Another distribution! Power: depends on sample size. Effect size: an indication of clinical importance. Serum amylase values from a random sample of 15 apparently healthy subjects: mean = 96, SD = 35 units/100 ml. How likely is it that such a sample would be obtained from a population of serum amylase determinations with a mean of 120? (Adapted from Daniel 1991, p. 202.)

[Figure: distributions for population means of 120 and 96. α = the rejection region. Areas are marked as correct decisions and incorrect decisions.]

Insufficient power: you may fail to get a significant result even when the effect size is large. Too much power: you get a significant result even with a trivial effect size.
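How power changes with sample size can be illustrated by simulation. The sketch below makes several assumptions that are not in the slides: the population is normal, the true mean really is 96 with SD 35, the test is two-sided at the 5% level, and only three sample sizes are tried (their critical t values are standard table values). All names are hypothetical.

```python
import math
import random
import statistics

# Two-sided 5% critical values of t for df = n - 1 (standard table values).
T_CRITICAL_5PCT = {5: 2.776, 15: 2.145, 100: 1.984}

def simulated_power(n, true_mean, sd, null_mean, simulations=2000, seed=1):
    """Proportion of simulated samples in which the null mean is rejected."""
    rng = random.Random(seed)               # fixed seed so the run is repeatable
    critical = T_CRITICAL_5PCT[n]
    rejections = 0
    for _ in range(simulations):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        sem = statistics.stdev(sample) / math.sqrt(n)
        t = (statistics.mean(sample) - null_mean) / sem
        if abs(t) > critical:
            rejections += 1
    return rejections / simulations

# ASSUMPTION: the true population mean is 96 and the null value is 120.
power_by_n = {
    n: simulated_power(n, true_mean=96, sd=35, null_mean=120)
    for n in (5, 15, 100)
}
for n, power in power_by_n.items():
    print("n =", n, "estimated power =", power)
```

Under these assumptions the small sample rejects the null value only a minority of the time, while the largest sample rejects it almost always.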

Life after P values: confidence intervals; effect size; description / analysis; Bayesian statistics (a qualitative approach by the back door!). Planning to do statistics for your dissertation? See my medical statistics courses, Course 1 and Course 2, and the YouTube videos that accompany each course.

Your attitude to your data

Where do they fit in!