P Values Robin Beaumont 10/10/2011 With much help from Professor Chris Wild's material, University of Auckland.


Where do they fit in!

Putting it all together

Populations and samples
Population value: ever constant, at least for your study! = parameter
Sample value: estimate = statistic

One sample

Size matters – single samples

Size matters – multiple samples

We only have a rippled mirror

Standard deviation - individual level = measure of variability
'Standard Normal distribution': total area = 1, mean = 0, 1 = SD value
Area within 1 SD either side of the mean = 68%; within 2 SDs = 95%; between + and - three standard deviations from the mean = 99.7% of area. Therefore only 0.3% of the area (scores) is more than 3 standard deviations ('units') away.
But this does not take sample size into account = t distribution, defined by a sample size aspect ~ df
Area! Wait and see
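The areas quoted for the standard normal curve can be checked numerically. This is a minimal sketch using only the Python standard library; the helper name `area_within` is made up for this illustration.

```python
import math

def area_within(k):
    """Area under the standard normal curve between -k and +k SDs."""
    # For a standard normal Z, P(|Z| < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/- {k} SD: {area_within(k):.4f}")
```

The three printed areas round to the 68%, 95% and 99.7% shown on the slide.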

Sampling level - 'accuracy' of estimate
From: http://onlinestatbook.com/stat_sim/sampling_dist/index.html
SEM = SD/√n. With SD = 5: for n = 5, SEM = 5/√5 = 2.236; for n = 25, SEM = 5/√25 = 1
We can predict the accuracy of your estimate (mean) from a single sample, just by using the SEM formula.
Talking about means here
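The same idea can be tried as a small simulation: draw many samples, take the mean of each, and compare the spread of those means with SD/√n. This is only a sketch. The population mean of 16 and the normal shape are assumptions chosen for illustration, and `sd_of_sample_means` is a hypothetical helper name.

```python
import math
import random
import statistics

POP_MEAN = 16.0  # assumption: illustrative population mean
POP_SD = 5.0     # population SD matching the SEM figures on the slide

def sd_of_sample_means(n, n_samples=20000, seed=1):
    """Draw n_samples samples of size n and return the SD of their means."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(n)]
        means.append(sum(sample) / n)
    return statistics.stdev(means)

for n in (5, 25):
    predicted = POP_SD / math.sqrt(n)
    simulated = sd_of_sample_means(n)
    print(f"n={n}: SEM formula {predicted:.3f}, simulated {simulated:.3f}")
```

The simulated spread of the means should land close to the SEM formula's value for each sample size.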

Example - Bradford Hill (Bradford Hill, 1950 p.92): mean systolic blood pressure for 566 males around Glasgow = 128.8 mm, standard deviation = 13.05. Determine the 'precision' of this mean. SEM = 13.05/√566 = 0.5485. "We may conclude that our observed mean may differ from the true mean by as much as ± 2.194 (.5485 x 4) but not more than that in around 95% of observations." (page 93) [edited]
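The arithmetic behind this example can be retraced in a few lines. This is a sketch, not Bradford Hill's own working. It uses the common rule of thumb of about two SEMs either side of the mean for roughly 95% coverage, which is an assumption here. Note that .5485 x 4 = 2.194 is the full width of such a band, while the distance either side of the mean is 2 x SEM.

```python
import math

n = 566      # number of males in the sample
mean = 128.8 # mean systolic blood pressure
sd = 13.05   # standard deviation of individual readings

sem = sd / math.sqrt(n)  # standard error of the mean
half_width = 2 * sem     # assumption: roughly 95% band = mean +/- 2 SEM
full_width = 4 * sem     # total width of that band

print(f"SEM = {sem:.4f}")
print(f"approx. 95% band: {mean - half_width:.2f} to {mean + half_width:.2f}")
print(f"full width = {full_width:.3f}")
```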

Sampling summary
The SEM formula allows us to predict the accuracy of our estimate (i.e. the mean value of our sample) from a single sample.
Assumes a random sample.

Variation what have we ignored! Onto Probability now

Probabilities are relative frequencies. All outcomes at any one time add up to 1

Multiple outcomes at any one time: Probability Density Function
[Figure: probability density of 48 scores; horizontal axis: scores (33 to 87), vertical axis: probability/density. The total area = 1.]
p(score < 45) = area A; p(score > 50) = area B. For any such area, just add up the individual outcomes

Conditional Probability
[Tree diagram: the first branch is Male, with P(male), or Female; each then branches into Disease X or No Disease X, the second branch carrying P(disease X | male).]
What happens in the past affects the present. Multiply along each branch of the tree to get the end value:
P(disease X AND male) = P(male) x P(disease X | male)
so P(disease X | male) = P(disease X AND male) / P(male)

Screening Example
0.1% of the population carry a particular faulty gene. A test exists for detecting whether an individual is a carrier of the gene. In people who actually carry the gene, the test provides a positive result with probability 0.9. In people who don't carry the gene, the test provides a positive result with probability 0.01.
Let G = person carries gene, P = test is positive for gene, N = test is negative for gene
If someone gets a positive result when tested, find the probability that they actually are a carrier of the gene. We want to find P(G | P).
P(P) = P(G and P) + P(G' and P) = 0.0009 + 0.00999 = 0.01089
P(G | P) = P(G and P) / P(P) = 0.0009 / 0.01089 ≈ 0.083
P(P | G) ≠ P(G | P): ORDER MATTERS
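The screening calculation can be traced step by step in code. This minimal sketch uses only the probabilities given in the example; the variable names are invented for illustration.

```python
p_gene = 0.001              # P(G): 0.1% of the population carry the gene
p_pos_given_gene = 0.9      # P(P | G)
p_pos_given_no_gene = 0.01  # P(P | G')

# Multiply along each branch of the tree to get the joint probabilities.
p_gene_and_pos = p_gene * p_pos_given_gene
p_no_gene_and_pos = (1 - p_gene) * p_pos_given_no_gene

# All the ways of getting a positive result.
p_pos = p_gene_and_pos + p_no_gene_and_pos

# Reverse the conditioning: P(G | P) = P(G and P) / P(P).
p_gene_given_pos = p_gene_and_pos / p_pos

print(f"P(P) = {p_pos:.5f}")
print(f"P(G | P) = {p_gene_given_pos:.4f}")
print(f"P(P | G) = {p_pos_given_gene}")
```

P(G | P) comes out at roughly 0.08, far from P(P | G) = 0.9, which is the sense in which order matters.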

Survival analysis Each year's survival depends on previous ones, or does it?

Probability summary
All outcomes at any one time add up to 1
Probability histogram: area under curve = 1 -> specific areas = set of outcomes
Conditional probability – present dependent on past – ORDER MATTERS

Putting it all together

Statistics
Summary measure – SEM, average etc.
T statistic – different types, simplest: t = (observed sample mean − hypothesised population mean) / SEM
So when t = 0: 0/anything = 0, so the estimated and hypothesised population means are equal
So when t = 1: the observed difference is the same size as the SEM
So when t = 10: the observed difference is much greater than the SEM

T statistic example Serum amylase values from a random sample of 15 apparently healthy subjects: mean = 96, SD = 35 units/100 ml. How likely is it that such a sample would be obtained from a population of serum amylase determinations with a mean of 120? (taken from Daniel 1991 p.202, adapted) This looks like a rare occurrence? But for what? A population value = the null hypothesis

[Figure: t density for n = 15, with s x̄ (SEM) = 9.037. t axis: -2.656, 0, 2.656; original units: 96 at t = -2.656, 120 at t = 0. Shaded area = 0.0188]
What does the shaded area mean! Given that the sample was obtained from a population with a mean of 120, a sample with a t (n = 15) statistic of -2.656 or 2.656, or one more extreme, will occur 1.8% of the time = just under two samples per hundred on average.
In original units: given that the sample was obtained from a population with a mean of 120, a sample of 15 producing a mean of 96 (120 - x where x = 24) or 144 (120 + x where x = 24), or one more extreme, will occur 1.8% of the time, that is just under two samples per hundred on average.
But is this not a P value?
p = 2 · P(t(n−1) < t | H0 is true) = 2 · [area to the left of t under a t distribution with df = n − 1]

P value and probability for t statistic
p value = 2 x P(t(n−1) values more extreme than the observed t(n−1) | H0 is true) = 2 · [area to the left of t under a t distribution with n − 1 shape]
A p value is a special type of probability with: multiple outcomes + conditional upon the specified parameter value
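The serum amylase figures can be reproduced with a short script. This is a sketch using only the Python standard library: the tail area is found by integrating the t density numerically with Simpson's rule, which is a choice made for this illustration and not necessarily how the slide's value was produced. The function names `t_pdf` and `two_sided_p` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Area in both tails beyond |t|: 1 minus the central area (Simpson's rule)."""
    a, b = -abs(t), abs(t)
    h = (b - a) / steps  # steps must be even for Simpson's rule
    total = t_pdf(a, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, df)
    return 1 - total * h / 3

n, sample_mean, sd, mu0 = 15, 96.0, 35.0, 120.0
sem = sd / math.sqrt(n)       # standard error of the mean
t = (sample_mean - mu0) / sem # t statistic under the null value of 120
p = two_sided_p(t, n - 1)     # conditional on the null value being true

print(f"SEM = {sem:.3f}, t = {t:.3f}, p = {p:.4f}")
```

The SEM and t should match the 9.037 and -2.656 on the slide, and p should fall close to the shaded area of 0.0188.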

Putting it all together Do we need it!

Rules
[Figure: t density for n = 15, with s x̄ (SEM) = 9.037. t axis: -2.656, 0, 2.656; original units: 96 at t = -2.656, 120 at t = 0. Shaded area = 0.0188]
Set a level of acceptability = critical value (CV)! Say one in twenty, 1/20 = 0.05, or 1/100 = 0.01, or 1/1000 = 0.001, or ...
If our result has a P value of less than our level of acceptability: reject the parameter value.
Say 1 in 20 (i.e. CV = 0.05). Given that the sample was obtained from a population with a mean (parameter value) of 120, a sample with a t (n = 15) statistic of -2.656 or 2.656, or one more extreme, will occur 1.8% of the time. This is less than one in twenty, therefore we dismiss the possibility that our sample came from a population with a mean of 120. What do we replace it with?
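The rule itself is a single comparison. As a small sketch, here is the serum amylase P value set against the three levels of acceptability mentioned on the slide; the names `LEVELS` and `decisions` are made up for this illustration.

```python
p_value = 0.0188  # shaded area from the serum amylase example

LEVELS = (0.05, 0.01, 0.001)  # 1 in 20, 1 in 100, 1 in 1000

# Reject the parameter value only when p falls below the chosen level.
decisions = {level: p_value < level for level in LEVELS}

for level, reject in decisions.items():
    verdict = "reject" if reject else "do not reject"
    print(f"level {level}: {verdict} a population mean of 120")
```

The same result is rejected at 1 in 20 but not at 1 in 100 or 1 in 1000, so the level has to be set before looking at the P value.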

Fisher – only know and only consider the model we have, i.e. the parameter we have used in our model. When we reject it we accept that any value but that one can replace it.
Neyman and Pearson + Gosset – must have an alternative specified value for the parameter

If there is an alternative - what is it? Another distribution!
Power – sample size
Effect size – indication of clinical importance: Serum amylase values from a random sample of 15 apparently healthy subjects: mean = 96, SD = 35 units/100 ml. How likely is it that such a sample would be obtained from a population of serum amylase determinations with a mean of 120? (taken from Daniel 1991 p.202, adapted)

α = the reject region
[Figure: two sampling distributions, one centred on 120 and one centred on 96, with areas marked as correct decisions and incorrect decisions]

Insufficient power – never get a significant result, even when the effect size is large
Too much power – get a significant result with a trivial effect size
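How power depends on sample size can be explored by simulation. This is a rough sketch under stated assumptions: the true population is taken to be normal with mean 96 and SD 35, the null value is 120, the test is two-sided at the 1 in 20 level, and the critical t values are standard table figures for df = 4, 14 and 99. The name `simulated_power` is hypothetical.

```python
import math
import random

# Two-sided 5% critical values of t, keyed by sample size n (df = n - 1).
T_CRIT = {5: 2.776, 15: 2.145, 100: 1.984}

def simulated_power(n, true_mean=96.0, sd=35.0, mu0=120.0, n_sims=4000, seed=1):
    """Proportion of simulated samples of size n that reject the null value mu0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        m = sum(sample) / n
        s = math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
        t = (m - mu0) / (s / math.sqrt(n))
        if abs(t) > T_CRIT[n]:
            rejections += 1
    return rejections / n_sims

power = {n: simulated_power(n) for n in T_CRIT}
for n, value in power.items():
    print(f"n = {n}: estimated power = {value:.2f}")
```

With the same effect size, the small sample rarely reaches significance while the large one almost always does, which illustrates the first point on the slide.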

Life after P values
Confidence intervals
Effect size
Description / analysis
Bayesian statistics - qualitative approach by the back door!
Planning to do statistics for your dissertation? See my medical statistics courses:
Course 1: www.robin-beaumont.co.uk/virtualclassroom/stats/course1.html
YouTube videos to accompany course 1: http://www.youtube.com/playlist?list=PL9F0EBD42C0AB37D0
Course 2: www.robin-beaumont.co.uk/virtualclassroom/stats/course2.html
YouTube videos to accompany course 2: http://www.youtube.com/playlist?list=PL05FC4785D24C6E68

Your attitude to your data

Where do they fit in!
