Statistical principles: the normal distribution and methods of testing Or, “Explaining the arrangement of things”


1 Statistical principles: the normal distribution and methods of testing Or, “Explaining the arrangement of things”

2 Outline What do we mean by statistics? Probability The idea of distributions The normal distribution Testing for significance

3 Statistics Mathematical science –Collection –Analysis –Interpretation –Presentation of data Two kinds –Descriptive statistics (last session) –Inferential statistics Hypothesis testing Estimation of characteristics of data Correlation - measures of association Regression - modeling relationships

4 Roll the dice What happens when we roll the dice? What are the chances that any one number will come up? Let’s plot the frequencies….

5 Frequency distributions: a reminder

6 Notions of randomness If we rolled the (fair) die enough times we would expect each number to have the same frequency (the population) Our ‘experiment’ is a sample of this population And it may have ‘random’ variation around the expected values –Or might even be a biased die! Key question: could our sample be drawn from a population about which we know things?
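As a rough illustration of the sample-versus-population idea on this slide, the sketch below simulates a fair die. The function name `roll_frequencies`, the seed and the numbers of rolls are illustrative choices, not part of the original presentation.

```python
import random
from collections import Counter

def roll_frequencies(n_rolls, seed=0):
    """Roll a simulated fair six-sided die n_rolls times and count each face."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts.get(face, 0) for face in range(1, 7)}

small = roll_frequencies(60)      # a small 'experiment' (sample)
large = roll_frequencies(60000)   # many more rolls, closer to the population

# Print the proportion of each face; the expected value is 1/6 for every face.
for face in range(1, 7):
    print(face, round(small[face] / 60, 3), round(large[face] / 60000, 3))
```

With only 60 rolls the proportions usually wander noticeably around 1/6; with 60,000 rolls they should sit much closer to it.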

7 Discussion When might we want to know if our sample could belong to a population?

8

9 Probability theory: an overview How probable or likely is something to happen? –The lottery! How do we know? –Based on past experience –Undertaking an experiment –Gathering sample data to make a prediction Does require some mathematical understanding –But core concepts are easy!

10 The normal distribution Family of probability distributions –Also known as the Gaussian distribution Many phenomena are distributed like this –Heights –Intelligence –Measurement errors

11 Probability theory: an overview The normal distribution (frequencies) –Can be specified in terms of two parameters: mean μ and standard deviation σ Standard normal distribution –68.26% lies within μ +/- 1σ –95.44% within μ +/- 2σ Many phenomena are distributed normally –Samples can be tested to see whether they could be drawn from such a distribution
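The shares of a normal distribution lying within one and two standard deviations of the mean can be checked with Python's standard library. This is a minimal sketch; the helper name `share_within` is an illustrative assumption.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Proportion of a normal distribution lying within mu +/- k*sigma."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

# Shares within one and two standard deviations of the mean.
print(round(share_within(1) * 100, 2))
print(round(share_within(2) * 100, 2))
```

The result does not depend on the particular values of the mean and standard deviation, which is why the standard normal distribution can stand in for the whole family.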

12 Probability theory: in practice If a distribution is normal, we can predict the percentage likelihood of an event happening Thus a 50% chance of scoring less than the mean But only about a 2.28% chance of scoring more than two standard deviations above the mean The key thing is to know if our sample represents the population as a whole It is possible to use a range of statistical tests based on this
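The two probabilities on this slide can be sketched for a made-up test score. The mean of 100 and standard deviation of 15 below are hypothetical values chosen for illustration only.

```python
from statistics import NormalDist

# Hypothetical scores: mean 100, standard deviation 15 (illustrative values).
scores = NormalDist(mu=100, sigma=15)

p_below_mean = scores.cdf(100)        # chance of scoring less than the mean
p_above_2sd = 1 - scores.cdf(130)     # chance of scoring above mean + 2 SD

print(round(p_below_mean, 4))
print(round(p_above_2sd, 4))
```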

13 The normal distribution

14 Discussion/Questions

15 Types of error and degrees of freedom Degrees of freedom (df) –Number of independent bits of information –If you know the total of n values, only n-1 of them are free to vary, so the df is n-1 Types of error –Type 1: A false positive Rejecting the null hypothesis when it is actually true –Type 2: A false negative Failing to reject the null hypothesis when the alternative is actually true
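A Type 1 error rate can be seen by simulation. The sketch below assumes a simple z-test with a known standard deviation of 1 and a two-tailed cut-off of 1.96 (the conventional 5% level); the z-test and the function name `false_positive_rate` are assumptions made for this illustration and are not taken from the slides.

```python
import math
import random

def false_positive_rate(n_trials=5000, n=25, z_crit=1.96, seed=1):
    """Share of trials that reject a TRUE null hypothesis (mean = 0)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    rejections = 0
    for _ in range(n_trials):
        # Each sample really does come from a population with mean 0, SD 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # z = sample mean divided by its standard error (sigma / sqrt(n)).
        z = (sum(sample) / n) / (1.0 / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a false positive: Type 1 error
    return rejections / n_trials

rate = false_positive_rate()
print(rate)
```

Because the null hypothesis is true in every trial, each rejection is a false positive, and the rate should come out close to 0.05.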

16 One and two-tailed tests Test statistics have frequency distributions A hypothesis is rejected when the value of the statistic is too small or too large –Two tailed Null hypothesis is rejected for values of test statistic that fall into either tail –One tailed Null hypothesis is rejected for values of test statistic that fall into one tail

17 Significance levels α is the term used to define significance levels H0 is rejected if the calculated value of the test statistic falls within the defined region Convention - usually use –0.05 (5%) or 0.01 (1%) levels –Probability that this could have occurred by chance
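To show how the significance level and the choice of one or two tails fix the rejection region, the sketch below computes cut-off values for a test statistic that follows the standard normal distribution. That distributional assumption and the helper name `critical_z` are illustrative, not from the slides.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """Cut-off beyond which H0 is rejected, for a standard normal statistic."""
    # A two-tailed test splits alpha between both tails; one-tailed uses one.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    print(alpha,
          round(critical_z(alpha, two_tailed=True), 3),
          round(critical_z(alpha, two_tailed=False), 3))
```

At the same significance level the one-tailed cut-off is smaller, because the whole rejection probability sits in a single tail.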

18 Student’s t-test Developed by William Gosset (1876-1937) –Developed test for monitoring the quality of beers at Guinness brewery –His pen name was ‘Student’ Hence the name of the test!

19 Student’s t-test Why might we use it? –To see whether the means of two normally distributed populations are equal –To see whether the mean of a normally distributed population has a value specified in a null hypothesis –Also used to see whether the slope of a regression line differs significantly from 0 Assumptions –Normally distributed data –Equal standard deviations –Can be used for independent or dependent samples

20 Basic structure of t-test A measure of difference between means, divided by a measure of the standard deviation and sample size Different kinds of t-test The simplest tests whether a sample mean is equal to a specific value t = (x̄ - μ0) / (s / √n) Where x̄ is the sample mean, μ0 is the specific value for the mean, s is the sample standard deviation, and n is the sample size
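A minimal sketch of the one-sample case, assuming the usual form t = (sample mean - μ0) / (s / √n). The function name `one_sample_t` and the data are made up for illustration.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, df) for testing whether the sample mean equals mu0."""
    n = len(sample)
    s = stdev(sample)                      # sample SD (n - 1 in the denominator)
    t = (mean(sample) - mu0) / (s / math.sqrt(n))
    return t, n - 1                        # df = n - 1

# Hypothetical data, not from the presentation.
t_value, df = one_sample_t([2, 4, 4, 4, 5, 5, 7, 9], mu0=4)
print(round(t_value, 4), df)
```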

21 t for independent samples t is the difference between the means divided by the standard error of the difference –Standard error = standard deviation divided by square root of the sample size t is basically a way of relating means, standard deviations and sample size Numerator is the difference between the means Denominator is the standard error for difference of the two means Degrees of freedom = n1 + n2 - 2 –Where n1 and n2 are the numbers of participants in the two groups
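The independent-samples calculation can be sketched as follows, assuming equal standard deviations in the two groups so that a pooled variance is used. The function name `independent_t` and both data sets are hypothetical.

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Return (t, df) for two independent samples, using a pooled variance."""
    n1, n2 = len(a), len(b)
    # Pooled variance: the two sample variances weighted by their df.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(a) - mean(b)) / se_diff
    return t, n1 + n2 - 2

# Hypothetical groups, not from the presentation.
t_value, df = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t_value, df)
```

The sign of t only reflects which group is listed first; its size is what gets compared with the tabulated value.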

22 Worked example Mean % examination results of schools in different towns (From Robinson, 1998) Using simplified formula for calculation: t = (x̄1 - x̄2) / (σ √(1/n1 + 1/n2))

23 Student’s t-distribution Tabulated values, based on –Probability (usually choose 0.05 level) –Degrees of freedom If obtained value (ignoring its sign) is larger than tabulated value –Reject the null hypothesis –Conclude that there is a difference between means
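The table look-up can be sketched with a small excerpt of a t-table. The values below are the standard two-tailed critical values at the 0.05 level for df 1 to 10; they come from ordinary statistical tables rather than from the slides, and the names `T_CRIT_05` and `reject_null` are illustrative.

```python
# Two-tailed critical values of t at the 0.05 level, indexed by df.
T_CRIT_05 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}

def reject_null(t_value, df):
    """True if |t| exceeds the tabulated two-tailed 0.05 value for this df."""
    return abs(t_value) > T_CRIT_05[df]

print(reject_null(-2.0, 8))   # |t| = 2.0 is below 2.306
print(reject_null(2.5, 8))    # |t| = 2.5 is above 2.306
```

In practice a full table or statistical software would cover every df; this excerpt only shows the decision rule.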

24 Parametric and non-parametric statistics Parametric –Data on interval or ratio scale –Populations from which samples are drawn are normally distributed –Equal standard deviations –Observations are independent of each other –‘Powerful’ Non-parametric –Ordinal or nominal data –Not based on population parameters –No assumptions made about frequency distributions of variables –Less ‘powerful’

25 Is Student’s t-test a parametric or non-parametric test?

26 Opportunity for general discussion on statistics and probability

27 What you should have gained from this session Why statistics? The idea of ‘probability’ The value of the normal distribution A simple test - the t-test Parametric and non-parametric tests

