Presentation on theme: "Partly based on material by Sherry O’Sullivan"— Presentation transcript:
1 Research Methods, Part 4: t-Statistics. Partly based on material by Sherry O’Sullivan.
2 Revision. General terms: population, sample, parameter, statistic. Measures of central tendency: mean, median, mode. Measures of spread: range, inter-quartile range, variance, standard deviation (population and sample).
3 Revision of notation. Numbers describing a population are called parameters; notation uses Greek letters. Population mean = μ. Population standard deviation = σ. Numbers describing a sample are called statistics; notation uses ordinary letters. Sample mean = x̄. Sample standard deviation = s.
4 Revision: Z-Scores. A specific method for describing a specific location within a distribution. Used to determine the precise location of an individual score within the distribution. Used to compare the relative positions of 2 or more scores.
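As a rough illustration of how a z-score places a score within its distribution, here is a minimal Python sketch. The function name `z_score` and the marks used are hypothetical and are not taken from the slides.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd


# Hypothetical marks: 70 on a test with mean 60 and SD 5,
# and 80 on a different test with mean 70 and SD 20.
z_a = z_score(70, 60, 5)
z_b = z_score(80, 70, 20)
print(z_a, z_b)
```

Although the second raw mark is higher, the first has the larger z-score, so it sits further above its own distribution's mean. This is the kind of comparison of relative positions that the slide describes.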
5 Revision: Standard Deviation. Measures the spread of scores within the data set. Population standard deviation is used when you are only interested in your own data. Sample standard deviation is used when you want to generalise from your sample to the rest of the population.
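The two kinds of standard deviation can be compared with Python's standard `statistics` module. This is only a sketch, and the four scores are illustrative data.

```python
import statistics

scores = [2, 4, 6, 8]  # illustrative data

# Population SD: divides the sum of squared deviations by N.
population_sd = statistics.pstdev(scores)

# Sample SD: divides the sum of squared deviations by n - 1.
sample_sd = statistics.stdev(scores)

print(f"population SD = {population_sd:.3f}, sample SD = {sample_sd:.3f}")
```

The sample version is always slightly larger, because dividing by n - 1 compensates for a sample tending to underestimate the spread of the population it came from.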
7 Normal distribution. Many data sets follow a Normal distribution. It is defined mathematically by its mean and standard deviation. Many statistical tests assume that data follows the Normal distribution. Strictly, you can’t use these tests unless you can show that your data follows a Normal distribution.
8 Other possible distributions. Poisson distribution: for very rare events, e.g. number of BSCs (blue screen crashes) per hour of computer use. Mean is small, often less than 1. Mode and median are often zero. Binomial distribution: similar in shape to the Normal distribution when the number of trials is large, but a discrete distribution (as opposed to a continuous distribution). There are lots of others…
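To see why the mode and median of a Poisson distribution are often zero when the mean is small, the probabilities can be computed directly. This is a sketch only; the mean of 0.5 crashes per hour is a made-up value, not a figure from the slides.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the mean number of events is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


lam = 0.5  # hypothetical mean number of crashes per hour
probs = [poisson_pmf(k, lam) for k in range(6)]

for k, p in enumerate(probs):
    print(f"P({k} crashes) = {p:.4f}")
```

With this mean, more than half of all hours have zero crashes, so zero is both the most common value (the mode) and the middle value (the median).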
10 Distribution of the sample means (simple example). Frequency distribution of 4 scores (2, 4, 6, 8). [Figure: frequency plot of the four scores, X axis from 1 to 9.] Distribution looks flat and not bell shaped (actually not enough data to decide what the distribution might be). Mean of population is (2 + 4 + 6 + 8)/4 = 5. It is clear that this distribution is not normal (it’s flat and not “bell-shaped”).
11 Distribution of the sample means. Take all possible samples of two scores. Calculate the average for each sample: (2+2)/2 = 2, (2+4)/2 = 3, (2+6)/2 = 4, (2+8)/2 = 5; (4+2)/2 = 3, (4+4)/2 = 4, (4+6)/2 = 5, (4+8)/2 = 6; (6+2)/2 = 4, (6+4)/2 = 5, (6+6)/2 = 6, (6+8)/2 = 7; (8+2)/2 = 5, (8+4)/2 = 6, (8+6)/2 = 7, (8+8)/2 = 8. Now let’s take all possible samples with n = 2, in other words all possible pairs of scores. We also agree to use random sampling with replacement, where each selected score is put back into the data set before the next one is drawn. We compute the averages of all sample pairs. So, for example, we get average(2, 4) = 3 and average(4, 2) = 3. We get average(2, 2) = 2 and so on. [Figure: frequency plot of the 16 sample means.]
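The enumeration above can be reproduced in a few lines of Python. This is a sketch using the standard library; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

population = [2, 4, 6, 8]

# All ordered samples of size 2 drawn with replacement: 4 * 4 = 16 samples.
sample_means = [(a + b) / 2 for a, b in product(population, repeat=2)]

# Count how often each sample mean occurs and print a text histogram.
frequency = Counter(sample_means)
for mean in sorted(frequency):
    print(f"{mean:.0f}: {'X' * frequency[mean]}")

mean_of_sample_means = sum(sample_means) / len(sample_means)
```

The histogram peaks at 5 and tails off symmetrically towards 2 and 8, and the mean of the 16 sample means equals the population mean of 5, even though the original four scores were flat.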
12 Central Limit Theorem. “For any population with a mean μ and standard deviation σ, the distribution of sample means for sample size n will have a mean of μ and standard deviation of σ/√n and will approach a normal distribution as n gets very large.” How big should the sample size be? n = 30. [Figure: frequency plot, X axis from 1 to 9.] So it looks as though, through the process of sampling, we are able to discover the population mean. This is a very important result, and is the bed-rock of statistical analysis. It also has a mathematical description, which is summarized in the “Central Limit Theorem”. Colin’s notes: So this means that the z-test approach is applicable to data which is not normally distributed, provided that we take samples and calculate their means, and the sample size is big enough. How big? Well, n = 30 will usually do.
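The theorem's claims about the mean and standard deviation of the sample means can be checked with a small simulation. This is a sketch under stated assumptions: the flat population 2, 4, 6, 8, the 5000 repetitions and the fixed random seed are choices made for illustration only.

```python
import math
import random
import statistics

population = [2, 4, 6, 8]
mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation

n = 30               # sample size suggested on the slide
num_samples = 5000   # number of repeated samples (arbitrary choice)
rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw many samples of size n with replacement and record each sample mean.
sample_means = [
    statistics.mean(rng.choices(population, k=n)) for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
predicted_sd = sigma / math.sqrt(n)

print(f"mean of sample means = {mean_of_means:.3f} (population mean = {mu})")
print(f"SD of sample means   = {sd_of_means:.3f} (sigma/sqrt(n) = {predicted_sd:.3f})")
```

The simulated values should land close to μ and σ/√n, and a histogram of `sample_means` would look roughly bell-shaped even though the population itself is flat.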
13 Standard Error. σ/√n is used to calculate the Standard Error of the sample mean. Sample data = x. The mean of each sample = x̄. Then the standard error becomes σ_x̄ = σ/√n. It identifies how much the observed sample mean x̄ is likely to differ from the unmeasurable population mean μ. So to be more confident that our sample mean is a good measure of the population mean, the standard error should be small. One way we can ensure this is to take large samples (large n).
14 Example. The population of SAT scores is normal with μ = 500, σ = 100. What is the chance that a sample of n = 25 students has a mean score of 540 or more? Since the distribution is normal, we can use the z-score. First calculate the Standard Error: σ/√n = 100/√25 = 100/5 = 20. Then the z-score: z = (x̄ − μ)/(σ/√n) = (540 − 500)/20 = 2. The z-value is 2, therefore around 98% of the sample means are below this and only about 2% are above. So we conclude that the chance of getting a sample mean of 540 or more is about 2%, so we are about 98% confident that this sample mean (if recorded in an experiment) is not due to random variation, but that the 25 students are (on average) brighter than average.
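The arithmetic in this example can be checked with Python's standard library, which offers a Normal distribution through `statistics.NormalDist`. This is a sketch; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma = 500, 100   # population mean and SD from the example
n = 25                 # sample size
sample_mean = 540

standard_error = sigma / math.sqrt(n)
z = (sample_mean - mu) / standard_error

# Upper-tail probability: chance of a sample mean at or above 540.
p_at_or_above = 1 - NormalDist().cdf(z)

print(f"SE = {standard_error}, z = {z}, P(mean >= 540) = {p_at_or_above:.4f}")
```

The upper-tail probability comes out at roughly 0.023, which matches the slide's "about 2%".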
15 t-Statistics. So far we’ve looked at the mean and SD of populations, and our calculations have used population parameters. But how do we deduce something about the population using our sample? We can use the t-statistic.
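16 t-Statistics. Remember SD from last week? σ = √(SS/N), where SS is the sum of squared deviations from the mean. Great for a population of N, but not for a sample of n, where we use s = √(SS/(n − 1)). Why n − 1? Because once the sample mean is fixed, we can only freely choose n − 1 of the scores (degrees of freedom = df). Show example.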
17 t-Statistics. Standard Error: s_x̄ = s/√n. The t-statistic is the z-score redone using the above: t = (x̄ − μ)/(s/√n). For the t-statistic, we substitute σ (SD of population) with s (SD of sample). But what about μ? An example…
18 Hypothesis Testing. Sample of computer game players, n = 16. Intervention = inclusion of rich graphical elements. The level has 2 rooms: Room A = lots of visuals; Room B = very bland. Put them in the level for 60 minutes. Record how long they spend in B.
19 Results. Average time spent in B = 39 minutes. Observed “sum of squares” for the sample is SS = 540. H0: Here we formulate the “null” hypothesis, that the visuals have no effect on the behaviour. H1: Here we formulate the “alternative” hypothesis, that the visuals do have an effect on the players’ behaviour.
20 Stage 1: Formulation of Hypothesis. H0: “null hypothesis”, that the visuals have no effect on the behaviour. H1: “alternative hypothesis”, that the visuals do have an effect on the players’ behaviour. If visuals have no effect, how long on average should they be in room B? The null hypothesis is crucial, since it helps us to “get rid” of the unknown population parameter μ. Think about it. If visuals have no effect on the population, then what is the average time the player will spend in room B? Clearly half the time, so we have inferred from the null hypothesis that μ = 30.
21 Stage 2: Locate the critical region. We use the t-table to help us locate this, enabling us to reject or fail to reject the null hypothesis. To get the critical value we need: the number of degrees of freedom, df = n − 1 = 15; and a chosen significance level, α = 0.05 (95% confidence). Locate in the t-table (2 tails): critical value of t = 2.131.
22 Stage 3: Calculate statistics. Calculate sample SD: s = √(SS/(n − 1)) = √(540/15) = 6. Sample Standard Error = s/√n = 6/4 = 1.5. The μ = 30 came from the null hypothesis: if visuals had no effect, then the player would spend 30 minutes in each of rooms A and B. t-statistic: t = (x̄ − μ)/(s/√n) = (39 − 30)/1.5 = 6.
23 Stage 4: Decision. Can we reject H0, that the visuals have no effect on the behaviour? t = 6, which is well beyond the critical value of 2.131, which indicates where chance kicks in. So “yes”, we can safely reject it and say the visuals do affect behaviour. Which room do they prefer? They spent on average 39 minutes in Room B, which is bland.
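The four stages of this worked example can be sketched in Python using only the summary figures given on the slides. The critical value 2.131 is typed in from the t-table rather than computed, since the standard library has no t distribution. The variable names are illustrative.

```python
import math

# Summary figures from the game-player example.
n = 16                  # players in the sample
sample_mean = 39.0      # average minutes spent in Room B
sum_of_squares = 540.0  # SS for the sample
mu_null = 30.0          # expected minutes in Room B if H0 is true
critical_t = 2.131      # t-table value: two tails, alpha = 0.05, df = 15

df = n - 1
sample_sd = math.sqrt(sum_of_squares / df)    # s = sqrt(SS / (n - 1))
standard_error = sample_sd / math.sqrt(n)     # s / sqrt(n)
t = (sample_mean - mu_null) / standard_error  # t-statistic

# Two-tailed decision: reject H0 if t falls in either tail beyond the critical value.
reject_h0 = abs(t) > critical_t

print(f"df = {df}, s = {sample_sd}, SE = {standard_error}, t = {t}, reject H0: {reject_h0}")
```

The computed values agree with the slides: s = 6, standard error = 1.5 and t = 6, so H0 is rejected at the 0.05 level.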
24 Another use of t (part 1). Our example was a comparison of what was observed with what was expected. Our analysis gave a confidence with which the observations were different from the expected. Note: it cannot be used to confirm similarity… Another use of t: comparison of two samples, e.g. male and female performance on a game, or opinion of a website…
25 Another use of t (part 2). In this case, we have two sample means, and we are testing for a difference. Recall: before, we had t = (x̄ − μ)/(s/√n). This time, it gets messy because the two standard deviations might not be the same, and we finish up with a more involved expression for t.
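The transcript does not preserve the two-sample formula. One common choice when the two standard deviations may differ is Welch's t-statistic, t = (x̄₁ − x̄₂)/√(s₁²/n₁ + s₂²/n₂). The sketch below assumes that form, which may not be the one shown on the original slide; the function name `welch_t` and the data are made up for illustration.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t-statistic for two independent samples with possibly unequal SDs."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance uses the n - 1 divisor (sample variance).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    standard_error = math.sqrt(var_a / n_a + var_b / n_b)
    return (mean_a - mean_b) / standard_error


# Made-up game scores for two groups of five players.
group_a = [12, 15, 11, 14, 13]
group_b = [9, 10, 8, 12, 11]

t_two_sample = welch_t(group_a, group_b)
print(f"t = {t_two_sample:.3f}")
```

The resulting t would then be compared with a critical value from the t-table, as in the one-sample case. For Welch's test the degrees of freedom come from a separate approximation, which this sketch does not compute.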
26 Yet another complication. When looking for a difference, we may have no reason to suppose that one sample gives higher values than the other: we don’t know which mean might be higher. This is a two-tailed test: we test both tails of the distribution. If we have a good reason to suppose that one set of results has to be higher than the other, e.g. game scores before and after a practice session, then we have a one-tailed test.
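The practical difference between the two kinds of test is where the cut-off sits. The sketch below uses the standard Normal distribution as a large-sample stand-in, because Python's standard library has no t distribution; for small samples the cut-offs should come from a t-table instead. The test statistic of 1.8 is a hypothetical value.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()

# Two-tailed: alpha is split between both tails, so each tail holds alpha / 2.
two_tailed_cutoff = std_normal.inv_cdf(1 - alpha / 2)

# One-tailed: all of alpha sits in the single tail we care about.
one_tailed_cutoff = std_normal.inv_cdf(1 - alpha)

statistic = 1.8  # hypothetical test statistic

reject_two_tailed = abs(statistic) > two_tailed_cutoff
reject_one_tailed = statistic > one_tailed_cutoff

print(f"two-tailed cut-off = {two_tailed_cutoff:.3f}, reject: {reject_two_tailed}")
print(f"one-tailed cut-off = {one_tailed_cutoff:.3f}, reject: {reject_one_tailed}")
```

The one-tailed cut-off is lower, so the same statistic can be significant in a one-tailed test but not in a two-tailed one. This is why the direction has to be justified before looking at the data.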