Numeracy & Quantitative Methods: Level 7 – Advanced Quantitative Analysis.

1 Numeracy & Quantitative Methods: Level 7 – Advanced Quantitative Analysis

2 Introduction
The focus of this module is to progress your statistical knowledge beyond descriptive statistics. It deals with inferential statistics, which are used to draw conclusions from sample data and generalise them to the population.
Inferential statistics fall into two families: parametric and non-parametric.

3 Hypotheses
Drawing conclusions from your data and generalising them to the wider population involves testing hypotheses.
What is a hypothesis? A statement or proposal about relationships between variables, or differences between groups, that can be tested.
Example: socio-economic group is associated with occupation.

4 Hypotheses
Logic of hypothesis testing: assessing whether the collected data/observations reflect a real effect or are due to statistical chance.
Null hypothesis (H0): the position of no difference; any observed differences are due to chance.
Alternative hypothesis (H1): the prediction that there is a difference. It is the opposite of H0 and holds that observed differences in the data can be generalised to the population.

5 Errors in hypothesis testing
Type I: incorrect rejection of H0 (H0 is true but is rejected).
Type II: incorrect acceptance of H0 (H0 is false but is not rejected).
Minimise Type I errors by:
- Setting a stricter significance level (0.01 rather than 0.05)
Minimise Type II errors by:
- Collecting more data
- Relaxing the significance level (from 0.01 to 0.05)
- Stating a direction for the hypothesis (predict the direction of the difference rather than only that a difference exists)
- Changing the measurement tool
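The link between the significance level and the Type I error rate can be illustrated with a small simulation. The sketch below is not part of the original slides: it assumes a two-sided z-test on samples drawn from a standard normal population in which H0 (mean = 0) is true, and the function name `type_i_error_rate` and its parameters are hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def type_i_error_rate(alpha, n=30, trials=5_000, seed=1):
    """Share of true-H0 samples that a two-sided z-test wrongly rejects."""
    rng = random.Random(seed)
    # Critical z value for the chosen significance level (about 1.96 for 0.05).
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # H0 is true by construction: population mean 0, known SD 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = statistics.fmean(sample) / (1 / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1  # rejecting a true H0 is a Type I error
    return rejections / trials


rate_05 = type_i_error_rate(alpha=0.05)
rate_01 = type_i_error_rate(alpha=0.01)
print(f"alpha=0.05: {rate_05:.3f}  alpha=0.01: {rate_01:.3f}")
```

Under these assumptions the observed rejection rate should sit close to the chosen significance level, so the stricter level (0.01) produces fewer Type I errors, at the cost of making a Type II error more likely.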

6 Types of statistical tests
Inferential statistical tests can be divided broadly into 3 main groups, those that:
1. test whether observed differences between two sets of scores are statistically significant
2. test the association between two sets of scores or variables
3. compare more than two sets of scores

7 Types of statistical tests
The test used will depend on the data type (nominal, ordinal, scale). Tests are classified as:
- Parametric (scale data that are normally distributed)
- Non-parametric (nominal/ordinal data, and/or data that break the assumption of normal distribution)

8 Types of statistical tests: parametric
Parametric tests require scale (interval/ratio) data. In addition:
- The sample must be representative of the target population so that the variables being measured fall within the normal distribution for that population
- The variables must have been measured in a manner that generates interval or ratio data
- The subjects in the two groups being examined need to be either randomly assigned to each group, or the groups must be matched according to the respondents' age, sex, etc.

9 Types of statistical tests: non-parametric
When the conditions for parametric tests are not met, non-parametric tests are used. This is the case where:
- The sample is not considered representative of the population and the variables selected are probably not normally distributed (i.e. random selection has not occurred)
- The variables have been measured in a way that generates categorical or ordinal data

10 Normal distribution
Choosing which inferential statistic to use depends on understanding how the variables to be measured are distributed. This means understanding:
- the normal distribution
- its relationship with the principles of the central limit theorem
Central limit theorem: when repeated random samples are taken from a population, the distribution of the sample means calculated for each sample will be approximately normal, provided the samples are sufficiently large.

11 The Central Limit Theorem
Important assumptions:
- When a variable is normally distributed in the population, the sampling distribution of the mean will be normal.
- When the population distribution is not normal, the Central Limit Theorem states that as sample size increases the sampling distribution of the mean approaches normal.
- If repeated (large) samples are taken, the mean of the sample means will approximately equal the population mean.
- The standard error of a sampling distribution (the standard deviation of the sample means) is usually unknown, but it can be estimated from the standard deviation of a single sample divided by the square root of the sample size.
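These properties can be checked with a short simulation. The sketch below is an illustration rather than part of the original slides: it assumes a deliberately skewed (exponential) population, a sample size of 50 and 2,000 repeated samples, and the helper name `sample_means` is hypothetical.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples (with replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# A non-normal population: exponential values are strongly right-skewed.
pop_rng = random.Random(42)
population = [pop_rng.expovariate(1.0) for _ in range(100_000)]

means = sample_means(population, sample_size=50, n_samples=2_000)

pop_mean = statistics.fmean(population)
mean_of_means = statistics.fmean(means)

# Standard error: observed spread of the sample means versus SD / sqrt(n).
se_observed = statistics.stdev(means)
se_theory = statistics.pstdev(population) / 50 ** 0.5

print(f"population mean {pop_mean:.3f}, mean of sample means {mean_of_means:.3f}")
print(f"observed SE {se_observed:.3f}, SD/sqrt(n) {se_theory:.3f}")
```

Even though the population itself is far from normal, the mean of the sample means should land close to the population mean, and the spread of the sample means should be close to the population standard deviation divided by the square root of the sample size.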

12 Standard deviation
A further feature of the central limit theorem is that if the mean is calculated from all the sample means, its value will approximately equal the population mean.
We can therefore also calculate a standard deviation of the sample means: how far, on average, each value is from the mean.

13 Standard deviation
The standard deviation of the sample means is the standard error (SE). The smaller the SE of the mean, the better the sample mean is as an estimate of the population mean.
In practice, we are unlikely to know the population SD. However, because the distribution of the sample means approximates a normal distribution curve, we can use the properties of the normal distribution to calculate the likely range within which the population mean will fall, based on the sample mean: the confidence interval.

14 Confidence intervals
The level of confidence is a measure of how statistically confident we are that the calculated measurement is correct. It is normally expressed as a percentage, often 95%.
A property of the normal distribution curve is that 95% of the area under the curve, i.e. 95% of cases, falls within ±1.96 standard deviations of the mean.
Knowing our sample mean, we then calculate the range around it within which we would expect the population mean to fall, with a level of confidence of 95%.
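The 1.96 figure can be recovered directly from the standard normal distribution. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# A central 95% leaves 2.5% in each tail, so look up the 97.5th percentile.
z_95 = std_normal.inv_cdf(0.975)

# Conversely, the area under the curve between -1.96 and +1.96.
area = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

print(f"z for 95% confidence: {z_95:.3f}")
print(f"area within +/-1.96 SD: {area:.4f}")
```

The same lookup with a different percentile gives the multiplier for other confidence levels, for example `inv_cdf(0.995)` for 99%.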

15 Confidence intervals
The level of significance is 100% minus the level of confidence, for example 5%. It is expressed statistically as p: p < 0.05 (less than 5%) or p > 0.05 (more than 5%).
The confidence interval gives the upper and lower limits of the range of values within which we would expect an unknown population parameter to lie, at a stated level of confidence.

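16 Confidence intervals
Confidence interval equation:
$$\bar{x} \pm 1.96 \times \frac{s}{\sqrt{n}}$$
where $\bar{x}$ is the sample mean, $s$ is the standard deviation and $n$ is the count.
Further summarised:
$$\bar{x} \pm 1.96 \times SE(\bar{x})$$
As the sample mean is the best available estimate of the true population mean, we use the SE of the mean, $SE(\bar{x}) = s / \sqrt{n}$. Calculate the confidence interval at 95% by multiplying $SE(\bar{x})$ by +1.96 and -1.96 and adding each result to the sample mean.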

17 Confidence intervals
Worked example: estimating the average price of oranges from a sample of 150 orange prices with a mean price of 32.5 pence and a standard deviation of 5.5 pence.
SE = 5.5 / √150 = 0.449 pence
Using the sample mean as the best estimate of the true population mean, we can calculate the confidence interval at 95%:
32.5 ± 1.96 × 0.449 = 32.5 ± 0.88, giving (31.62, 33.38) at the 95% level.
So we are 95% confident that the population average price of oranges lies between 31.62 pence and 33.38 pence.
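The orange-price calculation can be reproduced in a few lines. This is a sketch rather than part of the original slides: the function name `confidence_interval` and its `z` parameter are hypothetical, and the multiplier is fixed at the slides' value of 1.96 for the 95% level.

```python
import math


def confidence_interval(mean, sd, n, z=1.96):
    """Return (standard error, lower limit, upper limit) for a sample mean."""
    se = sd / math.sqrt(n)  # standard error of the mean
    margin = z * se         # half-width of the interval
    return se, mean - margin, mean + margin


# Figures from the worked example: 150 prices, mean 32.5p, SD 5.5p.
se, lower, upper = confidence_interval(mean=32.5, sd=5.5, n=150)

print(f"SE = {se:.3f} pence")
print(f"95% CI = ({lower:.2f}, {upper:.2f}) pence")
```

Passing a different `z` (for example 2.576 for 99%) widens the interval, which shows the trade-off between confidence and precision.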

18 Properties of the normal distribution
The shape of the normal distribution is determined by the mean and standard deviation. Many scale variables approximate to the normal distribution curve.
We can calculate areas under the curve and use them to estimate the proportion of cases above or below a stated point.
For example, with a mean age of 55 years and a standard deviation of 6 years, we can estimate that 95.4% of cases will fall within ±2 standard deviations of the mean, i.e. between 43 years and 67 years.
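The age example can be checked by computing areas under the curve. The sketch below uses Python's standard-library `statistics.NormalDist`; the cut-off of 60 years is a hypothetical "stated point" added for illustration and does not come from the slides.

```python
from statistics import NormalDist

# Normal distribution from the example: mean age 55 years, SD 6 years.
age = NormalDist(mu=55, sigma=6)

# Proportion of cases within +/-2 SD of the mean (43 to 67 years).
within_2sd = age.cdf(67) - age.cdf(43)

# Proportion of cases above a stated point (60 years, chosen arbitrarily).
over_60 = 1 - age.cdf(60)

print(f"between 43 and 67 years: {within_2sd:.1%}")
print(f"older than 60 years: {over_60:.1%}")
```

`cdf(x)` gives the area under the curve to the left of `x`, so subtracting two values gives the proportion between them and `1 - cdf(x)` gives the proportion above.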

19 References
Fielding, J. and Gilbert, N. (2006) Understanding social statistics. 2nd ed. London: Sage.
David, M. and Sutton, C. (2011) Social Research: An Introduction. 2nd ed. London: Sage.

20 This resource was created by the University of Plymouth, Learning from WOeRk project. This project is funded by HEFCE as part of the HEA/JISC OER release programme.
This resource is licensed under the terms of the Attribution-Non-Commercial-Share Alike 2.0 UK: England & Wales license (http://creativecommons.org/licenses/by-nc-sa/2.0/uk/).
The resource, where specified below, contains other 3rd party materials under their own licenses. The licenses and attributions are outlined below:
1. The name of the University of Plymouth and its logos are unregistered trade marks of the University. The University reserves all rights to these items beyond their inclusion in these CC resources.
2. The JISC logo, the and the logo of the Higher Education Academy are licensed under the terms of the Creative Commons Attribution-Non-Commercial-No Derivative Works 2.0 UK: England & Wales license. All reproductions must comply with the terms of that license.
Author: Laura Lake
Institute: University of Plymouth
Title: Advanced Quantitative Analysis
Description: Introduction to Statistical Inference & Hypothesis Testing
Date Created: July 2011
Educational Level: Postgraduate (Level 7)
Keywords: Parametric, non parametric, null and alternative hypothesis, normal distribution, central limit theorem, standard deviation, confidence intervals, UKOER, LFWOER, CPD, Learning from WOeRK, UOPCPDRM, Continuous professional development, Quantitative HEA, JISC, HEFCE
Back page originally developed by the OER phase 1 C-Change project
©University of Plymouth, 2010, some rights reserved

