Published by Luke Salton. Modified over 2 years ago.

1
Analysing and Reporting Quantitative Data – Part II
SI0124 Introduction to Social Science Research, Week 4
Luke Sloan

2
Introduction
Last Week – A Recap
Formulating Hypotheses
Social Capital Dataset
Chi-Square Test For Independence

3
Last Week – A Recap
Levels of measurement
Central tendency (a heuristic)
Dispersion (a critical tool)
But these are only ‘descriptives’ of single variables
How do we test for relationships between variables?...

4
Formulating Hypotheses I
“An untested assertion about the relationship between two or more variables. The validity of such an assertion is assessed by examining the extent to which it is, or is not, supported by data generated by empirical enquiry.”
Source: Jupp 2006:137
So what does this actually mean?

5
Formulating Hypotheses II
Related to the ‘Research Question’ (RQ)
But the RQ itself does not offer an approach to researching a phenomenon, it only ‘identifies’ it
Hypotheses allow dissection of larger questions
Research Question: Why do students not progress to Higher Education?
Hypothesis 1 (H1): Participation in Higher Education is related to social class
Hypothesis 2 (H2): Female students are more likely than male students to go into Higher Education
Hypothesis 3 (H3): Lack of parental involvement in Higher Education reduces the likelihood of student progression to Higher Education
But these are useless unless we have the necessary VARIABLES to test them

6
Formulating Hypotheses III
In Social Science we use the ‘Scientific Method’:
  – Formulate hypotheses and identify variables
  – Collect relevant data
  – Test hypotheses
  – Interpret results
To formulate a hypothesis:
  – Reasonable justification for relationship
  – Past research or observation
  – Must be disprovable (Popper’s Falsification Theory)
  – Hypotheses that cannot be disproved are “not even wrong”

7
Formulating Hypotheses IV
So how do we test hypotheses?...
H0 = The Null Hypothesis
  – No relationship exists between two variables
  – e.g. there is no relationship between Higher Education progression and social class
H1 = The Alternative Hypothesis
  – Some relationship exists between two variables
  – e.g. there is a relationship between Higher Education progression and social class
  – Do not be afraid to specify the relationship with further hypotheses
  – It does not matter if you are wrong (in fact, that’s kind of the point)

8
Social Capital Dataset
This is the dataset you will be using in seminars
Few scale variables
  – Most variables in the dataset are categorical
  – The level of measurement is important!
  – This is typical of most datasets
Scale variables are
  – Age (not grouped)
  – Years lived in area (not grouped)
Weighting variable
  – Used to give more weight to groups that are under-represented in the sample relative to the population

9
Chi-Square Test For Independence I
One of many statistical tests that we can use to evaluate and thus reject or accept hypotheses
Can be used to establish whether there are statistically significant relationships between two categorical variables (nominal/ordinal)
e.g. Is there a statistically significant relationship between progression to Higher Education and social class?
In other words, is progression to Higher Education INDEPENDENT of social class or not?

10
Chi-Square Test For Independence II
The chi-square test is effectively a crosstabulation in which differences between the expected and actual values are measured
  – Expected = the distribution of responses if there were no relationship
  – Actual = how the responses are actually distributed
A large discrepancy between the two measures may indicate disproportionality i.e. a statistically significant relationship or dependency between variables
No expected ‘cell counts’ in the crosstabulation should be less than 5 (I’ll explain this as we go)
Use row/column percentages to interpret the table (I’ll show you…)
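
The comparison of expected and actual values described on this slide can be written compactly. The notation is an assumption introduced here for illustration and does not appear on the slides: O_ij is the actual (observed) count in row i and column j, E_ij the expected count, R_i and C_j the row and column totals, N the number of valid cases, and r and c the number of rows and columns.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
df = (r - 1)(c - 1)
```

The larger the gaps between O_ij and E_ij, the larger the statistic becomes, which is the sense in which a "large discrepancy" points towards a relationship.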

11
Chi-Square Test For Independence III
Case Processing Summary

                                                                   Cases
                                                                   Valid            Missing          Total
                                                                   N     Percent    N     Percent    N     Percent
Social Class (employed only) * Education Level - 2000 (3 groups)   6072  73.9%      2149  26.1%      8221  100.0%

Social Class (employed only) * Education Level - 2000 (3 groups): here are the two variables that we are testing for a relationship between
Valid: this is the number of cases that are ‘valid’ i.e. have useable values
Missing: this is the number of cases that are ‘missing’ i.e. data not available
Total: this is the total number of cases in the dataset (including ‘valid’ and ‘missing’)
H0 = There is no relationship between progression to Higher Education and social class
H1 = There is a relationship between progression to Higher Education and social class

12
Chi-Square Test For Independence IV
Social Class (employed only) * Education Level - 2000 (3 groups) Crosstabulation

                                                 Education Level - 2000 (3 groups)
Social Class (employed only)                     HIGHER EDUCAT   OTHER QUAL   NONE     Total
I      Count                                     284             47           7        338
       Expected Count                            108.8           144.6        84.6     338.0
       % within Social Class (employed only)     84.0%           13.9%        2.1%     100.0%
II     Count                                     1062            576          193      1831
       Expected Count                            589.5           783.1        458.4    1831.0
       % within Social Class (employed only)     58.0%           31.5%        10.5%    100.0%
IIIN   Count                                     292             874          268      1434
       Expected Count                            461.7           613.3        359.0    1434.0
       % within Social Class (employed only)     20.4%           60.9%        18.7%    100.0%
IIIM   Count                                     172             543          396      1111
       Expected Count                            357.7           475.2        278.1    1111.0
       % within Social Class (employed only)     15.5%           48.9%        35.6%    100.0%
IV     Count                                     125             439          414      978
       Expected Count                            314.9           418.3        244.8    978.0
       % within Social Class (employed only)     12.8%           44.9%        42.3%    100.0%
V      Count                                     20              118          242      380
       Expected Count                            122.3           162.5        95.1     380.0
       % within Social Class (employed only)     5.3%            31.1%        63.7%    100.0%
Total  Count                                     1955            2597         1520     6072
       Expected Count                            1955.0          2597.0       1520.0   6072.0
       % within Social Class (employed only)     32.2%           42.8%        25.0%    100.0%
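
As a check on where the ‘Expected Count’ rows come from, the following is a minimal Python sketch that recomputes them from the ‘Count’ rows of the crosstabulation above. It assumes the usual rule (row total × column total ÷ number of valid cases); the function name `expected_counts` is made up for this illustration and is not part of SPSS or the slides.

```python
# Observed counts copied from the crosstabulation.
# Rows: social class I, II, IIIN, IIIM, IV, V.
# Columns: HIGHER EDUCAT, OTHER QUAL, NONE.
observed = [
    [284, 47, 7],
    [1062, 576, 193],
    [292, 874, 268],
    [172, 543, 396],
    [125, 439, 414],
    [20, 118, 242],
]


def expected_counts(table):
    """Expected count per cell if the row and column variables were unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Each expected count is (row total * column total) / number of cases.
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for label, row in zip(["I", "II", "IIIN", "IIIM", "IV", "V"], expected):
    print(label, [round(value, 1) for value in row])
```

Comparing the printed rows with the table shows, for example, that far more class I respondents hold a higher education qualification than independence would predict.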

13
Chi-Square Test For Independence V
So what do you think?
Should we accept or reject the ‘null hypothesis’ on the basis of the evidence?
Is there a relationship between progression to Higher Education and social class?
Sometimes tables can be difficult to interpret if they are large or when values only change for particular groups
Lucky for us there is a statistical test that will tell us whether the relationship between the variables is STATISTICALLY SIGNIFICANT
Thus this test can be used to accept or reject hypotheses

14
Chi-Square Test For Independence VI
The trick to interpreting the Chi-Square test (χ²) is to see whether the ‘Asymptotic Significance (2-sided)’ value is greater than, equal to or less than 0.05
  – Greater than 0.05 = not significant
  – Equal to 0.05 = borderline significant (normally considered significant)
  – Less than 0.05 = significant
In effect we are asking the test whether there is a significant association between social class and progression to Higher Education
If the association is not significant then we have no evidence that the variables are related, so we treat them as INDEPENDENT of each other (hence the name ‘Chi-Square test for independence’!)
We refer to the ‘Asymptotic Significance (2-sided)’ as the ‘p-value’
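
The three-way rule on this slide can be sketched as a small Python function. The function name `interpret_p_value` and the `alpha` parameter are assumptions made for this illustration; the 0.05 threshold and the wording of the three outcomes come from the slide.

```python
def interpret_p_value(p_value, alpha=0.05):
    """Apply the slide's rule of thumb to a p-value."""
    if p_value > alpha:
        return "not significant"
    if p_value == alpha:
        return "borderline significant (normally considered significant)"
    return "significant"


# Illustrative p-values only, not results from the dataset.
for p in (0.20, 0.05, 0.001):
    print(p, "->", interpret_p_value(p))
```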

15
Chi-Square Test For Independence VII
Chi-Square Tests

                               Value         df   Asymp. Sig. (2-sided)
Pearson Chi-Square             1915.387 (a)  10   .000
Likelihood Ratio               1884.903      10   .000
Linear-by-Linear Association   1435.088      1    .000
N of Valid Cases               6072

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 84.61.

The p-value (Asymp. Sig. 2-sided) is 0.000 but in reality it is never zero! We therefore present this value as p<0.05 because this is all that matters (p<0.01 is often used in the natural sciences, but you must be consistent!)
There is a statistically significant relationship between progression to Higher Education and social class (χ² = 1915.39, 10 df, p < 0.05), therefore we reject the null hypothesis… [describe the relationship from the crosstabulation]
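
To see why a reported p-value of 0.000 is not really zero, the Pearson statistic can be recomputed by hand from the observed counts in the crosstabulation. The sketch below is a plain-Python illustration, not what SPSS does internally. The helper names are made up, and the p-value uses a closed-form expression for the chi-square upper tail that only holds when the degrees of freedom are even (here df = 10), so it is not a general-purpose routine. It works on the log scale because the p-value is too small to store as an ordinary floating-point number.

```python
import math

# Observed counts: rows are social class I, II, IIIN, IIIM, IV, V;
# columns are HIGHER EDUCAT, OTHER QUAL, NONE.
observed = [
    [284, 47, 7],
    [1062, 576, 193],
    [292, 874, 268],
    [172, 543, 396],
    [125, 439, 414],
    [20, 118, 242],
]


def chi_square(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


def log10_p_even_df(statistic, df):
    """log10 of the upper-tail p-value; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = statistic / 2
    # Upper tail = exp(-half) * sum_{i < df/2} half**i / i!
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return (-half + math.log(series)) / math.log(10)


statistic, df = chi_square(observed)
log10_p = log10_p_even_df(statistic, df)
print(f"chi-square = {statistic:.2f}, df = {df}, log10(p) = {log10_p:.1f}")
```

The logarithm is large and negative, meaning the p-value is far below 0.05 but still greater than zero, which is why the slide reports it as p < 0.05 and not as p = 0.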

16
Chi-Square Test For Independence VIII
It is vital not just to report the statistic, but to discuss the nature of the relationship
  – Use the percentage values in the crosstabulation to do this
  – Whether you want row or column percentages depends on what you want to know – think carefully!
Check that all cells in the crosstabulation have expected counts of 5 or more
Is the ‘Asymptotic Significance (2-sided)’, or p-value, equal to or less than 0.05? i.e. is it significant?
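
The difference between row and column percentages can be illustrated with the observed counts from the crosstabulation. This is a minimal sketch with made-up helper names. The row percentages reproduce the ‘% within Social Class’ figures in the table, while the column percentages are not shown on the slides and are computed here only to show the contrast.

```python
# Observed counts: rows are social class I, II, IIIN, IIIM, IV, V;
# columns are HIGHER EDUCAT, OTHER QUAL, NONE.
observed = [
    [284, 47, 7],
    [1062, 576, 193],
    [292, 874, 268],
    [172, 543, 396],
    [125, 439, 414],
    [20, 118, 242],
]


def row_percentages(table):
    """Each cell as a percentage of its row total (within social class)."""
    return [[100 * cell / sum(row) for cell in row] for row in table]


def column_percentages(table):
    """Each cell as a percentage of its column total (within education level)."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / col_totals[j] for j, cell in enumerate(row)]
            for row in table]


row_pct = row_percentages(observed)
col_pct = column_percentages(observed)
print(f"{row_pct[0][0]:.1f}% of class I respondents have higher education")
print(f"{col_pct[0][0]:.1f}% of respondents with higher education are in class I")
```

The first figure answers "how likely is someone in class I to have higher education?", the second answers "what share of the higher-educated group comes from class I?". Which one to report depends on the hypothesis being tested.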

17
Summary
Hypotheses help us to answer the research question by dissecting it into manageable parts
Hypotheses must be testable (falsifiable) and relate to the variables in your dataset
Different levels of measurement require different statistical tests to check for significance
For two categorical variables (including nominal and ordinal) we can use crosstabulations and the chi-square test for independence to test hypotheses and describe statistical relationships
