
1 STATISTICS Advanced Higher Chi-squared test

2 Advanced Higher STATISTICS Chi-squared test
Finding if there is a significant association between sets of data.
Lesson Objectives
1. Explain why it is used.
2. List the advantages and disadvantages.
3. Understand how to apply the statistical test.
4. Apply it to a relevant context.

3 Advanced Higher STATISTICS Chi-squared test: looking for a difference
The situation: A group of students has visited the Lake District National Park to investigate the impact of tourism upon the landscape. One of their data collection techniques is to record the number of traditional and modern-looking houses in 8 villages inside the National Park boundary and 8 villages outside the boundary.
What should they do?
What data should they have collected to complete this investigation?
How much data should they collect?
How can they make sure that the data is reliable?
What initial data representation skill could they use to form an initial impression?
What statistical test should they use to confidently state that there is or is not a relationship?

4 Chi-squared test: looking for a difference
- What did you observe? (What data did you actually collect?)
- What would you expect if there was no association?
O = the Observed frequency (what you actually counted)
E = the Expected frequency (what you would expect if there was no association)
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Traditional houses: We found 180 traditional homes inside the National Park and 23 outside.
Modern houses: We found 103 modern homes inside the National Park and 452 outside.

5 Null Hypothesis: There is no significant difference between building ages inside and outside of the National Park.
Alternative Hypothesis: There is a significant difference between building ages inside and outside of the National Park.
TESTING THE RELATIONSHIP
$\chi^2 = \sum \frac{(O - E)^2}{E}$

1st: construct a table with the data that you have observed.

OBSERVED FREQUENCIES
                      Inside the national park   Outside the national park   Row total
Traditional houses    180                        23                          203
Modern houses         103                        452                         555
Column total          283                        475                         758

2nd: work out the expected frequency.
Expected frequency = (row total x column total) / grand total

EXPECTED FREQUENCIES
                      Inside the national park   Outside the national park   Row total
Traditional houses    75.8                       127.2                       203
Modern houses         207.2                      347.8                       555
Column total          283                        475                         758
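The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original lesson: the function names `expected_frequencies` and `chi_squared` and the dictionary layout are assumptions chosen for readability, while the four observed counts are the ones reported in the slides.

```python
# Observed counts from the slides: rows are house types, columns are locations.
observed = {
    "Traditional houses": {"Inside": 180, "Outside": 23},
    "Modern houses": {"Inside": 103, "Outside": 452},
}


def expected_frequencies(table):
    """Expected count per cell = row total x column total / grand total."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    grand_total = sum(row_totals.values())
    return {
        row: {col: row_totals[row] * col_totals[col] / grand_total for col in cells}
        for row, cells in table.items()
    }


def chi_squared(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_frequencies(table)
    return sum(
        (table[row][col] - expected[row][col]) ** 2 / expected[row][col]
        for row in table
        for col in table[row]
    )


expected = expected_frequencies(observed)
chi2 = chi_squared(observed)

for row, cells in expected.items():
    print(row, {col: round(value, 2) for col, value in cells.items()})
print("chi-squared =", round(chi2, 2))
```

Each row and column of the expected table should add up to the same totals as the observed table, which is a quick check that the arithmetic has been done correctly.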

6 Null Hypothesis: There is no significant difference between building ages inside and outside of the National Park.
Alternative Hypothesis: There is a significant difference between building ages inside and outside of the National Park.
TESTING THE RELATIONSHIP
CALCULATE THE DEGREES OF FREEDOM: (number of rows - 1) x (number of columns - 1)

CRITICAL VALUES
Degrees of freedom    0.05 (95%)    0.01 (99%)
1                     3.84          6.64
2                     5.99          9.21
3                     7.82          11.34
4                     9.49          13.28
5                     11.08         15.09
6                     12.59         16.81
7                     14.07         18.48
8                     15.51         20.09
9                     16.92         21.67
10                    18.31         23.21
11                    19.68         24.72
12                    21.03         26.22
13                    22.36         27.69
14                    23.68         29.14
15                    25.00         30.58
16                    26.30         32.00
17                    27.59         33.41
18                    28.87         34.80
19                    30.14         36.19
20                    31.41         37.57
30                    43.77         50.89

FINAL STATEMENT
IF $\chi^2$ IS HIGHER THAN OR EQUAL TO THE CRITICAL VALUE, REJECT THE NULL HYPOTHESIS AND ACCEPT THE ALTERNATIVE.
As $\chi^2$ is (greater than / less than) the Critical Value I can (accept / reject) the Null Hypothesis and (accept / reject) the Alternative Hypothesis. Therefore I can state that there (is no / is a) significant association… …to a significance level of 0.05 (95% sure results have not occurred by chance).
Chi-squared value of ____ is higher than 3.84 and 6.64 so…
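The decision rule on this slide can be written as a short Python sketch. The helper names `degrees_of_freedom` and `final_statement` are made up for illustration, only the first four rows of the critical value table are copied in, and the chi-squared value of 4.2 in the usage lines is a hypothetical number chosen to show both outcomes, not a result from the survey.

```python
# Critical values copied from the slide's table (first four rows only).
CRITICAL_VALUES = {
    1: {0.05: 3.84, 0.01: 6.64},
    2: {0.05: 5.99, 0.01: 9.21},
    3: {0.05: 7.82, 0.01: 11.34},
    4: {0.05: 9.49, 0.01: 13.28},
}


def degrees_of_freedom(n_rows, n_cols):
    """(number of rows - 1) x (number of columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def final_statement(chi2, n_rows, n_cols, level=0.05):
    """Compare chi-squared with the critical value and word the conclusion."""
    df = degrees_of_freedom(n_rows, n_cols)
    critical = CRITICAL_VALUES[df][level]
    if chi2 >= critical:
        decision = "reject the null hypothesis and accept the alternative"
    else:
        decision = "accept the null hypothesis"
    return (
        f"chi-squared {chi2} vs critical value {critical} "
        f"(df={df}, level={level}): {decision}"
    )


# Hypothetical chi-squared value for a table with 2 rows and 2 columns.
print(final_statement(4.2, 2, 2))        # compared with 3.84
print(final_statement(4.2, 2, 2, 0.01))  # compared with 6.64
```

With a hypothetical value of 4.2 the null hypothesis is rejected at the 0.05 level but not at the 0.01 level, which shows why the significance level should be stated in the final sentence.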

7 Reasons to use it
It allows you to identify if there is a difference or a relationship between two characteristics.
It is simple to carry out.
It compares the data that you have observed with what you would expect to happen.
Disadvantages of using it
The data must be in the form of frequencies.
The frequency of the data must have a precise numerical value and be able to be organised into categories or groups.
The total number of observations must be more than 20.
The expected frequency in any one cell of the table must be more than 5.
State the answer in terms of the alternative hypothesis.
There is a significant association between housing age inside and outside of the Lake District National Park.
Referring to a National Park that you have studied, comment on the results shown in this test.
Sometimes buildings are built recently but designed to look old.
The survey may have included unused farm buildings as traditional, although they are not necessarily used as homes.
It is uncertain how the survey determined what was modern or traditional.
The survey indicates that villages inside of the Park are smaller. Perhaps there is a static village size and new buildings aren't being built.
Justify the suitability of using the chi-squared test.
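The two numerical conditions listed under the disadvantages (more than 20 observations in total, and an expected frequency of more than 5 in every cell) can be checked before running the test. The sketch below is one possible way to do this; the function name `check_conditions` and the small second table are hypothetical and only there to show a failing case.

```python
def check_conditions(observed_rows):
    """Return a list of problems; an empty list means both conditions are met."""
    problems = []
    row_totals = [sum(row) for row in observed_rows]
    col_totals = [sum(col) for col in zip(*observed_rows)]
    grand_total = sum(row_totals)

    # Condition 1: the total number of observations must be more than 20.
    if grand_total <= 20:
        problems.append(f"only {grand_total} observations in total")

    # Condition 2: every expected frequency must be more than 5.
    for i, row_total in enumerate(row_totals):
        for j, col_total in enumerate(col_totals):
            expected = row_total * col_total / grand_total
            if expected <= 5:
                problems.append(
                    f"expected frequency {expected:.2f} in row {i + 1}, column {j + 1}"
                )
    return problems


# Observed counts from the slides (traditional and modern, inside and outside).
lake_district = [[180, 23], [103, 452]]
print(check_conditions(lake_district))

# Hypothetical small survey that breaks both conditions.
small_survey = [[3, 2], [4, 6]]
for problem in check_conditions(small_survey):
    print(problem)
```

The housing survey passes both checks, so on these criteria the chi-squared test is a suitable choice for that data.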

8 You compare the observed data with the data that you would expect, looking for a difference between O and E. If the difference is large enough to be significant, then there is an association!
Reason to use this test: you have categorical data (e.g. eye colour). Means are not categories; colours, for example, are.
Must have:
More than one category
A minimum of 5 in each one

