Chi-Square Test (χ²)

The chi-square test is used to decide whether the difference between observed and expected data is statistically significant or could simply be due to chance.
In biology, we use this test often to check whether our results are significant. In a population lab, for example, we would test whether two species found in the same habitat are associated with each other.

Quadrat Sampling. This method is only suitable for plants and other organisms that are not motile. Use random numbers to determine the position of each quadrat along the length and width of your study area. If the presence or absence of more than one species is recorded in every quadrat during sampling of a habitat, it is possible to test for an association between species.

A quadrat is a wire frame shaped into a square of known size, such as 10 × 10 meters (100 m²).
If you want to know the population size of two plant species, take random samples of this area by throwing down the quadrat and recording the population numbers in each subunit of the quadrat.
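Random placement of quadrats can be sketched in a few lines of Python. This is only an illustration: the function name, the rectangular study area and the seed are assumptions, not part of the original method.

```python
import random

def random_quadrat_positions(n, area_width, area_length, quadrat_side, seed=None):
    """Return n random (x, y) positions for the lower-left corner of a quadrat.

    Assumes a rectangular study area; positions are drawn so that the
    whole quadrat stays inside the area.
    """
    rng = random.Random(seed)
    positions = []
    for _ in range(n):
        x = rng.uniform(0, area_width - quadrat_side)
        y = rng.uniform(0, area_length - quadrat_side)
        positions.append((x, y))
    return positions

# Hypothetical 100 m x 100 m study area sampled with five 10 m quadrats
positions = random_quadrat_positions(5, 100.0, 100.0, 10.0, seed=1)
for x, y in positions:
    print(f"place quadrat at x = {x:.1f} m, y = {y:.1f} m")
```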

This is a grid of 100 quadrats, each 10 m on a side. Dr.Rodrigue Environmental Systems

Count how many individuals there are inside the quadrat of the plant population being studied. Repeat steps 2 and 3 as many times as possible. Measure the total size of the area occupied by the population, in square meters. Survey of native and exotic species

Calculate the mean number of plants per quadrat. Then calculate the estimated population size using the following equation:

Population size = (mean number per quadrat × total area) / area of each quadrat
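As a quick check of the population size equation, here is a minimal Python sketch. The counts and areas are made-up numbers for illustration only, and the function name is hypothetical.

```python
def estimate_population(counts, quadrat_area, total_area):
    """Estimated population = mean count per quadrat x total area / quadrat area."""
    mean_per_quadrat = sum(counts) / len(counts)
    return mean_per_quadrat * total_area / quadrat_area

# Made-up data: five quadrats of 1 m2 each, in a 500 m2 habitat
counts = [3, 5, 2, 6, 4]
estimate = estimate_population(counts, quadrat_area=1.0, total_area=500.0)
print(estimate)
```

With a mean of 4 plants per quadrat, the estimate is 4 × 500 / 1 = 2000 plants.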

Populations are often unevenly distributed because some parts of the habitat are more suitable for a species than others. If two species occur in the same parts of a habitat, they will tend to be found in the same quadrats. This is known as a positive association.

There are two hypotheses:
H0 (the null hypothesis): the two species are distributed independently.
H1: the two species are associated (either positively, so they tend to occur together, or negatively, so they tend to occur apart).
We can test these hypotheses using a statistical procedure, the chi-square test.

Method for Chi Square

Draw up a contingency table of observed frequencies.

                    Species A present   Species A absent   Row totals
Species B present
Species B absent
Column totals

Calculate the row and column totals.
Adding up the row totals and adding up the column totals should each give the same grand total, which goes in the lower right cell.

Calculate the expected frequencies, assuming independent distribution, for each of the four species combinations. Each expected frequency is calculated from values in the contingency table using this equation:

Expected frequency = (row total × column total) / grand total
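The expected-frequency calculation can be sketched in Python as follows. The observed counts are hypothetical quadrat data, invented here for illustration, and the function name is an assumption.

```python
def expected_frequencies(observed):
    """Expected count for each cell = row total x column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical counts of quadrats:
# rows = species B present / absent, columns = species A present / absent
observed = [[20, 10],
            [5, 15]]
expected = expected_frequencies(observed)
print(expected)
```

Here the row totals are 30 and 20, the column totals are 25 and 25, and the grand total is 50, so the expected counts are 15, 15, 10 and 10.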

Calculate the degrees of freedom using this equation: DF = (m − 1)(n − 1),
where m and n are the number of rows and columns in the contingency table.

Find the critical region for chi-squared from a table of chi-square values, using the degrees of freedom that you calculated and a significance level (p) of 0.05 (5%).
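A small Python sketch of this lookup is shown below. The dictionary holds the standard critical values for p = 0.05 for the first few degrees of freedom; the function names are hypothetical, and a full chi-square table would be needed for larger contingency tables.

```python
# Critical values of chi-squared at p = 0.05, keyed by degrees of freedom
CRITICAL_VALUES_P05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def degrees_of_freedom(rows, cols):
    """DF = (m - 1)(n - 1) for a table with m rows and n columns."""
    return (rows - 1) * (cols - 1)

def is_significant(chi_squared, rows, cols):
    """True if chi-squared falls in the critical region at p = 0.05."""
    df = degrees_of_freedom(rows, cols)
    return chi_squared > CRITICAL_VALUES_P05[df]

print(degrees_of_freedom(2, 2))   # a 2x2 table has 1 degree of freedom
print(is_significant(6.6, 2, 2))  # 6.6 is above 3.841
print(is_significant(2.0, 2, 2))  # 2.0 is below 3.841
```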

Calculate chi-squared using this equation:

$\chi^2 = \sum \frac{(o - e)^2}{e}$

where o is the observed frequency, e is the expected frequency, and Σ means "the sum of".
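The chi-squared sum can be written as a short Python function. The observed and expected tables below are hypothetical values chosen for illustration, and the function name is an assumption.

```python
def chi_squared(observed, expected):
    """Sum of (o - e)^2 / e over every cell of the contingency table."""
    total = 0.0
    for o_row, e_row in zip(observed, expected):
        for o, e in zip(o_row, e_row):
            total += (o - e) ** 2 / e
    return total

# Hypothetical 2x2 tables (not from the slides)
observed = [[20, 10], [5, 15]]
expected = [[15, 15], [10, 10]]
stat = chi_squared(observed, expected)
print(round(stat, 2))
```

Each cell differs from its expected value by 5, so the sum is 25/15 + 25/15 + 25/10 + 25/10, which is about 8.33.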

What is statistically significant?
H0, the null hypothesis, states that there is no relationship between the two variables. H1 states that there is a relationship. The usual procedure is to test the null hypothesis with the expectation of showing that it is false. If you say that the results were statistically significant, it means that if the null hypothesis were true, the probability of getting results as extreme as the observed results would be very small.

Example. In a certain town, there are about one million eligible voters. A simple random sample of 10,000 eligible voters was chosen to study the relationship between sex and participation in the last election. The results are summarized in the following 2x2 contingency table:

               Men     Women
Voted          2792    3591
Didn't vote    1486    2131

We want to check whether being a man or a woman (columns) is independent of having voted in the last election (rows). In other words, are sex and voting independent?
Null hypothesis: sex is independent of voting.
Alternative hypothesis: sex and voting are dependent.

We now need to complete our contingency table with the row and column totals.

               Men     Women   Total
Voted          2792    3591    6383
Didn't vote    1486    2131    3617
Total          4278    5722    10000

Expected Table

               Men     Women   Total
Voted          2731    3652    6383
Didn't vote    1547    2070    3617
Total          4278    5722    10000

Remember: expected frequency = (row total × column total) / grand total
So:
6383 × 4278 / 10000 = 2731
6383 × 5722 / 10000 = 3652

Now we have the observed table and the expected table under the null hypothesis of independence. Next we need to compute $\chi^2 = \sum \frac{(o - e)^2}{e}$.

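So:
(2792 − 2731)² / 2731 = 1.36
(3591 − 3652)² / 3652 = 1.02
(1486 − 1547)² / 1547 = 2.41
(2131 − 2070)² / 2070 = 1.80
χ² = 1.36 + 1.02 + 2.41 + 1.80 = 6.59 ≈ 6.6
Degrees of freedom = (2 − 1)(2 − 1) = 1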
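The hand calculation for the voting example can be reproduced with a short Python sketch. It starts from the observed counts in the example and rounds the expected frequencies to whole numbers, as the slides do; the variable names are assumptions.

```python
# Observed counts from the voting example
observed = [[2792, 3591],   # voted: men, women
            [1486, 2131]]   # didn't vote: men, women

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = row total x column total / grand total,
# rounded to whole numbers to match the expected table
expected = [[round(r * c / grand_total) for c in col_totals]
            for r in row_totals]

# Chi-squared = sum of (o - e)^2 / e over all four cells
chi_sq = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected)
             for o, e in zip(o_row, e_row))

# Degrees of freedom = (rows - 1)(columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)
print(round(chi_sq, 2), df)
```

If the expected frequencies are left unrounded, the statistic comes out slightly higher, at about 6.66. The conclusion of the test is the same either way.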

Since χ² is 6.6, which is greater than the critical value of 3.84 for 1 degree of freedom at p = 0.05 (the p-value is about 1%), we reject the null hypothesis. The data support the hypothesis that sex and voting are dependent in this town.
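The p-value of about 1% can be checked without a printed table. The sketch below relies on the fact that, for 1 degree of freedom only, the chi-square tail probability equals erfc(√(x/2)); the function name is hypothetical and the shortcut does not apply to larger tables.

```python
import math

def p_value_one_df(chi_squared):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom.

    Valid for df = 1 only: a chi-square variable with 1 df is the square
    of a standard normal variable, so the tail is erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_squared / 2))

p = p_value_one_df(6.6)
print(f"p = {p:.4f}")
print("reject H0 at the 5% level" if p < 0.05 else "do not reject H0")
```

For the voting example this gives a p-value just above 0.01, well below the 0.05 significance level.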