Statistical Significance of Data


1 Statistical Significance of Data

2 Box and Whisker Plot
…can be useful when dealing with many data values. Rather than showing all of the data, it displays five statistics. The box-and-whisker plot is a visual representation of the five-number summary, which consists of the:
• Median
• Quartiles (lower and upper)
• Minimum
• Maximum

3 Make a Box and Whisker Plot from these numbers:
1. Put the numbers in numerical order.
2. Find the Minimum (smallest value of the entire set).
3. Find the Maximum (largest value of the entire set).
4. Find the Median (the number in the middle of the ordered set of numbers).
5. Lower Quartile (Q1): find the median of the numbers to the left of the median.
6. Upper Quartile (Q3): find the median of the numbers to the right of the median.
7. Draw a box to represent the IQR (interquartile range) and solve: Q3 – Q1 = IQR
* If you are finding the median of an even set of numbers, find the two middle numbers, add them together, and divide by two to get the median.
[Number line: 10 to 100 in steps of 10]
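The steps above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `five_number_summary` and the sample values are made up for the example, and the quartiles follow the slide's rule (median of each half, with the overall median left out of both halves when the count is odd).

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    ordered = sorted(values)                 # step 1: numerical order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                   # numbers to the left of the median
    upper = ordered[half + n % 2:]           # numbers to the right of the median
    return (ordered[0],                      # step 2: minimum
            median(lower),                   # step 5: Q1
            median(ordered),                 # step 4: median
            median(upper),                   # step 6: Q3
            ordered[-1])                     # step 3: maximum

# Hypothetical data, chosen only to show the even-count case.
minimum, q1, med, q3, maximum = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8])
iqr = q3 - q1                                # step 7: IQR = Q3 - Q1
print(minimum, q1, med, q3, maximum, iqr)    # 1 2.5 4.5 6.5 8 4.0
```

`statistics.median` already averages the two middle numbers when the set has an even count, which is the rule given in the footnote.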

4 Make a Box and Whisker Plot:
1. Put the numbers in numerical order.
2. Find the Minimum (smallest value of the entire set): 18
3. Find the Maximum (largest value of the entire set): 100
4. Find the Median (the number in the middle of the ordered set of numbers): 68
5. Lower Quartile (Q1): median of the numbers to the left of the median: (sum of the two middle values)/2 = 53
6. Upper Quartile (Q3): median of the numbers to the right of the median: (sum of the two middle values)/2 = 86
7. Draw a box to represent the IQR (interquartile range) and solve: Q3 – Q1 = IQR; 86 – 53 = 33
[Number line: 10 to 100 in steps of 10]

5 Make a Box and Whisker Plot: 46 39 10 48 46 45 51 42 49
1. Put the numbers in numerical order: 10 39 42 45 46 46 48 49 51
2. Find the Minimum (smallest value of the entire set): 10
3. Find the Maximum (largest value of the entire set): 51
4. Find the Median (the number in the middle of the ordered set of numbers): 46
5. Lower Quartile (Q1): median of the numbers to the left of the median: (39 + 42)/2 = 40.5
6. Upper Quartile (Q3): median of the numbers to the right of the median: (48 + 49)/2 = 48.5
For the upper and lower quartiles, include the minimum and maximum, but do not include the actual median.
7. Draw a box to represent the IQR (interquartile range) and solve: Q3 – Q1 = IQR; 48.5 – 40.5 = 8
8. Is 10 an outlier that should be ignored? Multiply 1.5 × IQR = 1.5 × 8 = 12; then Q1 – 12 = 40.5 – 12 = 28.5. Since 10 is well below 28.5, it is an outlier.
Is 51 an outlier? Multiply 1.5 × IQR = 1.5 × 8 = 12; then Q3 + 12 = 48.5 + 12 = 60.5. Since 51 is below 60.5, it is NOT an outlier.
So change the minimum to the next smallest number (39) and draw the whisker to it; the outlier 10 is plotted separately (*).
[Number line: 10 to 60 in steps of 10]
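The outlier check for this data set can be reproduced with a short script. It is a sketch that assumes the usual 1.5 × IQR rule (lower fence = Q1 – 1.5 × IQR, upper fence = Q3 + 1.5 × IQR); the variable names are illustrative only.

```python
from statistics import median

data = [46, 39, 10, 48, 46, 45, 51, 42, 49]     # values from the slide
ordered = sorted(data)
half = len(ordered) // 2

q1 = median(ordered[:half])                      # median of the lower half
q3 = median(ordered[half + len(ordered) % 2:])   # median of the upper half
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr                     # anything below this is an outlier
upper_fence = q3 + 1.5 * iqr                     # anything above this is an outlier

outliers = [x for x in ordered if x < lower_fence or x > upper_fence]
inliers = [x for x in ordered if lower_fence <= x <= upper_fence]

# The whiskers end at the most extreme values that are not outliers.
whisker_low, whisker_high = inliers[0], inliers[-1]
print(q1, q3, iqr, lower_fence, upper_fence)     # 40.5 48.5 8.0 28.5 60.5
print(outliers, whisker_low, whisker_high)       # [10] 39 51
```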

6 Dot Plot …can be useful when trying to find patterns, trends, or clusters of data. In some cases it may show a discrepancy in data that could be due to unavoidable error or avoidable/user error.
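A dot plot can be approximated in plain text. The sketch below is only an illustration: the function name `dot_plot` is made up, it reuses the nine values from slide 5, and it lists only the values that occur rather than drawing a full number line.

```python
from collections import Counter

def dot_plot(values, mark="*"):
    """Return a text dot plot: one row per distinct value, one mark per occurrence."""
    counts = Counter(values)
    rows = []
    for value in sorted(counts):
        rows.append(f"{value:>4} | {mark * counts[value]}")
    return "\n".join(rows)

data = [46, 39, 10, 48, 46, 45, 51, 42, 49]   # values from slide 5
print(dot_plot(data))
```

In the printed plot the value 46 gets two marks, and the jump from 10 to 39 makes the isolated low value easy to spot.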

7 Using this data, create a: 1. Box and Whisker Plot 2. Dot Plot

8 Box and Whisker / Dot Plot
Box and Whisker
1) [ordered data not preserved]
2) Median = (sum of the two middle values)/2 = 7 grams of sugar/serving
3) Lowest = 1 gram of sugar/serving
4) Highest = 22 grams of sugar/serving
5) Lower Quartile = (4 + 5)/2 = 4.5 grams of sugar/serving
6) Upper Quartile = (10 + 11)/2 = 10.5 grams of sugar/serving
Dot Plot
[Number line: 5 to 25 in steps of 5]

9 Chi-Squared
Introduction: The Chi Square test (χ²) is often used in science to test whether the data you observe from an experiment match the data you would expect from the experiment. Calculating χ² values allows you to determine whether test results can be attributed to randomness or not. If the data differ greatly and the difference is not due to randomness, other factors must be influencing your results.
Objectives:
• Determine the degrees of freedom (df) for this investigation (category or class number – 1) = n – 1.
• Calculate the χ² value for a given set of data: $\chi^2 = \sum \frac{(\text{observed value} - \text{expected value})^2}{\text{expected value}}$
• Use the Chi Square Table to determine whether the calculated value is equal to or less than the critical value.
• Determine whether the Chi Square value exceeds the critical value and whether the null hypothesis is accepted or rejected.
Biologists generally use a probability value of 0.05 (p = 0.05) in a Chi Square Table. That p-value means that a difference this large would arise from random chance alone fewer than 1 time in 20. Loosely, this is like saying you are 95% certain that the results are not due to random chance.
Degrees of freedom is the number of choices (n) minus 1: df = n – 1
The p-value and the degrees of freedom are used to determine the critical value.
If the chi square value ≤ the critical value, the null hypothesis is accepted as statistically reasonable.
If the chi square value > the critical value, the difference is "statistically significant": the validity of the null hypothesis is in question because the results are "unlikely to have occurred by chance," so the null hypothesis is rejected.
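The calculation and the decision rule can be sketched as follows. The function names and the coin-toss numbers are made up for illustration; the small table holds standard critical values at p = 0.05 (the df = 2 entry is not used in the slides and is included as an assumption from a standard chi-square table).

```python
# Critical values at p = 0.05, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.82}

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def rejects_null(observed, expected):
    """True when the chi-square value exceeds the critical value for df = n - 1."""
    df = len(observed) - 1
    return chi_square(observed, expected) > CRITICAL_05[df]

# Hypothetical example: 100 coin tosses, 60 heads and 40 tails, expecting 50/50.
value = chi_square([60, 40], [50, 50])
print(value, rejects_null([60, 40], [50, 50]))   # 4.0 True
```

With two categories there is 1 degree of freedom, so the value 4.0 is compared against 3.84.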

10 Difference Between Null Hypothesis and Experimental Hypothesis:
The null hypothesis is the hypothesis that the dependent variable in an experiment is not affected by the independent variable. The experimental hypothesis is that the dependent variable is affected by the independent variable.
For example, let's say that you are testing whether playing violent video games affects aggressiveness. Playing or not playing video games is the independent variable, and aggressiveness is the dependent variable.
1) Your experimental hypothesis could be that playing violent video games affects aggressiveness in the test subjects. The null hypothesis is that violent video games do not in any way affect aggressiveness.
2) Alternatively, you could make an experimental hypothesis saying that playing violent video games makes people more aggressive, in which case the null hypothesis would be that it does not make people more aggressive. In this case you'd get a directional null hypothesis, i.e. that there is either no effect or the effect is the opposite of what you expect.
Question: Is there a statistically significant difference in the data?

11 Scenario 1: While reviewing zoo records, a zookeeper notices that the baboon exhibits each average 42 incidents of aggressive behavior a month. He hypothesizes that changing the intensity of the light in the primate exhibit will reduce the amount of aggression between the baboons. In exhibit A, with a lower light intensity, he observes 32 incidents of aggression over a one-month period. In exhibit B, with normal lights, he observes 45 incidents of aggression. Would you accept or reject his experimental hypothesis?
Exhibit A: $(32 - 42)^2 / 42 = 100/42 = 2.38$
Exhibit B: $(45 - 42)^2 / 42 = 9/42 = 0.21$
$\chi^2 = 2.38 + 0.21 = 2.59$
P-value = 0.05
Degrees of Freedom (df) = number of choices – 1 (n is aggressive or not aggressive): 2 – 1 = 1 df
Critical Value = 3.84
Accept or Reject his experimental hypothesis?
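The arithmetic for Scenario 1 can be checked with a few lines of Python. This is only a sketch of the slide's calculation; the variable names are illustrative.

```python
expected = 42                                    # monthly average from the zoo records
observed = {"Exhibit A": 32, "Exhibit B": 45}
critical_value = 3.84                            # p = 0.05, df = 1 (from the slide)

# One (observed - expected)^2 / expected term per exhibit.
terms = {name: (count - expected) ** 2 / expected for name, count in observed.items()}
chi_sq = sum(terms.values())

for name, term in terms.items():
    print(name, round(term, 2))                  # Exhibit A 2.38, Exhibit B 0.21
print("exceeds critical value:", chi_sq > critical_value)   # False
```

The unrounded sum is about 2.595; the slide's 2.59 comes from adding the two rounded terms. Either way the value stays below 3.84.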

12 Scenario 2: A behavioral psychologist notices that gate 4 in the polar bear exhibit is preferred over gates 1, 2 and 3. If there were no impetus for the polar bears to go through that particular gate, then one would expect each gate to be used equally, 25% of the time. However, based on her observations, the polar bears entered gate 1 (9%), gate 2 (20%), gate 3 (25%), and gate 4 (46%). She believes there is something making them select that gate in such great numbers. Would you accept or reject the null hypothesis (that nothing is impacting their decision)?
Gate 1: $(9 - 25)^2 / 25 = 256/25 = 10.24$
Gate 2: $(20 - 25)^2 / 25 = 25/25 = 1.00$
Gate 3: $(25 - 25)^2 / 25 = 0/25 = 0.00$
Gate 4: $(46 - 25)^2 / 25 = 441/25 = 17.64$
$\chi^2 = 10.24 + 1.00 + 0.00 + 17.64 = 28.88$
P-value = 0.05
Degrees of Freedom (df) = number of choices – 1 (n is 4 gates): 4 – 1 = 3 df
Critical Value = 7.82
Accept or Reject the null hypothesis?
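The same check works for Scenario 2. As on the slide, this sketch treats the percentages as if they were counts out of 100 observations, which is an assumption; with a different number of observed entries the chi-square value would scale accordingly.

```python
observed = [9, 20, 25, 46]        # gates 1-4, percentages used as counts out of 100
expected = [25, 25, 25, 25]       # equal use of each gate
critical_value = 7.82             # p = 0.05, df = 3 (from the slide)

terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(terms)
df = len(observed) - 1

for gate, term in enumerate(terms, start=1):
    print(f"Gate {gate}: {term:.2f}")
print(f"chi-square = {chi_sq:.2f}, df = {df}")
print("exceeds critical value:", chi_sq > critical_value)   # True
```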

13 Statistics Worksheet
N: Total number of individuals in a population
n: Total number of individuals in a sample of the population
$x_i$: a single measurement
∑: Summation
$\bar{x}$: sample mean
Mean ($\bar{x}$): (same as the average) add up the values / number of trials: $\bar{x} = \frac{\sum x_i}{n}$
Sum of the Squares (SS): $SS = \sum (x_i - \bar{x})^2$
Variance ($s^2$): $s^2 = \frac{SS}{n - 1}$
Standard Deviation (s): $s = \sqrt{s^2}$
Standard Error of the Mean: $SE = \frac{s}{\sqrt{n}}$
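The worksheet quantities can be computed step by step as below. The formulas on the slide did not survive extraction, so this sketch assumes the standard sample definitions (variance divides the sum of squares by n – 1, and the standard error is s/√n); the function name `describe` and the example numbers are made up.

```python
from math import sqrt

def describe(sample):
    """Return n, mean, sum of squares, sample variance, standard deviation and SE."""
    n = len(sample)
    mean = sum(sample) / n                        # add up the values / number of trials
    ss = sum((x - mean) ** 2 for x in sample)     # sum of squared differences
    variance = ss / (n - 1)                       # sample variance (assumes n - 1)
    sd = sqrt(variance)                           # standard deviation
    se = sd / sqrt(n)                             # standard error of the mean
    return {"n": n, "mean": mean, "SS": ss, "variance": variance, "s": sd, "SE": se}

# Hypothetical measurements, chosen so the arithmetic is easy to follow by hand.
stats = describe([1, 2, 3, 4, 5])
print(stats["mean"], stats["SS"], stats["variance"])   # 3.0 10.0 2.5
```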

14 Data of Non-Survivors
Data on 100 medium ground finches from Peter and Rosemary Grant's 40 years of research in the Galápagos

Sample # | Non-Survivor Beak Depth (mm), $x_i$ measurement | Squared Difference $(x_i - \bar{x}_1)^2$
1 | 7.52 |
2 | 9.31 |
3 | 8.20 |
4 | 8.39 |
5 | 10.50 |

Mean = (total/sample #): 43.92/5 = 8.78
Sum of Squares: SS = xxxx
Variance: $s^2$ = xxxx
Standard deviation: s = xxxx
Standard error of the mean: SE = xxxx
95% CI: $2s/\sqrt{n}$ = xxxx
n = 5 samples
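The blanks in this worksheet can be checked with Python's standard `statistics` module. This is a sketch under two assumptions: the variance is the sample variance (dividing by n – 1), and the 95% CI entry means the half-width 2s/√n shown on the slide.

```python
from math import sqrt
from statistics import mean, stdev, variance

beak_depths = [7.52, 9.31, 8.20, 8.39, 10.50]   # mm, the five values on the slide
n = len(beak_depths)

x_bar = mean(beak_depths)            # 43.92 / 5
s2 = variance(beak_depths)           # sample variance, divides by n - 1
ss = s2 * (n - 1)                    # sum of squares recovered from the variance
s = stdev(beak_depths)               # standard deviation
se = s / sqrt(n)                     # standard error of the mean
ci_half_width = 2 * se               # 2s / sqrt(n), as written on the slide

for label, value in [("mean", x_bar), ("SS", ss), ("variance", s2),
                     ("s", s), ("SE", se), ("95% CI half-width", ci_half_width)]:
    print(f"{label}: {value:.3f}")
```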

15 Data of Non-Survivors
Data on 100 medium ground finches from Peter and Rosemary Grant's 40 years of research in the Galápagos

Sample # | Non-Survivor Beak Depth (mm), $x_i$ measurement | Squared Difference $(x_i - \bar{x}_2)^2$
1 | 9.10 |
2 | 8.80 |
3 | 9.15 |
4 | 11.01 |
5 | 10.86 |

Mean = (total/sample #): 48.92/5 = 9.78
Sum of Squares: SS = xxxx
Variance: $s^2$ = xxxx
Standard deviation: s = xxxx
Standard error of the mean: SE = xxxx
95% CI: $2s/\sqrt{n}$ = xxxx
n = 5 samples

16 t-Test Statistics
The t-test determines the probability (p) that any observed difference between the means of the two samples (i.e. non-survivors and survivors) occurred simply by chance rather than through natural selection.
| | = absolute value, always a positive number
n = 5 birds
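The t formula itself did not survive in this transcript. The sketch below assumes the common unequal-variance form, t = |x̄₁ – x̄₂| / √(s₁²/n₁ + s₂²/n₂), and applies it to the two five-bird samples from slides 14 and 15; treat it as an illustration rather than the worksheet's exact method.

```python
from math import sqrt
from statistics import mean, variance

def t_statistic(sample_1, sample_2):
    """Absolute difference of the means divided by the standard error of that difference."""
    mean_difference = abs(mean(sample_1) - mean(sample_2))
    standard_error = sqrt(variance(sample_1) / len(sample_1)
                          + variance(sample_2) / len(sample_2))
    return mean_difference / standard_error

sample_1 = [7.52, 9.31, 8.20, 8.39, 10.50]    # beak depths (mm) from slide 14
sample_2 = [9.10, 8.80, 9.15, 11.01, 10.86]   # beak depths (mm) from slide 15

t = t_statistic(sample_1, sample_2)
print(f"t = {t:.2f}")
```

The resulting t would then be compared against a critical t value for the chosen p-value and degrees of freedom, in the same way the chi-square value is compared against its critical value.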

