Published by Opal Harper. Modified over 8 years ago.
1
Homework #1 is due now Today: T-test and Outliers xkcd.com
3
Stats practice in next lab.
Also need to start putting together your group for Inquiry 2... 3-5 people/group.
Inquiry 1 written and oral reports are due in lab Th 9/24 or M 9/28.
Homework #2 is posted...
Stream Open House
Homework #3 will ask you to complete some online safety training.
Online evaluation
More TA office hours
4
How significant of a difference is this?
Set 1 = 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7
Mean = 3.67 ± 1.6; range (mean ± 1 standard deviation) = 2.07 to 5.27
Set 2 = 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5
Mean = 7.25 ± 1.48; range (mean ± 1 standard deviation) = 5.77 to 8.73
The ranges do not overlap, so the means might be different.
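The means and ± values on this slide can be reproduced with a short sketch. It assumes the ± values are sample standard deviations (n - 1 denominator), which matches the numbers shown; the helper name `summarize` is made up for this illustration. The slide rounds the first standard deviation to 1.6 before computing its range, so the printed range may differ slightly in the second decimal.

```python
import statistics

set1 = [2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7]
set2 = [8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5]


def summarize(values):
    """Return (mean, sample SD, mean - SD, mean + SD) for a list of numbers."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation, n - 1 denominator
    return mean, sd, mean - sd, mean + sd


for name, values in (("Set 1", set1), ("Set 2", set2)):
    mean, sd, low, high = summarize(values)
    print(f"{name}: mean = {mean:.2f} ± {sd:.2f}, range = {low:.2f} to {high:.2f}")
```

If the upper end of one range is below the lower end of the other, the ranges do not overlap.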
5
Student's t-test is a method for assigning a numerical value to a statistical difference.
6
$$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_{X_1}^2}{n_1} + \frac{S_{X_2}^2}{n_2}}}$$
7
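Student's t-test is a method for assigning a numerical value to a statistical difference.
$$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_{X_1}^2}{n_1} + \frac{S_{X_2}^2}{n_2}}}$$
Numerator: the difference between means. S^2: the variance of each set. n: the sample size of each set.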
8
Student's t-test is a method for assigning a numerical value to a statistical difference.
$$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_{X_1}^2}{n_1} + \frac{S_{X_2}^2}{n_2}}}$$
T is then used to look up the P-value from a table. The lookup also needs the 'degrees of freedom' = (n_1 + n_2) - 2.
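As a rough sketch, T can be computed directly for the two example sets. This reads the formula as the difference between means divided by the square root of the sum of each variance over its sample size, and assumes sample variances (n - 1 denominator). The function name `t_statistic` is invented for this example.

```python
import math
import statistics

set1 = [2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7]
set2 = [8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5]


def t_statistic(a, b):
    """T = (mean_a - mean_b) / sqrt(var_a / n_a + var_b / n_b)."""
    standard_error = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


t = t_statistic(set1, set2)
print(f"T = {t:.2f}")
```

The sign of T only reflects which set is listed first; the table lookup uses its absolute value.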
9
Partial table for determining P from T

        P-value
Df      0.05     0.02     0.01
1       12.71    31.82    63.66
2       4.303    6.965    9.925
3       3.182    4.541    5.841
4       2.776    3.747    4.604
5       2.571    3.365    4.032
6       2.447    3.143    3.707
7       2.365    2.998    3.499
8       2.306    2.896    3.355
9       2.262    2.821    3.250
10      2.228    2.764    3.169

Table entries are values of T.
10
How significant of a difference is this?
Using a spreadsheet to get a P value = 3.44 × 10^-6.
Set 1 = 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7
Mean = 3.67 ± 1.6
Set 2 = 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5
Mean = 7.25 ± 1.48
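For anyone without a spreadsheet handy, here is a standard-library sketch of how a two-tailed P-value could be obtained from T. Two things are assumptions, not something the slides specify: the degrees of freedom use the Welch-Satterthwaite approximation for unequal variances, and the tail area is found by numerically integrating the t-distribution density with Simpson's rule. All function names are invented for this example.

```python
import math
import statistics

set1 = [2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7]
set2 = [8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5]


def welch_t_and_df(a, b):
    """Return T and the Welch-Satterthwaite degrees of freedom (an assumption)."""
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def t_pdf(x, df):
    """Probability density of Student's t-distribution at x."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """P = 1 - 2 * (area under the density from 0 to |t|), via Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


t, df = welch_t_and_df(set1, set2)
p = two_tailed_p(t, df)
print(f"T = {t:.2f}, df = {df:.1f}, P = {p:.2e}")
```

The result should be of the same order as the spreadsheet value, though the exact digits depend on which t-test variant the spreadsheet used.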
11
How significant of a difference is this?
P value = 3.44 × 10^-6.
So if the 2 sets of data really had the same mean, the chance of seeing a difference at least this large by random sampling alone would be 3.44 × 10^-6.
Set 1 = 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7
Mean = 3.67 ± 1.6
Set 2 = 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5
Mean = 7.25 ± 1.48
12
How significant of a difference is this?
P value = 3.44 × 10^-6.
So the confidence level is 1 - 3.44 × 10^-6, or 0.999996559.
The difference is statistically significant at the 99.9996559% confidence level.
Set 1 = 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7
Mean = 3.67 ± 1.6
Set 2 = 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 9, 5
Mean = 7.25 ± 1.48
13
Overlap, different means, but might not be a statistically significant difference
Set 1 = 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7
Mean = 3.67 ± 1.6; range = 2.07 to 5.27
Set 2 = 8, 6, 7, 8, 9, 5, 6, 7, 9, 8, 4, 5
Mean = 6.83 ± 1.64; range = 5.19 to 8.47
P-value = 4.41 × 10^-5
14
What is, or is not, a statistically significant difference?
20% random difference : 80% confidence
10% random difference : 90% confidence
5% random difference : 95% confidence
1% random difference : 99% confidence
0.1% random difference : 99.9% confidence
15
Generally, a P-value of 0.05 or less is considered a statistically significant difference.
20% random difference : 80% confidence
10% random difference : 90% confidence
5% random difference : 95% confidence
1% random difference : 99% confidence
0.1% random difference : 99.9% confidence
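The rule of thumb above can be written as a tiny sketch. The function names and the default cutoff argument `alpha` are chosen for this example; 0.05 is the conventional threshold named on the slide.

```python
def confidence_percent(p_value):
    """Convert a P-value into the matching confidence level, in percent."""
    return (1 - p_value) * 100


def is_significant(p_value, alpha=0.05):
    """True when the P-value is at or below the chosen cutoff."""
    return p_value <= alpha


for p in (0.20, 0.10, 0.05, 0.01, 0.001):
    verdict = "significant" if is_significant(p) else "not significant"
    print(f"P = {p}: {confidence_percent(p):.1f}% confidence, {verdict}")
```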
16
Standard deviation is NOT a valid method for determining statistical significance.
The t-test is one valid and accurate method for determining whether 2 means have a statistically significant difference, or whether the difference is merely due to chance.
17
Outliers…
2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7, 121, 130
Median = 4
Mean = 18
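A quick sketch shows how strongly the two large values pull the mean while barely moving the median. The variable names are made up for this example.

```python
import statistics

without_outliers = [2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 7]
with_outliers = without_outliers + [121, 130]

for label, data in (("Without 121 and 130", without_outliers),
                    ("With 121 and 130", with_outliers)):
    print(f"{label}: median = {statistics.median(data)}, "
          f"mean = {statistics.mean(data):.2f}")
```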
18
Outliers: When is data invalid?
19
Not simply when you want it to be.
20
Outliers: When is data invalid? Not simply when you want it to be. Dixon’s Q test can determine if a value is statistically an outlier.
21
$$Q = \frac{|\text{suspect value} - \text{nearest value}|}{|\text{largest value} - \text{smallest value}|}$$
22
Dixon’s Q test can determine if a value is statistically an outlier.
$$Q = \frac{|\text{suspect value} - \text{nearest value}|}{|\text{largest value} - \text{smallest value}|}$$
Example: results from a blood test… 789, 700, 772, 766, 777
24
Dixon’s Q test can determine if a value is statistically an outlier.
$$Q = \frac{|\text{suspect value} - \text{nearest value}|}{|\text{largest value} - \text{smallest value}|}$$
Example: results from a blood test… 789, 700, 772, 766, 777
Q = |700 - 766| ÷ |789 - 700|
25
Dixon’s Q test can determine if a value is statistically an outlier.
$$Q = \frac{|\text{suspect value} - \text{nearest value}|}{|\text{largest value} - \text{smallest value}|}$$
Example: results from a blood test… 789, 700, 772, 766, 777
Q = |700 - 766| ÷ |789 - 700| = 0.742
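The same calculation as a minimal sketch. The function name `dixon_q` is invented here, and the sketch assumes the suspect value is either the smallest or the largest value in the set, as in the blood test example.

```python
def dixon_q(values, suspect):
    """Q = |suspect - nearest value| / |largest value - smallest value|."""
    ordered = sorted(values)
    smallest, largest = ordered[0], ordered[-1]
    if suspect == smallest:
        nearest = ordered[1]   # next value up from the low end
    elif suspect == largest:
        nearest = ordered[-2]  # next value down from the high end
    else:
        raise ValueError("suspect must be the smallest or largest value")
    return abs(suspect - nearest) / abs(largest - smallest)


blood_test = [789, 700, 772, 766, 777]
q_calc = dixon_q(blood_test, 700)
print(f"Q = {q_calc:.3f}")
```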
26
Dixon’s Q test can determine if a value is statistically an outlier.
$$Q = \frac{|\text{suspect value} - \text{nearest value}|}{|\text{largest value} - \text{smallest value}|}$$
Example: results from a blood test… 789, 700, 772, 766, 777
Q = |700 - 766| ÷ |789 - 700| = 0.742
So?
27
You need the critical values for Q table:

Sample #    Q critical value
3           0.970
4           0.831
5           0.717
6           0.621
7           0.568
10          0.466
12          0.426
15          0.384
20          0.342
25          0.317
30          0.298

If Q calc > Q crit: rejected
From: E.P. King, J. Am. Statist. Assoc. 48: 531 (1958)
28
You need the critical values for Q table:
If Q calc > Q crit, then the outlier can be rejected.
Q calc = 0.742
Q crit = 0.717 (sample # = 5)
Q calc > Q crit = rejection
From: E.P. King, J. Am. Statist. Assoc. 48: 531 (1958)

Sample #    Q critical value
3           0.970
4           0.831
5           0.717
6           0.621
7           0.568
10          0.466
12          0.426
15          0.384
20          0.342
25          0.317
30          0.298
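The lookup and comparison can be sketched as follows. The critical values are copied from the table on this slide; the names `Q_CRIT` and `can_reject` are made up for this example, and sample sizes missing from the table are refused rather than interpolated.

```python
# Critical values of Q by sample size, copied from the slide's table.
Q_CRIT = {
    3: 0.970, 4: 0.831, 5: 0.717, 6: 0.621, 7: 0.568, 10: 0.466,
    12: 0.426, 15: 0.384, 20: 0.342, 25: 0.317, 30: 0.298,
}


def can_reject(q_calc, sample_size):
    """True when Q calc exceeds Q crit for this sample size."""
    if sample_size not in Q_CRIT:
        raise KeyError(f"no critical value listed for sample size {sample_size}")
    return q_calc > Q_CRIT[sample_size]


blood_test = [789, 700, 772, 766, 777]
q_calc = abs(700 - 766) / abs(789 - 700)  # suspect 700, nearest value 766
print(f"Q calc = {q_calc:.3f}, Q crit = {Q_CRIT[len(blood_test)]}, "
      f"reject = {can_reject(q_calc, len(blood_test))}")
```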
29
What can outliers tell us?
30
If you made a mistake, you should have already accounted for that.
31
Outliers can lead to important and fascinating discoveries. Transposons “jumping genes” were discovered because they did not fit known modes of inheritance.
32
Homework #1 is due now and homework #2 is posted Next: R², Samples, and Populations xkcd.com