CS130 – Software Tools Fall 2010 Statistics and PASW Wrap-up 1.


1 CS130 – Software Tools, Fall 2010: Statistics and PASW Wrap-up

2 T-Test
• Tests the difference between the means of two samples
• If those samples are taken from the same population, you would anticipate that their means would be largely equal
• In words, this simple test checks whether the difference between the means observed in the two samples is consistent with what we would EXPECT from two samples of the same population
• That is, whether the difference falls within the amount of standard error that you might expect between any two samples
• Remember: the test assumes the data is taken from a normally distributed population
Source: geography.dur.ac.uk

3 T-Test
The key concept here is that PASW tells you whether or not the difference between the means of the two conditions or groups is large enough that it is unlikely to have occurred by chance.
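The idea of a difference being too large to be "by chance" can be sketched without any distribution tables. The example below uses an exact permutation test, which is a swapped-in technique and not the t-test that PASW runs: it relabels the pooled scores in every possible way and counts how often the relabeled difference in means is at least as large as the observed one. The function name and the scores are made up for illustration.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """Two-sided p-value: the share of all possible relabelings whose
    absolute difference in means is at least as large as the observed one."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))

    extreme = 0
    count = 0
    # Every way of choosing which pooled scores are labeled "group A".
    for idx in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in idx)
        diff = abs(a_sum / n_a - (total - a_sum) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / count


# Hypothetical scores for two conditions (illustration only).
condition_1 = [1, 2, 3, 4]
condition_2 = [5, 6, 7, 8]
print(exact_permutation_p_value(condition_1, condition_2))
```

A small result says that few relabelings produce a gap this large, which is the sense in which the observed difference is unlikely to be due to chance alone.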

4 Types of t-Tests
• All t-tests have the principle of comparison of means as their basis
• This explains why the PASW menu item for all t-tests is called Compare Means
• There are several variants of t-tests, as you have already learned
  • Independent
  • Paired or Dependent
  • One-sample
• There are also several "assumption" tests that check whether the sample data is suitable for a parametric test such as a t-test, e.g. Levene's Test, which evaluates the equality of variances; we used this for our independent t-test
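As a rough sketch of how the three variants differ, the textbook t statistics can be written out with the Python standard library. This only computes the statistic, not the Sig. value PASW reports, and the independent version assumes equal variances (the condition Levene's Test checks). The function names and sample data are hypothetical.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t statistic for testing whether the sample mean differs from mu0."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - mu0) / standard_error


def paired_t(before, after):
    """Paired (dependent) t statistic: a one-sample test on the differences."""
    differences = [a - b for a, b in zip(after, before)]
    return one_sample_t(differences, 0.0)


def independent_t(x, y):
    """Independent-samples t statistic using a pooled variance
    (assumes the two groups have equal variances)."""
    nx, ny = len(x), len(y)
    pooled_variance = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    standard_error = math.sqrt(pooled_variance * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / standard_error


# Made-up scores, for illustration only.
print(one_sample_t([5.1, 4.9, 5.6, 5.8, 6.0], 5.0))
print(paired_t(before=[60, 72, 65, 70], after=[66, 75, 64, 78]))
print(independent_t([2, 4, 6], [1, 3, 5]))
```

All three divide a difference in means by a standard error; what changes is which difference and which standard error are used.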

5 Speaking of P-Values
• You were introduced to p-values, reported by PASW as Sig. (2-tailed), as a method for deciding whether to reject or fail to reject the null hypothesis
• However, before we wrap up the course, you should be aware of their general-purpose nature
• P-values are compared against a threshold sometimes called α (alpha)
• We have been using 0.05

6 Speaking of P-Values
• It is important to note that the design of the study controls the alpha; we have been using 0.05 because it is common, but it can be a value based on what you are trying to do
• The smaller the p-value, the more evidence there is against the hypothesis (in this case our null hypothesis)
• If you want an even stronger case before rejecting, you could insist on a threshold of 0.01, so that a result this extreme would occur by chance less than 1% of the time if the null hypothesis were true
• However…
• All p-values pertain to the probability of seeing a difference in means this large by chance alone, assuming the null hypothesis is true
• A p-value has nothing to do with, and knows nothing about, the nature of your hypothesis

7 Speaking of P-Values
• The Prosecutor's Fallacy (Shaughnessy and Chance, 2005) is the mistaken claim: "The p-value is .001. This means that the chance is only 1 in 1000 that the null hypothesis is true"
• It is the data in the sample that carries the probability, not the interpretation
• That variable data is then interpreted within the context of the hypothesis
• The hypothesis is a statement of how we might expect to see the data, based on the samples that we have collected

8 A classic example
• You take 1 random coin out of your bank
• You want to test the fairness of this one coin
• You flip it 10 times in a row and you get heads every time
• Null Hypothesis: The coin is fair, and it flips honestly and independently
• Observed data: In 10 tries, all are heads
• Now calculate the p-value
• P(10H in 10) = P(H) x P(H) x … x P(H) = (1/2)^10 = 1/1024 ≈ .001
• This is strong evidence against the null hypothesis, so it can be rejected
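The coin calculation above can be checked with a few lines of Python. This is a minimal sketch using exact fractions; the function name is hypothetical, and it computes the one-sided probability of at least k heads under the null hypothesis of a fair coin.

```python
from fractions import Fraction
from math import comb


def prob_at_least_k_heads(n, k, p_heads=Fraction(1, 2)):
    """P(at least k heads in n independent flips) when P(heads) = p_heads."""
    return sum(
        comb(n, i) * p_heads**i * (1 - p_heads) ** (n - i)
        for i in range(k, n + 1)
    )


p_value = prob_at_least_k_heads(10, 10)
print(p_value, float(p_value))  # 1/1024 0.0009765625
```

A two-sided version that also counts ten tails as equally extreme would double this to 2/1024, which is still far below 0.05.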

9 Introduction to Analysis of Variance
• And finally, a brief introduction to another major family of statistical tests that compares an attribute of a variable; this time we will work with the variance rather than comparing the means directly
• This is ANOVA, or Analysis of Variance
• It's here that we answer the age-old question (at least a 7-week-course-old question)
• What happens if I want to compare several independent variables to see how they interact with each other?

10 Introduction to Analysis of Variance
• As with the t-test, there are many kinds of ANOVA methods: Factorial ANOVA, MANOVA, ANCOVA, and so on
• For this intro, we will just look at what you need to know to decide whether you should invest time in understanding this method
• The simplest ANOVA, for example, might compare the effects of caffeine on learning by using a placebo (Decaf…wow, that is mean) and a specific level of caffeinated beverage

11 Introduction to Analysis of Variance
• How about adding more groups as independent variables, though? For example, the effect of caffeine and weight on learning, with the control being a placebo. Now you start to leave the domain of a t-test
• Analysis of Variance is just what it says: a comparison of the total variance of the data, the variance of the data within each group, and the variance of the data across the groups (in our case caffeine, placebo, and weight as independent variables, and maybe test score as an indicator of learning)

12 Introduction to Analysis of Variance
• A few terms to remember… ANOVA uses the F-ratio to compare the variances
• A high F-ratio means that there is more "planned" variance than "unplanned" variance, or error
• And again, it has a Significance value, just like our t-tests
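To make the F-ratio concrete, here is a minimal sketch of the one-way case: the "planned" variance is the mean square between groups, and the "unplanned" variance is the mean square within groups. The function name and the learning scores are made up for illustration, and the Significance value (which needs the F distribution) is not computed here.

```python
from statistics import mean


def one_way_f_ratio(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)

    # Variation of the group means around the grand mean ("planned").
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the scores around their own group mean ("unplanned", error).
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical test scores for three beverage conditions (illustration only).
placebo = [1, 2, 3]
low_caffeine = [2, 3, 4]
high_caffeine = [3, 4, 5]
print(one_way_f_ratio([placebo, low_caffeine, high_caffeine]))
```

When the group means sit far apart relative to the spread inside each group, the numerator grows and the F-ratio rises.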

13 Introduction to Analysis of Variance
• One example to consider
• I have created a research question… I am interested to see if job satisfaction and gender have any influence on what type of car a person might buy
• My two independent factors or variables are job satisfaction and gender; my dependent variable is car category
• My null hypothesis is that there is no significant relationship between the type of car I buy and my relative job satisfaction and gender

14 Introduction to Analysis of Variance
• Of course in PASW, there is no menu pick for this factor-based ANOVA; they call it the General Linear Model (GLM) with Univariate. Of course!!
• Or I could use a One-Way ANOVA, which is found under Compare Means, but that does not allow for two independent variables
• My data was given to me in the form of a .sav file

15 Introduction to Analysis of Variance
• Of course in PASW, there is no menu pick for this factor-based ANOVA; they call it the General Linear Model (GLM) with Univariate. Of course!!

16 Introduction to Analysis of Variance
• The results show that, in fact, there is a high degree of "similarity" in the variance between the groups of independent variables
• I see this in the F-ratios
• I also see a very low Sig. value throughout for car category, which means there is a very low probability that the variance in the data is due to chance alone
• Therefore, I can reject my null hypothesis and say that there is a statistically significant relationship between my gender, job satisfaction, and the type of car I might purchase

17 Introduction to Analysis of Variance
• One final note on the introduction
• This is meant to give you an additional pathway to investigate when you have a statistical project and the design of the experiment is slightly more complex
• You will need a fair amount of study to understand the details and proper use of ANOVA and its variants (no pun intended there)

18 CS130 Conclusion
• So, this concludes our CS130 section for the Fall
• You have covered a myriad of topics and tools
  • Excel
  • Equation Editor
  • Word: Templates, Styles, Merge
  • PowerPoint: Presenting and Information Visualization (Tufte, Klass)
  • PASW and Statistics
• All in the context of Academic Research and Design of Experiments
• You should feel armed and ready to take on interesting scholarly questions and present your important work

