Intro to Statistics and SPSS

1 Intro to Statistics and SPSS

2 Mean (average)
Median – the middle score of the sorted data (with an odd number of scores, the single middle value; with an even number of scores, the average of the two middle values)
Percent Rank (percentile) – the position of a data point within a data set; more precisely, approximately what percentage of the data is less than the data point
Range – the difference between the maximum and minimum values in the data set
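
As a rough illustration of these four measures, here is a minimal Python sketch. The `scores` list is hypothetical (it is not from the slides), and `percent_rank` uses one common convention, the percentage of values strictly below the data point; other tools may define percent rank slightly differently.

```python
from statistics import mean, median


def percent_rank(data, value):
    """Percentage of data points strictly less than `value` (one common convention)."""
    below = sum(1 for x in data if x < value)
    return 100 * below / len(data)


def data_range(data):
    """Difference between the maximum and minimum values."""
    return max(data) - min(data)


scores = [70, 85, 90, 65, 100, 80]  # hypothetical exam scores, not from the slides

print(mean(scores))              # arithmetic average
print(median(scores))            # even count: average of the two middle scores
print(median(scores + [95]))     # odd count: the single middle score
print(percent_rank(scores, 90))  # share of scores below 90, in percent
print(data_range(scores))        # max minus min
```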

3 • Lower quartile (first quartile) – the median of the data values in the lower half of a data set
• Middle quartile (second quartile) – the overall median
• Upper quartile (third quartile) – the median of the data values in the upper half of a data set
• Quartiles may help in seeing the variation in a data set

4 • For example (bank waiting times):
Big Bank:  4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0
Best Bank: 6.6 6.7 6.7 6.9 7.1 7.2 7.3 7.4 7.7 7.8 7.8
[The original slide marks the lower quartile, median, and upper quartile on each list.]
Big Bank range: 11.0 – 4.1 = 6.9
Best Bank range: 7.8 – 6.6 = 1.2

5 • The five number summary consists of:
  ◦ The minimum value
  ◦ The lower quartile (first quartile)
  ◦ The median (second quartile)
  ◦ The upper quartile (third quartile)
  ◦ The maximum value
• In SPSS (formerly marketed as PASW), the output reports the first quartile as the 25th percentile, the second quartile as the 50th percentile, and the third quartile as the 75th percentile
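
The five number summary for the bank waiting times can be sketched in Python as follows. The function `five_number_summary` is an illustrative helper, not an SPSS feature; it uses the "median of each half" definition from the slides and leaves the overall median out of both halves when the count is odd. Quartile conventions differ between tools, so other software may report slightly different quartiles for other data sets. For this data, `statistics.quantiles` (Python 3.8+, default method) gives the same three quartiles.

```python
from statistics import median, quantiles


def five_number_summary(data):
    """Return (min, lower quartile, median, upper quartile, max).

    Quartiles are the medians of the lower and upper halves; when the
    count is odd, the middle value is excluded from both halves.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]
    upper = values[half + 1:] if n % 2 else values[half:]
    return (values[0], median(lower), median(values), median(upper), values[-1])


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

print(five_number_summary(big_bank))   # (4.1, 5.6, 7.2, 8.5, 11.0)
print(five_number_summary(best_bank))  # (6.6, 6.7, 7.2, 7.7, 7.8)
print(quantiles(big_bank, n=4))        # [5.6, 7.2, 8.5]
```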

6 • Quartiles are OK for characterizing data, but statisticians prefer the standard deviation
• It is a measure of how far data values are spread around the mean of a data set
• Std dev = sqrt( sum of (deviations from the mean)^2 / (total number of data values - 1) )
• Don’t calculate by hand; use SPSS (which we’ll do in a few minutes)
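
To make the formula concrete, here is a short Python sketch that applies it step by step to the bank waiting times from slide 4. The helper name `sample_std_dev` is chosen for illustration; Python's built-in `statistics.stdev` uses the same divide-by-(n - 1) definition and serves as a cross-check.

```python
from math import sqrt
from statistics import mean, stdev


def sample_std_dev(data):
    """sqrt( sum of squared deviations from the mean / (n - 1) )."""
    m = mean(data)
    squared_deviations = [(x - m) ** 2 for x in data]
    return sqrt(sum(squared_deviations) / (len(data) - 1))


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

print(round(sample_std_dev(big_bank), 2))   # 1.96
print(round(sample_std_dev(best_bank), 2))  # 0.44
print(round(stdev(big_bank), 2))            # 1.96, library cross-check
```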

7 A simple way to estimate the standard deviation: divide the range by 4
Watch for outliers; they can ruin your range estimate
What is an outlier? Here, a value two or more standard deviations from the mean (above OR below)

8 Go back to the Big Bank / Best Bank example
Big Bank: range = 6.9; 6.9 / 4 = 1.7; actual standard deviation is 1.96
Best Bank: range = 1.2; 1.2 / 4 = 0.3; actual standard deviation is 0.44
Any outliers? The mean is 7.2 for both banks
Big Bank:  4.1 5.2 5.6 6.2 6.7 7.2 7.7 7.7 8.5 9.3 11.0
Best Bank: 6.6 6.7 6.7 6.9 7.1 7.2 7.3 7.4 7.7 7.8 7.8
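
The range-based estimate and the outlier question can be checked with a small Python sketch. The helper names `range_estimate` and `outliers` are illustrative, and the outlier test applies the slides' working definition (two or more standard deviations from the mean), which is only one of several conventions in use.

```python
from statistics import mean, stdev


def range_estimate(data):
    """Rough standard deviation estimate: range divided by 4."""
    return (max(data) - min(data)) / 4


def outliers(data, threshold=2.0):
    """Values at least `threshold` standard deviations away from the mean."""
    m, s = mean(data), stdev(data)
    return [x for x in data if abs(x - m) >= threshold * s]


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]
best_bank = [6.6, 6.7, 6.7, 6.9, 7.1, 7.2, 7.3, 7.4, 7.7, 7.8, 7.8]

for name, data in [("Big Bank", big_bank), ("Best Bank", best_bank)]:
    print(name,
          "estimate:", round(range_estimate(data), 1),
          "actual:", round(stdev(data), 2),
          "outliers:", outliers(data))
```

Under this definition neither bank has an outlier: the largest Big Bank wait (11.0) is 3.8 above the mean of 7.2, just inside two standard deviations (about 3.92).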

9 • A histogram is a nice way to view a data set
• A histogram is a chart, similar to a dotplot, created by defining a set of bins and counting how many data points lie in each bin. Bars are drawn with height proportional to the number of data points in each bin.
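
The binning step described above can be sketched in plain Python. The bin start (4.0), bin width (2.0), and number of bins (4) are assumptions picked to suit the Big Bank data from slide 4; the helper `histogram_counts` is illustrative, and the text bars stand in for a drawn chart.

```python
def histogram_counts(data, start, width, bins):
    """Count how many data points fall into each equal-width bin.

    Bin i covers [start + i*width, start + (i+1)*width).
    Values outside the covered interval are ignored.
    """
    counts = [0] * bins
    for x in data:
        index = int((x - start) // width)
        if 0 <= index < bins:
            counts[index] += 1
    return counts


big_bank = [4.1, 5.2, 5.6, 6.2, 6.7, 7.2, 7.7, 7.7, 8.5, 9.3, 11.0]

start, width, bins = 4.0, 2.0, 4  # assumed bin layout for illustration
counts = histogram_counts(big_bank, start, width, bins)

# Draw one text bar per bin, with length equal to the bin count.
for i, count in enumerate(counts):
    low = start + i * width
    print(f"{low:4.1f} to {low + width:4.1f} | {'#' * count}")
```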

11 • While Excel can do some basic statistics, it is not considered a serious statistics tool
• You really should use something like SPSS or SAS
• We’ll use SPSS since DePaul has a site license

12 Copy the dataset Grades.xls from the QRC website (OlderData) to My Documents and start SPSS (or try the file IncomeGaps.xls)
Using SPSS, open the Grades.xls spreadsheet
Change the variable names and make sure the data is numeric, not text
Click on Analyze -> Descriptive Statistics -> Frequencies

13 Let’s Try An Example
• Be careful! If the numeric fields in the dataset contain any $, % or # characters, SPSS will have difficulty converting them to numeric
• In particular, if the data has dollar signs, have SPSS first convert the field to Dollar, then convert it to Numeric (IncomeGaps.xls)

14 Let’s Try An Example
• Using the grades for Exam 2, find the
  ◦ 5 number summary (minimum, 1st quartile, median, 3rd quartile, maximum)
  ◦ Mean
  ◦ Range
  ◦ Standard deviation

15 • Let’s say you have just performed a survey.
• One of the questions you ask is: what type of home computer Internet connection do you have?
• Answers can be: none, dial-up, DSL, cable, other, not sure.

16 • Here are some of your results:

Respondent ID   Cable Type
11111           no
11112           ds
11113           cm
11114           dk
11115           du
11116           du

Where no = none; ds = dsl; cm = cable modem; du = dial up; dk = don’t know; ot = other

17 You can use SPSS to count the occurrences of data items, just like a pivot table
Enter your data into SPSS
Click on Analyze / Descriptive Statistics / Frequencies
Move the variable that you want to count from the left box to the right box
Make sure Display Frequencies Table is checked
Run it
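
For comparison, the same kind of frequency count can be sketched outside SPSS in a few lines of Python. The data is the six survey responses from slide 16; the layout of the printed table is only an approximation of a frequency table, not SPSS output.

```python
from collections import Counter

# Connection-type codes for respondents 11111 to 11116 (slide 16).
connection_types = ["no", "ds", "cm", "dk", "du", "du"]

labels = {
    "no": "none", "ds": "dsl", "cm": "cable modem",
    "du": "dial up", "dk": "don't know", "ot": "other",
}

counts = Counter(connection_types)
total = sum(counts.values())

# One row per code that occurs: label, frequency, percent of responses.
for code, count in counts.most_common():
    print(f"{labels[code]:12} {count:3} {100 * count / total:5.1f}%")
```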

18 • Crosstabs are an extension of pivot tables
• Let’s say you have asked a number of students: How many schools did you apply to?
• You get results something like the following (in a spreadsheet):

19
Respondent ID   Sex   Number Schools
1               F     2
2               M     6
3               F     1
4               F     4
5               M     9
6               M     10
7               F     3
8               F     2
9               F     7
10              M     5

20 • Now open the data in SPSS
• Then pull down the Analyze menu and click on Descriptive Statistics, then Crosstabs
• What variable do you want in the row? The column?
• When ready, click OK to perform the crosstab.
• Let’s do the activity.
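
To show what a crosstab produces, here is a minimal Python sketch that cross-tabulates the ten responses from slide 19. Putting Sex in the rows and Number Schools in the columns is one possible answer to the row/column question, chosen here only for illustration.

```python
from collections import Counter

# (respondent ID, sex, number of schools applied to), from slide 19.
responses = [
    (1, "F", 2), (2, "M", 6), (3, "F", 1), (4, "F", 4), (5, "M", 9),
    (6, "M", 10), (7, "F", 3), (8, "F", 2), (9, "F", 7), (10, "M", 5),
]

# Each cell counts the respondents with that (sex, schools) combination.
cells = Counter((sex, schools) for _, sex, schools in responses)
rows = sorted({sex for _, sex, _ in responses})
cols = sorted({schools for _, _, schools in responses})

# Header row, then one row per sex; missing combinations print as 0.
print("Sex " + " ".join(f"{c:>3}" for c in cols))
for r in rows:
    print(f"{r:<4}" + " ".join(f"{cells[(r, c)]:>3}" for c in cols))
```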

