1 APPENDIX B Data Preparation and Univariate Statistics
How are computers used in data collection and analysis?
How are collected data prepared for statistical analysis?
How are missing data treated in statistical analyses?
When is it appropriate to delete data before they are analyzed?
What are descriptive statistics and inferential statistics?
What determines how well the data in a sample can be used to predict population parameters?

2 Preparing Data for Analysis
Collecting the data
1. Ask participants to fill out a questionnaire.
2. Ask participants to enter their responses via keyboard into a computer.
Entering the data into the computer
1. Use coding systems and label the variables.
2. Keep notes; otherwise you will forget which variable name refers to which data.
3. Save and back up the data.
4. Check and clean the data.
Analyzing the data
1. SPSS contains a spreadsheet data editor, an output editor, and a syntax editor.
2. SPSS contains subprograms that compute statistical analyses such as frequency distributions, descriptive statistics, ANOVA, correlation, and regression.

3 Missing Data
1. The respondent has decided not to answer a question, either because it is inappropriate or for personal reasons.
   Remedy: Think carefully about whether all questions are appropriate, and save respondents from embarrassing situations.
2. The respondent forgot to answer a question or completely missed an entire page of the questionnaire.
   Remedy: Test the research procedure before you carry it out, and check the respondents’ answers before they leave.
3. The research requires the respondents to participate on more than one occasion (the attrition problem).

4 Deleting and Retaining Data
When do we delete variables? When the reliability analysis indicates that a variable did not measure the same thing that the other variables measured.
When do we delete responses? When a respondent gave a very extreme score (an outlier).
When do we delete participants? When a respondent did not understand the instructions or was not able to perform the task.
How do we trim the data? By removing scores that are more than 3 standard deviations above or below the variable’s mean.
When do we transform the data? When items must be reverse-scored or the data are skewed.
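
The 3-standard-deviation trimming rule can be sketched in a few lines of Python. This is a minimal illustration, not the procedure SPSS uses: the function name `trim_outliers` and the example scores are hypothetical, and the standard deviation is computed with N in the denominator to match the formulas later in this appendix.

```python
from statistics import mean, pstdev

def trim_outliers(scores, max_sd=3.0):
    """Return the scores that lie within max_sd standard deviations of the mean."""
    m = mean(scores)
    sd = pstdev(scores)  # standard deviation with N in the denominator
    if sd == 0:
        return list(scores)  # all scores identical, nothing to trim
    return [x for x in scores if abs(x - m) <= max_sd * sd]

# Hypothetical data: twenty typical scores plus one extreme score (60)
scores = [4, 5, 6, 5, 4, 6, 5, 4, 5, 6, 5, 4, 6, 5, 4, 5, 6, 5, 4, 6, 60]
trimmed = trim_outliers(scores)
print(len(scores), len(trimmed))
```

Note that a single extreme score also inflates the mean and the standard deviation, so in very small samples no score can ever fall more than 3 standard deviations from the mean.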

5 Conducting Statistical Analysis
Descriptive statistics: A statistical approach in which the researcher summarizes the pattern of scores observed on a measured variable.
Inferential statistics: A statistical approach in which the researcher draws inferences about the total population, including statistical significance, based on the pattern of scores observed in the sample of respondents.
[Diagram: descriptive analysis involves only your data; inferential analysis relates your data to the population.]

6 Summation Notation
Sample data: $X_1 = 6$, $X_2 = 5$, $X_3 = 2$, $X_4 = 7$, $X_5 = 3$
$\sum_{i=1}^{N} X_i = 6 + 5 + 2 + 7 + 3 = 23$
The summation starts from $i = 1$ and runs to $N$ (in this case, $N = 5$).
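
As a small sketch of what the summation sign means, the loop below walks the index i from 1 to N and accumulates each score. The variable names are illustrative only.

```python
scores = [6, 5, 2, 7, 3]  # X_1 ... X_5 from the slide
N = len(scores)

total = 0
for i in range(1, N + 1):    # i runs from 1 to N
    total += scores[i - 1]   # X_i is stored at list index i - 1

print(total)  # 23, the same value that sum(scores) returns
```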

7 Rounding
The APA Publication Manual generally suggests rounding presented figures (both descriptive and inferential statistics) to two decimal places.
$\pi = 3.14159265\ldots \rightarrow 3.14$
$\sqrt{3} = 1.732\ldots \rightarrow 1.73$
$p = 0.0041\ldots \rightarrow .004$
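
A minimal sketch of this rounding in Python, using string formatting. The helper name `apa` and its parameters are hypothetical; the option to drop the leading zero is included only to reproduce the p value shown on the slide.

```python
import math

def apa(value, places=2, leading_zero=True):
    """Format a non-negative number to a fixed number of decimal places."""
    text = f"{value:.{places}f}"
    if not leading_zero and text.startswith("0."):
        text = text[1:]  # ".004" instead of "0.004"
    return text

print(apa(math.pi))                               # 3.14
print(apa(math.sqrt(3)))                          # 1.73
print(apa(0.0041, places=3, leading_zero=False))  # .004
```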

8 Computing Descriptive Statistics
Frequency distribution: A table that indicates how many, and in most cases what percentage, of the individuals in the sample fall into each of a set of categories (e.g., bar chart, grouped frequency distribution, histogram, frequency curve, stem-and-leaf plot).
Central tendency: The point in the distribution around which the data are centered (e.g., mean, median, mode).
Dispersion: The extent to which the scores are all tightly clustered around the central tendency (e.g., range, variance, standard deviation).

9 Frequency Distribution
Sample data: $X_1 = 6$, $X_2 = 5$, $X_3 = 2$, $X_4 = 7$, $X_5 = 3$, $X_6 = 4$, $X_7 = 6$, $X_8 = 2$, $X_9 = 1$, $X_{10} = 8$
[Figures: bar chart, histogram, and frequency curve of the sample data]
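
A frequency distribution for these ten scores can be tabulated with the standard library. This is only a sketch of the counting step; it does not draw the charts named on the slide.

```python
from collections import Counter

scores = [6, 5, 2, 7, 3, 4, 6, 2, 1, 8]
counts = Counter(scores)

# One row per observed value: (value, frequency, percentage of the sample)
table = [(value, counts[value], 100 * counts[value] / len(scores))
         for value in sorted(counts)]

for value, freq, pct in table:
    print(f"{value:>5} {freq:>9} {pct:>6.1f}%")
```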

10 Central Tendency
Sample data: $X_1 = 6$, $X_2 = 5$, $X_3 = 2$, $X_4 = 7$, $X_5 = 3$, $X_6 = 4$, $X_7 = 6$, $X_8 = 2$, $X_9 = 1$, $X_{10} = 8$
The mean (average): the sum of all of the scores divided by the sample size.
$\bar{X} = \frac{\sum X_i}{N} = \frac{44}{10} = 4.4$
The median: the score at which half of the observations are greater and half are smaller.
Ordered scores: 1, 2, 2, 3, 4, 5, 6, 6, 7, 8; median $= (4 + 5)/2 = 4.5$
The mode: the most frequently occurring value in a variable.
Ordered scores: 1, 2, 2, 3, 4, 5, 6, 6, 7, 8; the values 2 and 6 each occur twice.
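
The three measures can be checked with Python's `statistics` module. This sketch assumes Python 3.8 or later, where `multimode` is available; it returns every value tied for the highest frequency.

```python
from statistics import mean, median, multimode

scores = [6, 5, 2, 7, 3, 4, 6, 2, 1, 8]

m = mean(scores)             # sum of the scores divided by N
med = median(scores)         # average of the two middle scores when N is even
modes = multimode(scores)    # all most-frequent values

print(m, med, sorted(modes))  # 4.4 4.5 [2, 6]
```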

11 Dispersion
The range: the distance between the largest (the maximum) and the smallest (the minimum) observed values of the variable.
The variance: the sum of squares divided by $N$: $s^2 = \frac{\sum (X_i - \bar{X})^2}{N}$
The standard deviation: the square root of the variance: $s = \sqrt{s^2}$

12 The Variance and the Standard Deviation
Sample data: $X_1 = 6$, $X_2 = 5$, $X_3 = 2$, $X_4 = 7$, $X_5 = 3$, $X_6 = 4$, $X_7 = 6$, $X_8 = 2$, $X_9 = 1$, $X_{10} = 8$; $\bar{X} = 4.4$
Mean deviation scores $(X_i - \bar{X})$: (6 - 4.4), (5 - 4.4), (2 - 4.4), (7 - 4.4), (3 - 4.4), (4 - 4.4), (6 - 4.4), (2 - 4.4), (1 - 4.4), (8 - 4.4)
The mean deviation scores sum to zero: $\sum (X_i - \bar{X}) = 0$

13 Sum of Squares
$(6 - 4.4)^2 = 2.56$
$(5 - 4.4)^2 = 0.36$
$(2 - 4.4)^2 = 5.76$
$(7 - 4.4)^2 = 6.76$
$(3 - 4.4)^2 = 1.96$
$(4 - 4.4)^2 = 0.16$
$(6 - 4.4)^2 = 2.56$
$(2 - 4.4)^2 = 5.76$
$(1 - 4.4)^2 = 11.56$
$(8 - 4.4)^2 = 12.96$
$SS = \sum (X_i - \bar{X})^2 = 50.4$
$SS = \sum X_i^2 - \frac{(\sum X_i)^2}{N} = 244 - \frac{44^2}{10} = 50.4$

14 Variance and Standard Deviation
Variance: $s^2 = \frac{SS}{N} = \frac{50.4}{10} = 5.04$
Standard deviation: $s = \sqrt{s^2} = \sqrt{5.04} = 2.24$ (SD)
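
The chain from deviation scores to the standard deviation can be reproduced step by step. The sketch below computes the sum of squares two ways, from the squared deviations and from the shortcut of the sum of squared scores minus the squared total divided by N, and then divides by N as the slides do. Variable names are illustrative.

```python
from math import sqrt

scores = [6, 5, 2, 7, 3, 4, 6, 2, 1, 8]
N = len(scores)
mean = sum(scores) / N

# Sum of squares from the squared deviation scores
ss_deviation = sum((x - mean) ** 2 for x in scores)

# Sum of squares from the raw scores: sum(X^2) - (sum(X))^2 / N
ss_raw = sum(x * x for x in scores) - sum(scores) ** 2 / N

variance = ss_deviation / N   # N in the denominator, as on the slide
sd = sqrt(variance)

print(round(ss_deviation, 2), round(ss_raw, 2), round(variance, 2), round(sd, 2))
```

Both routes give the same sum of squares; the raw-score version avoids computing each deviation, which is convenient for hand calculation.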

15 Standard Score (Z Score)
The standard score is the distance of a score from the mean of the variable, expressed in standard deviation units. It is used to compare two scores that come from distributions with different means and different standard deviations (SD).
$Z = \frac{X - \bar{X}}{s}$
Taro received a score of 80 on a test. The average was 50, and the standard deviation was 15.
$Z_{\text{Taro}} = \frac{80 - 50}{15} = 2.0$
Susan received a score of 75 on a test. The average was 60, and the standard deviation was 10.
$Z_{\text{Susan}} = \frac{75 - 60}{10} = 1.5$
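
The comparison of Taro and Susan can be written as a short function. The name `z_score` is a hypothetical helper, not part of any statistics package.

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean) / sd

z_taro = z_score(80, mean=50, sd=15)
z_susan = z_score(75, mean=60, sd=10)

print(z_taro, z_susan)  # 2.0 1.5
```

Taro's raw score is only 5 points higher than Susan's, but relative to his own test's distribution he is half a standard deviation further above the mean.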

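16 Standard Normal Distribution
The hypothetical population distribution of standard scores when the original scores are normally distributed: $\mu = 0$, $\sigma = 1$.
Percentage of scores in each interval (each of the two intervals on a line contains the percentage shown):
$-1 < Z < 0$, or $0 < Z < 1$: 34.13%
$-2 < Z < -1$, or $1 < Z < 2$: 13.59%
$-3 < Z < -2$, or $2 < Z < 3$: 2.15%
$Z < -3$, or $Z > 3$: 0.13%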
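
These percentages can be checked against the cumulative distribution function of a normal distribution with mean 0 and standard deviation 1. The sketch uses `statistics.NormalDist` (Python 3.8 or later); the helper `area_between` is a hypothetical name. The computed value for the interval between 2 and 3 comes out at about 2.14%, slightly below the 2.15% shown on the slide.

```python
from statistics import NormalDist

z = NormalDist(mu=0, sigma=1)

def area_between(low, high):
    """Proportion of the distribution lying between two Z values."""
    return z.cdf(high) - z.cdf(low)

areas = {
    "0 < Z < 1": area_between(0, 1),
    "1 < Z < 2": area_between(1, 2),
    "2 < Z < 3": area_between(2, 3),
    "Z > 3": 1 - z.cdf(3),
}

for interval, proportion in areas.items():
    print(f"{interval}: {100 * proportion:.2f}%")
```

Because the distribution is symmetric around 0, the mirror-image intervals below the mean have the same areas.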

17 Working with Inferential Statistics
Example: A researcher estimates the average GPA of all of the psychology majors at UM.
[Figure: a population of individuals, labeled M and W, from which a sample is drawn]
Descriptive statistics of a sample of 100 students: $\bar{X} = 3.40$, $s = 2.23$
Notation:
$\mu$: mean of the population
$\sigma$: standard deviation of the population
$\bar{X}$: mean of the sample
$s$: standard deviation of the sample

18 Unbiased Estimator
The sample mean ($\bar{X}$) is an unbiased estimator of the population mean $\mu$. The sample standard deviation ($s$), however, is not an unbiased estimator of the population standard deviation $\sigma$. How can we estimate $\sigma$ using the sample standard deviation?
$\hat{s} = \sqrt{\frac{\sum (X_i - \bar{X})^2}{N - 1}}$
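
The difference between the two standard deviations can be seen on the ten-score example used earlier in this appendix. The sketch assumes that the corrected estimate divides the sum of squares by N - 1 rather than N, which is what `statistics.stdev` does, while `statistics.pstdev` divides by N.

```python
from statistics import pstdev, stdev

scores = [6, 5, 2, 7, 3, 4, 6, 2, 1, 8]

s = pstdev(scores)      # sqrt(SS / N): the descriptive standard deviation
s_hat = stdev(scores)   # sqrt(SS / (N - 1)): assumed estimate of sigma

print(round(s, 2), round(s_hat, 2))  # 2.24 2.37
```

The N - 1 version is always the larger of the two, and the gap shrinks as the sample size grows.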

19 The Standard Error
If we take all possible samples of $N = 100$ from a given population, the resulting distribution of the sample means has a mean equal to the population mean: $\mu_{\bar{X}} = \mu$.
The sample means are normally distributed, with a standard deviation known as the standard error of the mean (or simply the standard error).
The standard error is symbolized as $s_{\bar{X}}$:
$s_{\bar{X}} = \frac{s}{\sqrt{N}}$

20 Confidence Intervals
A confidence interval is the range of scores within which the population mean is likely to fall. The exact width of the confidence interval is determined with a statistic known as Student’s t.
Example: We sampled 100 students and set alpha = .05.
Degrees of freedom = 100 - 1 = 99
The appropriate t value = 1.99 (see Table C, Appendix E)
Lower limit: $\mu = \bar{X} - t(s_{\bar{X}}) = 3.40 - 1.99(.22) = 2.96$
Upper limit: $\mu = \bar{X} + t(s_{\bar{X}}) = 3.40 + 1.99(.22) = 3.84$
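
The GPA example can be reproduced as a short calculation. This is a sketch under two assumptions: the standard error is the sample standard deviation divided by the square root of N, and the t value (1.99) is taken from the slide rather than computed. The function name `confidence_interval` is hypothetical.

```python
from math import sqrt

def confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) limits: mean -/+ t times the standard error."""
    standard_error = sd / sqrt(n)   # assumed formula: s / sqrt(N)
    margin = t_crit * standard_error
    return mean - margin, mean + margin

# Values from the example: mean 3.40, s = 2.23, N = 100, t = 1.99
lower, upper = confidence_interval(3.40, 2.23, 100, 1.99)
print(round(lower, 2), round(upper, 2))  # 2.96 3.84
```

The slide rounds the standard error to .22 before multiplying; carrying the unrounded value (.223) gives the same limits to two decimal places.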

