# Statistical Analysis of Data (Graziano and Raulin, Research Methods: Chapter 5)

Statistical Analysis of Data. Graziano and Raulin, Research Methods: Chapter 5.

Copyright © Allyn & Bacon (2010). This multimedia product and its contents are protected under copyright law. The following are prohibited by law: (1) Any public performance or display, including transmission of any image over a network; (2) Preparation of any derivative work, including the extraction, in whole or in part, of any images; (3) Any rental, lease, or lending of the program.

Individual Differences
- A fact of life
  - People differ from one another
  - People differ from one occasion to another
- Most psychological variables have small effects compared to individual differences
- Statistics give us a way to detect such subtle effects

Descriptive Statistics
- Are used to describe the data
- Many types of descriptive statistics
  - Frequency distributions
  - Summary measures
  - Graphical representations of the data
- A way to visualize the data
- The first step in any statistical analysis

Frequency Distributions
- First step in organization of data
  - Can see how the scores are distributed
- Used with all types of data
- Illustrate relationships among variables in a cross-tabulation
- Simplify distributions by using a grouped frequency distribution

Creating Frequency Distributions
- Create a column with all possible scores
- Count the number of people with each score
  - Some frequencies may be zero (no one had that score)
- Can only do a frequency distribution if:
  - The scores are not continuous
  - The range of scores is not too large (becomes unwieldy)
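
The steps above can be sketched in Python. This is a minimal illustration that assumes integer scores; the function name and the quiz scores are made up for the example and are not part of the original slides.

```python
from collections import Counter

def frequency_distribution(scores, low=None, high=None):
    """Return (score, frequency) rows for every possible integer score."""
    counts = Counter(scores)
    low = min(scores) if low is None else low
    high = max(scores) if high is None else high
    # List every possible score so that unobserved scores appear with frequency 0.
    return [(score, counts[score]) for score in range(low, high + 1)]

# Hypothetical quiz scores, made up for illustration.
quiz_scores = [3, 5, 4, 5, 7, 5, 3, 4]
for score, freq in frequency_distribution(quiz_scores):
    print(score, freq)
```

Note that the score 6 is listed with a frequency of zero, matching the point that some frequencies may be zero.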

Creating a Grouped Frequency Distribution
- Start by creating about 10-15 equal-sized intervals that are sufficient to cover the range of scores
- Count the number of people in each interval
- Necessary whenever the distribution is continuous
- Useful when the range of scores is large
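
One possible way to build the intervals and count scores is sketched below. The function name, the default of 10 intervals, and the reaction times are assumptions chosen for illustration; other interval conventions (for example, rounding the interval width to a convenient number) are equally reasonable.

```python
def grouped_frequency_distribution(scores, n_intervals=10):
    """Count scores in n_intervals equal-width intervals covering the range."""
    low, high = min(scores), max(scores)
    if low == high:
        raise ValueError("all scores are identical; nothing to group")
    width = (high - low) / n_intervals
    counts = [0] * n_intervals
    for score in scores:
        # The highest score would land one past the last interval, so clamp it.
        index = min(int((score - low) / width), n_intervals - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, counts[i])
            for i in range(n_intervals)]

# Hypothetical reaction times in seconds, made up for illustration.
reaction_times = [0.42, 0.51, 0.38, 0.77, 0.64, 0.55, 0.49, 0.93, 0.58, 0.61]
for lower, upper, freq in grouped_frequency_distribution(reaction_times):
    print(f"{lower:.3f} to {upper:.3f}: {freq}")
```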

Cross-Tabulation
- A way to see the relationship between two nominal or ordinal variables
  - When done with score data, it is usually done as a scatter plot (covered later)
- Create a set of cells by listing the values of one variable as columns and the values of the other as rows

Cross-Tabulation Example

| | Males | Females | Total |
|---|---|---|---|
| Democrats | 4 | 5 | 9 |
| Republicans | 6 | 1 | 7 |
| Other | 7 | 1 | 8 |
| Total | 17 | 7 | 24 |
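
As a rough sketch, the cell counts in the cross-tabulation example can be re-tabulated in Python to confirm the row and column totals. The individual records here are reconstructed from the cell counts, and the variable names are assumptions for illustration.

```python
from collections import Counter

# Cell counts read from the cross-tabulation example (party by sex).
cell_counts = {
    ("Democrats", "Males"): 4, ("Democrats", "Females"): 5,
    ("Republicans", "Males"): 6, ("Republicans", "Females"): 1,
    ("Other", "Males"): 7, ("Other", "Females"): 1,
}

# Expand the counts into one (party, sex) record per person.
records = [pair for pair, n in cell_counts.items() for _ in range(n)]

# Cross-tabulate: count how many records fall in each cell.
table = Counter(records)

# Marginal totals for rows (party) and columns (sex).
row_totals, column_totals = Counter(), Counter()
for (party, sex), n in table.items():
    row_totals[party] += n
    column_totals[sex] += n

print(dict(row_totals))
print(dict(column_totals))
print(sum(table.values()))
```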

Graphing Data
- Visual displays are often easier to comprehend
- Two types of graphs covered here
  - Histograms
  - Frequency polygons

Histograms
- A bar graph, as shown at the right
- Can be used to graph either
  - Data representing discrete categories
  - Data representing scores from a continuous variable

Graphing 2 Distributions
- Possible to graph two or more distributions to see how they compare
- Note that one of the two groups in this histogram was the same group graphed previously

Frequency Polygon
- Like a histogram except that the frequency is shown with a dot above the score, with the dots connected

Two Frequency Polygons
- Can compare two or more frequency polygons on the same scale
- Easier to compare groups because the graph appears less cluttered than multiple histograms

Shapes of Distributions
- Many psychological variables are distributed normally
- The distribution is skewed if scores bunch up at one end

Measures of Central Tendency
- Mode: the most frequently occurring score
  - Easy to compute from a frequency distribution
- Median: the middle score in a distribution
  - Less affected than the mean by a few deviant scores
- Mean: the arithmetic average
  - Most commonly used central tendency measure
  - Used in later inferential statistics

Finding the Mode
- Easiest way to find the mode is to construct a frequency distribution first
- Find the score with the largest frequency
- If there are two or more scores that are tied for the largest frequency, report each of them
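
A small Python sketch of this procedure, including the tie rule, might look like the following. The function name and the example scores are made up for illustration.

```python
from collections import Counter

def modes(scores):
    """Return every score tied for the largest frequency, in ascending order."""
    counts = Counter(scores)          # the frequency distribution
    largest = max(counts.values())    # the largest frequency
    return sorted(score for score, freq in counts.items() if freq == largest)

print(modes([3, 4, 2, 5, 7, 5]))  # a single mode
print(modes([1, 1, 2, 2, 3]))     # two scores tied for the largest frequency
```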

Computing the Median
- Order the scores from smallest to largest
- Determine the middle score [(N+1)/2]
  - If 7 scores, the middle is the fourth score: (7+1)/2 = 4
  - If 10 scores, the middle score is halfway between the 5th and 6th scores: (10+1)/2 = 5.5
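
The (N+1)/2 rule can be sketched in Python as follows. The function name and the example scores are assumptions for illustration; with an even number of scores the sketch averages the two scores on either side of the middle position.

```python
def median(scores):
    """Return the middle score, using the (N + 1) / 2 position rule."""
    ordered = sorted(scores)
    n = len(ordered)
    if n % 2 == 1:
        # Odd N: position (N + 1) / 2 is a whole number (1-based).
        position = (n + 1) // 2
        return ordered[position - 1]
    # Even N: the position falls halfway between two scores, so average them.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

print(median([7, 1, 3, 9, 5, 2, 8]))   # 7 scores: the 4th ordered score
print(median(list(range(1, 11))))      # 10 scores: halfway between the 5th and 6th
```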

Computing the Mean
- Compute the mean of 3, 4, 2, 5, 7, and 5
- Sum the numbers (26)
- Count the number of scores (6)
- Plug these values into the equation: mean = sum of scores / number of scores = 26 / 6 ≈ 4.33
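
The same arithmetic can be checked in a few lines of Python, using the six scores from this slide. The variable names are chosen for illustration.

```python
scores = [3, 4, 2, 5, 7, 5]

total = sum(scores)   # sum of the scores
n = len(scores)       # number of scores
mean = total / n      # the sum divided by the number of scores

print(total, n, round(mean, 2))
```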

Measuring Variability
- Range: lowest to highest score
- Average Deviation: average distance from the mean
- Variance: average squared distance from the mean
  - Used in later inferential statistics
- Standard Deviation: square root of variance

The Range
- Computing the Range
  - Find the lowest score
  - Find the highest score
  - Subtract the lowest from the highest score
- Easy to compute, but unstable because it relies on only two scores
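
A brief Python sketch of the range, which also illustrates why it is unstable: adding one extreme score changes it drastically. The function name and the scores are made up for illustration.

```python
def score_range(scores):
    """Subtract the lowest score from the highest score."""
    return max(scores) - min(scores)

scores = [3, 4, 2, 5, 7, 5]
print(score_range(scores))          # range of the original scores
print(score_range(scores + [40]))   # one extreme score dominates the range
```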

The Average Deviation
- Computing the average deviation
  - Compute the mean
  - Compute the distance of each score from the mean (absolute distance, ignore sign)
  - Sum those distances and divide by the number of scores
- Easy to understand conceptually, but rarely used because it does not have good statistical properties
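
These steps translate directly into Python. The sketch below uses made-up scores and an illustrative function name.

```python
def average_deviation(scores):
    """Average absolute distance of the scores from their mean."""
    n = len(scores)
    mean = sum(scores) / n
    # Absolute distance of each score from the mean (the sign is ignored).
    distances = [abs(score - mean) for score in scores]
    return sum(distances) / n

print(average_deviation([2, 4, 6, 8]))  # mean is 5; distances are 3, 1, 1, 3
```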

The Variance
- Computing the Variance
  - Compute the mean
  - Compute the distance of each score from the mean
  - Square those distances
  - Sum those squared distances and divide by the degrees of freedom (N - 1)
- Good statistical properties, but this measure of variability is in squared units
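
A minimal Python sketch of the variance computation, dividing by N - 1 as described above. The function name and scores are illustrative; the result is compared with `statistics.variance` from the Python standard library, which also uses N - 1.

```python
import statistics

def variance(scores):
    """Sum of squared distances from the mean, divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    squared_distances = [(score - mean) ** 2 for score in scores]
    return sum(squared_distances) / (n - 1)

scores = [2, 4, 6, 8]
print(variance(scores))             # squared distances 9, 1, 1, 9 sum to 20
print(statistics.variance(scores))  # standard-library sample variance
```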

The Standard Deviation
- Computing the Standard Deviation
  - Compute the variance
  - Take the square root of the variance
- This measure, like the variance, has good statistical properties and is measured in the same units as the mean
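
In Python, the two steps might be sketched as follows, assuming the N - 1 variance. The function name and scores are made up for illustration.

```python
import math
import statistics

def standard_deviation(scores):
    """Square root of the variance (variance computed with N - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((score - mean) ** 2 for score in scores) / (n - 1)
    return math.sqrt(variance)

scores = [2, 4, 6, 8]
print(standard_deviation(scores))
print(statistics.stdev(scores))  # standard-library sample standard deviation
```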

Measures of Relationship
- Pearson product-moment correlation
  - Used with interval or ratio data
- Spearman rank-order correlation
  - Used when one variable is ordinal and the second is at least ordinal
- Phi
  - Used when at least one variable is nominal

Correlations
- Range from -1.00 to +1.00
  - -1.00 means a perfect negative relationship (as one score decreases, the other increases by a predictable amount)
  - +1.00 means a perfect positive relationship
  - 0.00 means that there is no linear relationship
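
The three landmark values can be illustrated with a hand-rolled Pearson product-moment correlation. This is a sketch using made-up data; the function name is an assumption. The last example shows a perfectly predictable but curved relationship that still yields a correlation of zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))        # perfect positive
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))        # perfect negative
print(pearson_r([1, 2, 3, 4, 5], [4, 1, 0, 1, 4]))  # U-shaped: no linear relationship
```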

Linear Relationships
- Correlation coefficients are sensitive only to linear relationships
- Linear relationships mean that the points of a scatter plot cluster around a straight line
- Should always look at the scatter plot to see whether the correlation coefficient is appropriate

Scatter Plots
- Allow you to see the relationship of two variables
- Detect nonlinear relationships or correlations that are due to only a few outliers
- Create a scatter plot by:
  - Labeling the X and Y axes with sufficient range to handle the scores
  - Graphing each data point (defined by the X and Y scores)

Positive Correlation

Regression
- Using a correlation to predict one variable from knowing the score on the other variable
- Usually a linear regression (finding the best fitting straight line for the data)
- Best illustrated in a scatter plot with the regression line also plotted
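
A least-squares fit is the usual way to find the best fitting straight line. The sketch below assumes that method; the function name and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired scores that fall exactly on the line Y = 2X + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept   # predict Y for a new score X = 5
print(slope, intercept, predicted)
```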

Reliability Indices
- Test-retest reliability and interrater reliability are indexed with a Pearson product-moment correlation
- Internal consistency reliability is indexed with coefficient alpha
- Details on these computations are included on the Student Resource Website

Standard Scores (Z-scores)
- A way to put scores on a common scale
- Computed by subtracting the mean from the score and dividing by the standard deviation
- Interpreting the Z-score
  - Positive Z-scores are above the mean; negative Z-scores are below the mean
  - The larger the absolute value of the Z-score, the further the score is from the mean
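
The computation is short enough to sketch in Python. The mean of 100 and standard deviation of 15 are hypothetical values chosen for illustration, as is the function name.

```python
def z_score(score, mean, standard_deviation):
    """Subtract the mean from the score and divide by the standard deviation."""
    return (score - mean) / standard_deviation

# Hypothetical test with a mean of 100 and a standard deviation of 15.
print(z_score(115, 100, 15))  # positive: above the mean
print(z_score(70, 100, 15))   # negative and larger in absolute value: further below the mean
```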

Inferential Statistics
- Used to draw inferences about populations on the basis of samples
- Sometimes called “statistical tests”
- Provide an objective way of quantifying the strength of the evidence for a hypothesis

Populations and Samples
- Population: the larger group of all participants of interest
- Sample: a subset of the population
- Samples almost never represent populations perfectly (sampling error)
  - Not really an error
  - Just the natural variability that you can expect from one sample to another

The Null Hypothesis
- States that there is NO difference between the population means
- Compare sample means to test the null hypothesis
- Test the null hypothesis in terms of probability: how likely the observed difference would be if the null hypothesis were true

Statistical Decisions
- Either reject or fail to reject the null hypothesis
  - Rejecting the null hypothesis suggests that there is a difference in the populations sampled
  - Failing to reject suggests that no difference exists
  - Decision is based on probability
  - Alpha: the statistical decision criterion used in testing the null hypothesis
  - Traditionally, alpha is set to small values (.05 or .01)
- Always a chance for error in our decision

Statistical Decision Process

| | Reject Null Hypothesis | Retain Null Hypothesis |
|---|---|---|
| Null Hypothesis is True | Type I Error | Correct Decision |
| Null Hypothesis is False | Correct Decision | Type II Error |

Testing for Mean Differences
- t-test: tests the mean difference between two groups
- Analysis of Variance: tests mean differences among two or more groups
- There are different versions of each of these tests depending on the experimental design (covered in later chapters)
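
As one illustration, the sketch below computes the t statistic for two independent groups using a pooled variance. This is only one version of the t-test, chosen here as an assumption (independent groups with roughly equal variances); the function name and data are made up, and converting t to a probability would require a t table or a statistics library.

```python
import math

def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups, pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # Sums of squared distances from each group's own mean.
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    df = n_a + n_b - 2
    pooled_variance = (ss_a + ss_b) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

# Hypothetical scores for two groups.
t, df = independent_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t, df)
```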

Power of a Statistical Test
- Sensitivity of the procedure to detect real differences between populations
- A function of both the statistical test and the precision of the research design
- Increasing the sample size increases the power
  - Larger samples estimate the population parameters more precisely
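
One way to see why larger samples are more precise is the standard error of the mean, which is the standard deviation divided by the square root of the sample size. The slide does not name this quantity, so treat the sketch as an illustrative assumption; the standard deviation of 15 is a made-up value.

```python
import math

def standard_error(standard_deviation, n):
    """Standard error of the mean: the standard deviation divided by sqrt(N)."""
    return standard_deviation / math.sqrt(n)

# Hypothetical standard deviation of 15 with increasing sample sizes.
for n in (25, 100, 400):
    print(n, standard_error(15, n))
```

Quadrupling the sample size halves the standard error, so the sample mean tends to fall closer to the population mean.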

Effect Size
- Indicates the size of the group differences expressed in standard deviation units
- Unlike the statistical test, the effect size is NOT affected by the size of the sample
- Large effect sizes are easier to detect than small effect sizes
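
A common index of this kind is Cohen's d, the difference between two group means divided by the pooled standard deviation. The slide does not name a specific index, so the choice of Cohen's d is an assumption, as are the function name and the data.

```python
import math

def cohens_d(group_a, group_b):
    """Difference between group means in pooled standard deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in group_a)
    ss_b = sum((x - mean_b) ** 2 for x in group_b)
    pooled_sd = math.sqrt((ss_a + ss_b) / (n_a + n_b - 2))
    return (mean_a - mean_b) / pooled_sd

# Hypothetical scores for two groups whose means differ by 2 points.
d = cohens_d([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(round(d, 2))
```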

Ethical Principles
- Nothing inherently unethical about statistics
- Researchers have an ethical obligation
  - To present the data objectively
  - To present data both supportive and not supportive of the hypothesis
  - To present enough statistical information for people to make up their own minds

Summary
- Statistics allow us to detect and evaluate group differences that are small compared to individual differences
- Descriptive versus inferential statistics
  - Descriptive statistics describe the data
  - Inferential statistics are used to draw inferences about population parameters on the basis of sample statistics
- Statistics objectify evaluations, but do not guarantee correct decisions
