Chapter 11: Univariate Data Analysis; Descriptive Statistics
These are summary measurements of a single variable.

I. Averages or measures of central tendency: describe a dataset.
A. Three kinds: mean, median, mode.
1. Mean: the most common. Sum all the values in a group, then divide by the total number of values in that group (Hint: start by listing them in columns with headings).

Weighted mean: multiply each value by its frequency, sum the products, and divide by the total frequency.
2. Median: the midpoint value. The mean is very sensitive to outlier scores that skew the distribution; the median is not. Instructions: order all values and find the middle-most score. That’s the median (with an even number of cases, find the two middle-most values, add them, and divide by two).
Percentiles: the 50th percentile is the median. A score at the 75th percentile is at or above 75% of the other scores.
3. Mode: the most frequent value.
B. When to use what.
1. Three kinds of data
a. Nominal: categorical data (race, region).
b. Ordinal: values are ranked, but not necessarily equal in distance (7 values indicating GOP support).
c. Interval: values are equal in distance (income).
2. Use the mean for interval data (and sometimes ordinal). Use the mode for nominal data (and sometimes ordinal). Use the median for interval data if you think there are outliers.
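The three averages above can be checked with a short Python sketch. The scores and the frequency table are hypothetical values chosen for illustration (they are not from the chapter), and `weighted_mean` is an illustrative helper name, not a standard-library function.

```python
from statistics import mean, median, mode

# Hypothetical scores (an assumption for illustration); 100 is a deliberate outlier.
scores = [3, 7, 7, 2, 9, 7, 4, 100]

print(mean(scores))    # 17.375, pulled upward by the outlier
print(median(scores))  # 7.0, the average of the two middle values (7 and 7)
print(mode(scores))    # 7, the most frequent value


def weighted_mean(freq):
    """Weighted mean of a frequency table mapping value -> frequency."""
    # Multiply each value by its frequency and sum the products.
    total = sum(value * count for value, count in freq.items())
    # Divide by the total frequency.
    return total / sum(freq.values())


# Hypothetical frequency table: the value 1 occurs twice, 2 three times, 3 five times.
print(weighted_mean({1: 2, 2: 3, 3: 5}))  # 2.3
```

Note how the single outlier moves the mean far from the bulk of the scores while the median and mode stay at 7, which is the reason the outline recommends the median when outliers are suspected.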

II. Variability: how much scores differ from one another.
Which set of scores has greater variability?
Set 1: 8, 9, 5, 2, 1, 3, 1, 9
Set 2: 3, 4, 3, 5, 4, 6, 2, 3
The means are 4.75 (Set 1) and 3.75 (Set 2), which tells us nothing about variability. More precisely, variability is how far scores are from the mean.
III. Computing the Range
Subtract the lowest score from the highest (r = h - l).
What is the range of these scores? 98, 86, 77, 56, 48
Answer: 50 (98 - 48 = 50)
IV. Computing the Standard Deviation
The standard deviation (s) is the average amount of variability in a set of scores (roughly, the average distance from the mean).

A. Formula: s = sqrt( Σ(X - M)² / (n - 1) ), where X is each score, M is the mean of the scores, and n is the number of scores.
Compute s for the following: 5, 8, 5, 4, 6, 7, 8, 8, 3, 6
So, an s of 1.76 tells us that each score differs from the mean by an average of about 1.76 points.
B. Purpose: to compare scores between different distributions, even when the means and standard deviations are different (e.g., men and women). The larger the s, the greater the variability.
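The range and standard deviation exercises above can be worked through in a few lines of Python. This is a minimal sketch: it assumes the sample formula with n - 1 in the denominator, which is the version that reproduces the s of 1.76 quoted in the slide, and the function names `score_range` and `sample_sd` are illustrative.

```python
from math import sqrt


def score_range(scores):
    """Range: highest score minus lowest score (r = h - l)."""
    return max(scores) - min(scores)


def sample_sd(scores):
    """Standard deviation using n - 1 in the denominator (an assumption)."""
    n = len(scores)
    m = sum(scores) / n                              # the mean
    sum_sq = sum((x - m) ** 2 for x in scores)       # squared deviations from the mean
    return sqrt(sum_sq / (n - 1))


print(score_range([98, 86, 77, 56, 48]))             # 50

scores = [5, 8, 5, 4, 6, 7, 8, 8, 3, 6]
print(round(sample_sd(scores), 2))                   # 1.76

# The two sets from section II: similar means, different variability.
set1 = [8, 9, 5, 2, 1, 3, 1, 9]
set2 = [3, 4, 3, 5, 4, 6, 2, 3]
print(score_range(set1), round(sample_sd(set1), 2))  # 8 3.49
print(score_range(set2), round(sample_sd(set2), 2))  # 4 1.28
```

Both measures agree that Set 1 is the more variable set, even though the two means differ by only one point.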

V. Graphing and Tables
Why? They describe data visually and more clearly.
Frequency Distribution (Table 11-4)
A. Class Interval Column: divides the scores into categories (0-4, 5-9, etc.). Usually a range of 2, 5, 10, or 25 data points. Main thing: be consistent!
B. Frequency Column: the number of scores within that range or category.
VI. Graphs
A. Histogram: shows the distribution of scores by class interval. Different distributions can be compared on the same histogram. Shows:
1. Variability
2. Skewness: if the mean is greater than the median, positive skewness. If the median is greater than the mean, negative skewness.

[Figure: Central Tendency and Variability, centre]

[Figure: Central Tendency and Variability, spread]

Skewness: If the data set is symmetric, the mean equals the median.

Skewness: If the data set is skewed to the right, the mean is greater than the median.

Skewness: If the data set is skewed to the left, the mean is less than the median.
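The mean-versus-median rule in these three slides can be expressed as a small Python check. This is only a sketch of the rule of thumb stated here, not a formal skewness statistic; the function name `skew_direction` and all three data sets are hypothetical.

```python
from statistics import mean, median


def skew_direction(scores):
    """Apply the slide's rule of thumb: compare the mean with the median."""
    m, med = mean(scores), median(scores)
    if m > med:
        return "positive (skewed right)"
    if m < med:
        return "negative (skewed left)"
    return "none indicated (mean equals median)"


# Hypothetical data sets chosen to show each case.
print(skew_direction([20, 22, 25, 27, 30, 150]))  # one very high score -> positive
print(skew_direction([1, 40, 45, 47, 50, 52]))    # one very low score  -> negative
print(skew_direction([1, 2, 3, 4, 5]))            # symmetric           -> none indicated
```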

B. Column Charts: simply show the quantity of a category according to some scale. SCALE IS IMPORTANT (CSPAN drug use story).
C. Bar Charts: same as a column chart, but with the axes reversed.
D. Line Chart: used to show trends (e.g., rise and fall in presidential popularity; line on page 317).
E. Pie Charts: great for proportions (percent of MS budget going to each budget category).

Line Graph
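The frequency distribution described in section V (a class interval column plus a frequency column) can be sketched in Python as follows. The scores are hypothetical, the function name `frequency_distribution` is illustrative, and the sketch assumes non-negative integer scores with class intervals that start at multiples of the chosen width.

```python
from collections import Counter


def frequency_distribution(scores, width=5):
    """Return (class interval, frequency) rows for non-negative integer scores."""
    # Map each score to the lower bound of its class interval, e.g. 7 -> 5 for width 5.
    counts = Counter((s // width) * width for s in scores)
    rows = []
    # Walk every interval from the lowest to the highest, keeping empty ones too,
    # so the interval width stays consistent.
    for start in range(min(counts), max(counts) + width, width):
        rows.append((f"{start}-{start + width - 1}", counts[start]))
    return rows


# Hypothetical scores (an assumption for illustration).
scores = [2, 3, 7, 8, 8, 11, 12, 14, 14, 19]

# Print the table with a crude text histogram next to each frequency.
for label, freq in frequency_distribution(scores):
    print(f"{label:>6} | {freq} | {'#' * freq}")
```

Keeping empty intervals in the table is a design choice that follows the "be consistent" advice: every row covers the same number of data points, so the bar lengths are directly comparable.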

VII. The Normal Curve and Probability Theory
A. Tells us the likelihood of an outcome.
B. Tells us the degree of confidence in a finding or outcome (i.e., how sure are we that the observed outcome is due to X versus random chance, and how likely is it that our research hypothesis is true?).
VIII. Normal Curve or Bell-Shaped Curve Properties (Fig. 11-6)
A. Mean, median, and mode are the same (NOT skewed).

B. Perfectly symmetrical about the mean (i.e., the two halves fit perfectly together).
C. The tails of the normal curve are asymptotic: they come close to the horizontal axis but never touch it.
Are curves usually normal? Yes, at least approximately, especially with large sets of data (more than 30). Most scores are concentrated in the center and few at the ends (height, intelligence, coin flipping).

IX. Divisions of the Normal Curve (Fig. 11-9)
A. The mean is at the center.
B. Scores along the x-axis correspond to standard deviations.
C. Sections within the bell curve represent the % of cases expected to fall therein. This is geometrically true (these are percentages of the entire normal distribution).
D. For normal distributions (most data sets), practically all scores fall between +3 and -3 sd’s (99.74%). Look at the probabilities of falling in between: 34.13% x 2 = 68.26% of cases fall within -1 to +1 sd’s of the mean.
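The percentages under the curve can be reproduced with Python's standard library. This sketch assumes Python 3.8 or later, where `statistics.NormalDist` is available; the helper name `pct_within` is illustrative. Exact areas can differ in the last digit from figures obtained by adding up rounded table entries, such as 34.13% x 2.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def pct_within(k):
    """Percent of cases expected within k standard deviations of the mean."""
    return 100 * (standard_normal.cdf(k) - standard_normal.cdf(-k))


for k in (1, 2, 3):
    print(f"within +/-{k} sd: {pct_within(k):.2f}%")
# within +/-1 sd: 68.27%
# within +/-2 sd: 95.45%
# within +/-3 sd: 99.73%
```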

X. Z-scores (standard scores, i.e., the number of standard deviations from the mean)
A. Allow us to compare distributions with one another because they are standardized in units of standard deviations (raw scores measured on different scales cannot be compared meaningfully). Different variables or groups have different means, so their raw scores cannot be compared directly. But z-scores from different groups of data can be compared because they are equivalent (e.g., one unit above or below the mean, respectively).

B. Formula and interpretation: z = (X - M) / s, where X is the raw score, M is the mean, and s is the standard deviation.
XI. Comparing z-scores from different distributions (p. 158 example).
- The raw scores of 12.8 and 64.8 in our data are equal distances from their respective means (z = .4 for both).
XII. What z-scores represent
A. Z-scores correspond to sections under the curve (percentages under the curve).

B. These percentages can be seen as probabilities of a certain score occurring; they are given in Appendix D. Example of what we are saying: “In a distribution with a mean of 100 and a standard deviation of 10, what is the probability that any score will be 110 or above?” The answer = _________.
C. What about a z-score of 1.38? What are the chances that a score will fall between the mean and a z-score of 1.38? _______ What about above a z-score of 1.38? ____ What about at or below 1.38? ______

What about between a z-score of 1 and 2.5? Answer: ______ (look at Fig. 11-9). Again, we are asking: what is the probability that a score will fall between 1 and 2.5 standard deviations (z’s) above the mean? What about between -1 and 2.5?
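The z-score questions above can be checked numerically instead of with a printed table. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the helper names are illustrative. The group means and standard deviations in the comparison (12 and 2, 59 and 14.5) are hypothetical values chosen so that both raw scores come out at z = .4, since the outline does not state them.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1


def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


def prob_between(z_low, z_high):
    """Probability that a score falls between two z-scores."""
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)


def prob_at_or_above(z):
    """Probability that a score falls at or above a z-score."""
    return 1 - standard_normal.cdf(z)


# Comparing raw scores from two distributions (hypothetical means and sd's).
print(round(z_score(12.8, mean=12, sd=2), 2))
print(round(z_score(64.8, mean=59, sd=14.5), 2))

# Mean 100, sd 10: probability of a score of 110 or above.
print(round(prob_at_or_above(z_score(110, 100, 10)), 4))

# Questions about z = 1.38.
print(round(prob_between(0, 1.38), 4))        # between the mean and z = 1.38
print(round(prob_at_or_above(1.38), 4))       # above z = 1.38
print(round(standard_normal.cdf(1.38), 4))    # at or below z = 1.38

# Between two z-scores.
print(round(prob_between(1, 2.5), 4))
print(round(prob_between(-1, 2.5), 4))
```

Because the normal curve is continuous, "above" and "at or above" a given z-score have the same probability, so one helper covers both phrasings.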
