Descriptive Statistics: Overview

Presentation transcript:

Descriptive Statistics: Overview
Measures of Center: Mode, Median, Mean
Measures of Symmetry: Skewness
Measures of Spread: Range, Inter-quartile Range, Variance, Standard deviation
Measures of Position: Percentile, Deviation Score, Z-score

Central tendency
Seeks to provide a single value that best represents a distribution.
Typical measures are
–mode
–median
–mean

Mode
The most frequently occurring score value; it corresponds to the highest point on the frequency distribution.
For a given sample (N = 16): the mode = 39.

Mode
The mode is not sensitive to extreme scores.
For a given sample (N = 16): the mode = 39.

Mode
A distribution may have more than one mode.
For a given sample (N = 16): the modes = 35 and 39.

Mode
There may be no unique mode, as in the case of a rectangular distribution.
For a given sample (N = 16): no unique mode.
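
The three cases above (one mode, two modes, no unique mode) can be sketched in Python. The sample values below are hypothetical stand-ins for the figures on the original slides, chosen only so that N = 16 and the modes match the slide text; `modes` is an illustrative helper name.

```python
from statistics import multimode

# Hypothetical samples of N = 16 scores (the slides' own data were in figures).
one_mode = [33, 35, 35, 36, 37, 37, 38, 39, 39, 39, 39, 40, 41, 42, 43, 44]
two_modes = [33, 35, 35, 35, 35, 36, 37, 38, 39, 39, 39, 39, 40, 41, 42, 43]
rectangular = [33, 33, 35, 35, 36, 36, 37, 37, 38, 38, 39, 39, 40, 40, 41, 41]


def modes(scores):
    """Return every score value tied for the highest frequency."""
    return multimode(scores)


print(modes(one_mode))     # a single most frequent value
print(modes(two_modes))    # two values tie for most frequent
print(modes(rectangular))  # every value ties, so there is no unique mode
```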

Median
The score value that cuts the distribution in half (the “middle” score); the 50th percentile.
For N = 15 the median is the eighth score = 37.

Median
For N = 16 the median is the average of the eighth and ninth scores = 37.5.
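
A minimal sketch of the odd-N and even-N rules for the median. The two score lists are hypothetical, built only so that the eighth score is 37 and the ninth is 38, as in the slide text.

```python
def median(scores):
    """Middle score for odd N; average of the two middle scores for even N."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # e.g. the 8th of 15 scores
    return (ordered[mid - 1] + ordered[mid]) / 2  # e.g. mean of the 8th and 9th of 16


# Hypothetical data, not the slides' original figures.
odd_sample = list(range(30, 45))   # N = 15, eighth score is 37
even_sample = list(range(30, 46))  # N = 16, eighth and ninth scores are 37 and 38

print(median(odd_sample))
print(median(even_sample))
```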

Mean
This is what people usually have in mind when they say “average”: the sum of the scores divided by the number of scores.
Changing the value of a single score may not affect the mode or median, but it will affect the mean.
For a population: $\mu = \frac{\sum X}{N}$
For a sample: $\bar{X} = \frac{\sum X}{N}$

Mean
In many cases the mean is the preferred measure of central tendency, both as a description of the data and as an estimate of the parameter.
In order for the mean to be meaningful, the variable of interest must be measured on an interval scale.
[Figure: frequency distribution of scores, $\bar{X} = 7.07$]
[Figure: score by frequency for the categories Buddhist, Protestant, Catholic, Jewish, Muslim, $\bar{X} = 2.4$]

Mean
The mean is sensitive to extreme scores and is appropriate for more symmetrical distributions.
[Figure: three distributions with $\bar{X} = 36.8$, $\bar{X} = 36.5$, and $\bar{X} = 93.2$]
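
To illustrate the sensitivity to extreme scores, the sketch below changes one score in a hypothetical sample and compares the mean with the median. The numbers are made up for illustration and do not reproduce the means shown on the slide.

```python
from statistics import mean, median

# Hypothetical scores; only the last value differs between the two samples.
base = [34, 35, 36, 37, 38, 39, 39]
with_extreme = [34, 35, 36, 37, 38, 39, 400]

# The median stays put, while the mean is pulled toward the extreme score.
print(median(base), median(with_extreme))
print(round(mean(base), 1), round(mean(with_extreme), 1))
```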

Symmetry
A symmetrical distribution exhibits no skewness.
In a symmetrical, single-peaked distribution, mean = median = mode.

Skewed distributions
Skewness refers to the asymmetry of the distribution.
A positively skewed distribution is asymmetrical, with its longer tail pointing in the positive direction.
Mode = $70,000; Median = $88,700; Mean = $93,600
mode < median < mean

Skewed distributions
A negatively skewed distribution is asymmetrical, with its longer tail pointing in the negative direction.
mode > median > mean
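
The ordering of mode, median, and mean under skew can be checked on a small example. The income-like values below (in thousands) are hypothetical, and the negatively skewed sample is simply their mirror image.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed sample: a long tail toward high values.
positive_skew = [70, 70, 70, 80, 85, 90, 95, 110, 130, 200]
# Mirror image of the same values, giving a long tail toward low values.
negative_skew = [270 - x for x in positive_skew]

for label, data in [("positive", positive_skew), ("negative", negative_skew)]:
    print(label, mode(data), median(data), mean(data))
```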

Measures of central tendency
          +                                  -
Mode      quick & easy to compute;           poor sampling stability
          useful for nominal data
Median    not affected by extreme scores     somewhat poor sampling stability
Mean      sampling stability;                inappropriate for discrete data;
          related to variance                affected by skewed distributions

Distributions
Center: mode, median, mean
Shape: symmetrical, skewed
Spread

Measures of Spread
The dispersion of scores from the center.
A distribution of scores is highly variable if the scores differ wildly from one another.
Three statistics to measure variability
–range
–interquartile range
–variance

Range
Largest score minus the smallest score.
[Figure: two distributions] These two have the same range (80), but their spreads look different.
Says nothing about how scores vary around the center.
Greatly affected by extreme scores (it is defined by them).

Interquartile range
The distance between the 25th percentile and the 75th percentile.
First distribution: Q3 - Q1 = 40
Second distribution: Q3 - Q1 = 5
Effectively ignores the top and bottom quarters, so extreme scores are not influential.
Dismisses 50% of the distribution.
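
A short sketch of both statistics, using the nine scores (10 through 90) that appear in the deviation slides. The slides do not say how quartiles are interpolated, so the default "exclusive" method of `statistics.quantiles` is an assumption; other methods can give slightly different quartiles.

```python
from statistics import quantiles

scores = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# Range: largest score minus smallest score.
score_range = max(scores) - min(scores)

# Quartile cut points (Q1, median, Q3); the interpolation method is an assumption.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1

print(score_range, q1, q3, iqr)
```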

Deviation measures
Might be better to see how much scores differ from the center of the distribution, using distance.
Scores further from the mean have higher deviation scores.
Name        Score   Deviation
Amy            10         -40
Theo           20         -30
Max            30         -20
Henry          40         -10
Leticia        50           0
Charlotte      60          10
Pedro          70          20
Tricia         80          30
Lulu           90          40
AVERAGE        50

Deviation measures
To see how ‘deviant’ the distribution is relative to another, we could sum these scores.
But this would leave us with a big fat zero.
Name        Score   Deviation
Amy            10         -40
Theo           20         -30
Max            30         -20
Henry          40         -10
Leticia        50           0
Charlotte      60          10
Pedro          70          20
Tricia         80          30
Lulu           90          40
SUM                         0

Deviation measures
So we use squared deviations from the mean.
Name        Score   Deviation   Sq. Deviation
Amy            10         -40            1600
Theo           20         -30             900
Max            30         -20             400
Henry          40         -10             100
Leticia        50           0               0
Charlotte      60          10             100
Pedro          70          20             400
Tricia         80          30             900
Lulu           90          40            1600
SUM                         0            6000
This is the sum of squares (SS): $SS = \sum (X - \bar{X})^2$

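Variance
We take the “average” squared deviation from the mean and call it VARIANCE.
For a population: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
For a sample: $s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$ (dividing by N - 1 corrects for the fact that the sample variance tends to underestimate the population variance)

Variance
1. Find the mean.
2. Subtract the mean from every score.
3. Square the deviations.
4. Sum the squared deviations.
5. Divide the SS by N or N-1.
Name        Score   Dev’n   Sq. Dev.
Amy            10     -40       1600
Theo           20     -30        900
Max            30     -20        400
Henry          40     -10        100
Leticia        50       0          0
Charlotte      60      10        100
Pedro          70      20        400
Tricia         80      30        900
Lulu           90      40       1600
SUM                     0       6000
6000 / 8 = 750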

Standard deviation
The standard deviation is the square root of the variance.
The standard deviation measures spread in the original units of measurement, while the variance does so in units squared.
Variance is good for inferential stats. Standard deviation is nice for descriptive stats.
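
The five variance steps and the square-root step for the standard deviation can be sketched as follows, using the nine scores from the deviation slides. Function and variable names are illustrative.

```python
from math import sqrt

scores = [10, 20, 30, 40, 50, 60, 70, 80, 90]


def sum_of_squares(xs):
    """Steps 1-4: mean, deviations, squared deviations, and their sum (SS)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


ss = sum_of_squares(scores)
n = len(scores)

population_variance = ss / n    # step 5, dividing by N
sample_variance = ss / (n - 1)  # step 5, dividing by N - 1
sample_sd = sqrt(sample_variance)

print(ss, sample_variance, round(sample_sd, 2))
```

Treating the nine scores as a sample gives SS = 6000 and a variance of 6000 / 8 = 750, matching the slide.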

Example
First distribution: N = 28, $\bar{X} = 50$, $s^2$ = , s =
Second distribution: N = 28, $\bar{X} = 50$, $s^2$ = , s = 11.86

Descriptive Statistics: Quick Review
Measures of Center: Mode, Median, Mean
Measures of Symmetry: Skewness
Measures of Spread: Range, Inter-quartile Range, Variance, Standard deviation

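Descriptive Statistics: Quick Review
                     For a population:                          For a sample:
Mean                 $\mu = \frac{\sum X}{N}$                   $\bar{X} = \frac{\sum X}{N}$
Variance             $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$    $s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$
Standard Deviation   $\sigma = \sqrt{\sigma^2}$                 $s = \sqrt{s^2}$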

Exercise
Treat this little distribution as a sample and calculate:
–Mode, median, mean
–Range, variance, standard deviation

Measures of Position How to describe a data point in relation to its distribution

Measures of Position
–Quantile
–Deviation Score
–Z-score

Quantiles
Quartile: divides ranked scores into four equal parts of 25% each.
[Figure: ranked scores running from the minimum, through the median, to the maximum]

Quantiles
Decile: divides ranked scores into ten equal parts of 10% each.

Quantiles
Percentile: divides ranked scores into 100 equal parts.
Percentile rank:
$\text{percentile rank of score } x = \frac{\text{number of scores less than } x}{\text{total number of scores}} \times 100$
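
The percentile rank formula translates directly into code. This sketch applies it to the nine scores from the deviation slides; `percentile_rank` is an illustrative name.

```python
def percentile_rank(x, scores):
    """100 times the share of scores that are strictly less than x."""
    below = sum(1 for s in scores if s < x)
    return 100 * below / len(scores)


scores = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# Four of the nine scores fall below 50.
print(round(percentile_rank(50, scores), 1))
```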

Deviation Scores
Name        Score   Deviation
Amy            10         -40
Theo           20         -30
Max            30         -20
Henry          40         -10
Leticia        50           0
Charlotte      60          10
Pedro          70          20
Tricia         80          30
Lulu           90          40
Average        50
For a population: $X - \mu$
For a sample: $X - \bar{X}$

What if we want to compare scores from distributions that have different means and standard deviations?
Example
–Nine students’ scores on two different tests
–Tests scored on different scales

Nine Students on Two Tests
Name        Test 1   Test 2
Amy             10        1
Theo            20        2
Max             30        3
Henry           40        4
Leticia         50        5
Charlotte       60        6
Pedro           70        7
Tricia          80        8
Lulu            90        9
Average         50        5

Nine Students on Two Tests
Name        Test 1   Test 2   Deviation Score 1   Deviation Score 2
Amy             10        1                 -40                  -4
Theo            20        2                 -30                  -3
Max             30        3                 -20                  -2
Henry           40        4                 -10                  -1
Leticia         50        5                   0                   0
Charlotte       60        6                  10                   1
Pedro           70        7                  20                   2
Tricia          80        8                  30                   3
Lulu            90        9                  40                   4
Average         50        5

Z-Scores
Z-scores modify a distribution so that it is centered on 0 with a standard deviation of 1.
Subtract the mean from a score, then divide by the standard deviation.
For a population: $z = \frac{X - \mu}{\sigma}$
For a sample: $z = \frac{X - \bar{X}}{s}$

Z-Scores
Name        Test 1   Test 2   Z-Score 1   Z-Score 2
Amy             10        1
Theo            20        2
Max             30        3
Henry           40        4
Leticia         50        5           0           0
Charlotte       60        6
Pedro           70        7
Tricia          80        8
Lulu            90        9
Average         50        5           0           0
St Dev

Z-Scores
A distribution of Z-scores…
–Always has a mean of zero
–Always has a standard deviation of 1
Converting to standard or z scores does not change the shape of the distribution: z scores cannot normalize a non-normal distribution.
A Z-score is interpreted as “number of standard deviations above/below the mean”.
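
A minimal sketch of the z-score conversion for the two tests. It assumes the sample formula (dividing by N - 1) for the standard deviation, which the slides do not confirm for this table; the population formula would change the individual z-scores but not the fact that both tests produce the same ones.

```python
from statistics import mean, stdev

test1 = [10, 20, 30, 40, 50, 60, 70, 80, 90]
test2 = [1, 2, 3, 4, 5, 6, 7, 8, 9]


def z_scores(xs):
    """Subtract the mean from each score, then divide by the standard deviation."""
    m = mean(xs)
    s = stdev(xs)  # sample standard deviation: an assumption, see the note above
    return [(x - m) / s for x in xs]


z1 = z_scores(test1)
z2 = z_scores(test2)

# Both tests give the same z-scores even though they use different scales.
print([round(z, 2) for z in z1])
print([round(z, 2) for z in z2])
```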

Exercise
On their third test, the class average was 45 and the standard deviation was 6. Fill in the rest.
Name        Test 3   Z-Score
Amy             52
Theo            39
Max                     -1.5
Henry                    1.3

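Descriptive Statistics: Quick Review
                     For a population:                          For a sample:
Mean                 $\mu = \frac{\sum X}{N}$                   $\bar{X} = \frac{\sum X}{N}$
Variance             $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$    $s^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$
Standard Deviation   $\sigma = \sqrt{\sigma^2}$                 $s = \sqrt{s^2}$
Z-score              $z = \frac{X - \mu}{\sigma}$               $z = \frac{X - \bar{X}}{s}$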

Messing with Units
If you add or subtract a constant from each value in a distribution, then
–the mean is increased/decreased by that amount
–the standard deviation is unchanged
–the z-scores are unchanged
If you multiply or divide each value in a distribution by a constant, then
–the mean is multiplied/divided by that amount
–the standard deviation is multiplied/divided by that amount
–the z-scores are unchanged

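Example
Name        Score   Dev’s   Sq dev   Z-score
Theo            5      -1        1      -0.5
Max             3      -3        9      -1.5
Henry           5      -1        1      -0.5
Leticia         7       1        1       0.5
Charlotte       7       1        1       0.5
Pedro           8       2        4       1.0
Tricia          4      -2        4      -1.0
Lulu            9       3        9       1.5
MEAN            6
STDEV        1.94

Adding 1
Name        Score   Dev’s   Sq dev   Z-score
Theo            6      -1        1      -0.5
Max             4      -3        9      -1.5
Henry           6      -1        1      -0.5
Leticia         8       1        1       0.5
Charlotte       8       1        1       0.5
Pedro           9       2        4       1.0
Tricia          5      -2        4      -1.0
Lulu           10       3        9       1.5
MEAN            7
STDEV        1.94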

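Multiplying by 10
Name        Score   Dev’s   Sq dev   Z-score
Theo           50     -10      100      -0.5
Max            30     -30      900      -1.5
Henry          50     -10      100      -0.5
Leticia        70      10      100       0.5
Charlotte      70      10      100       0.5
Pedro          80      20      400       1.0
Tricia         40     -20      400      -1.0
Lulu           90      30      900       1.5
MEAN           60
STDEV        19.4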
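
The effect of adding 1 and of multiplying by 10 can be checked numerically on the eight scores from this example. The sketch uses the population formula (`statistics.pstdev`, dividing by N), because that is the one that reproduces the STDEV of 1.94 shown for these scores; the helper name `z_scores` is illustrative.

```python
from statistics import mean, pstdev

scores = [5, 3, 5, 7, 7, 8, 4, 9]  # Theo through Lulu


def z_scores(xs):
    """Z-scores using the population standard deviation (divide by N)."""
    m = mean(xs)
    sd = pstdev(xs)
    return [(x - m) / sd for x in xs]


plus_one = [x + 1 for x in scores]
times_ten = [x * 10 for x in scores]

for label, xs in [("original", scores), ("adding 1", plus_one), ("multiplying by 10", times_ten)]:
    # Mean and standard deviation shift or scale; the z-scores stay the same.
    print(label, mean(xs), round(pstdev(xs), 2), [round(z, 2) for z in z_scores(xs)])
```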

Other Standardized Distributions The Z distribution is not the only standardized distribution. You can easily create others (it’s just messing with units, really).

Other Standardized Distributions
Example: Let’s change these test scores into ETS type scores (mean 500, stdev 100)
Name        Score
Theo            5
Max             3
Henry           5
Leticia         7
Charlotte       7
Pedro           8
Tricia          4
Lulu            9
Average         6
St Dev       1.94

Other Standardized Distributions
Here’s how:
–Convert to Z scores
–Multiply by 100 to increase the st dev
–Add 500 to increase the mean
Name        Score   Z-Score   ETS type score
Theo            5      -0.5              450
Max             3      -1.5              350
Henry           5      -0.5              450
Leticia         7       0.5              550
Charlotte       7       0.5              550
Pedro           8       1.0              600
Tricia          4      -1.0              400
Lulu            9       1.5              650
Average         6         0              500
St Dev       1.94
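
The three conversion steps can be sketched as one small function. `rescale` is a hypothetical helper name, and the population standard deviation is assumed because it matches the St Dev of 1.94 given for these scores. The code uses unrounded z-scores, so individual results can differ slightly from a table built with z-scores rounded to one decimal.

```python
from statistics import mean, pstdev


def rescale(xs, new_mean, new_sd):
    """Convert to z-scores, multiply by the new st dev, then add the new mean."""
    m = mean(xs)
    sd = pstdev(xs)  # population formula: an assumption, see the note above
    return [new_mean + new_sd * (x - m) / sd for x in xs]


scores = [5, 3, 5, 7, 7, 8, 4, 9]  # Theo through Lulu
ets_type = rescale(scores, 500, 100)

print([round(s) for s in ets_type])
print(round(mean(ets_type)), round(pstdev(ets_type)))
```

The same helper covers any other standardized scale by changing the two target values, for example the scale named in the final exercise.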

Exercise
Name        Score   Percentile   Deviation Score   Z-Score   IQ type score (Mean 100, Stdev 10)
Theo           20
Max            18
Henry          13
Leticia        17
Charlotte      19
Pedro          16
Tricia         11
Lulu            9