Today: Central Tendency & Dispersion

Slides:



Advertisements
Similar presentations
Population vs. Sample Population: A large group of people to which we are interested in generalizing. parameter Sample: A smaller group drawn from a population.
Advertisements

SPSS Review CENTRAL TENDENCY & DISPERSION
Agricultural and Biological Statistics
Measures of Central Tendency. Central Tendency “Values that describe the middle, or central, characteristics of a set of data” Terms used to describe.
Calculating & Reporting Healthcare Statistics
Descriptive Statistics Chapter 3 Numerical Scales Nominal scale-Uses numbers for identification (student ID numbers) Ordinal scale- Uses numbers for.
PSY 307 – Statistics for the Behavioral Sciences
Descriptive Statistics
SOC 3155 SPSS CODING/GRAPHS & CHARTS CENTRAL TENDENCY & DISPERSION.
Analysis of Research Data
Measures of Variability
Chapter 3: Central Tendency
1 Measures of Central Tendency Greg C Elvers, Ph.D.
Measures of Central Tendency
July, 2000Guang Jin Statistics in Applied Science and Technology Chapter 4 Summarizing Data.
Measures of Central Tendency
Objective To understand measures of central tendency and use them to analyze data.
Frequency Distribution I. How Many People Made Each Possible Score? A. This is something I show you for each quiz.
BIOSTATISTICS II. RECAP ROLE OF BIOSATTISTICS IN PUBLIC HEALTH SOURCES AND FUNCTIONS OF VITAL STATISTICS RATES/ RATIOS/PROPORTIONS TYPES OF DATA CATEGORICAL.
Descriptive Statistics Used to describe the basic features of the data in any quantitative study. Both graphical displays and descriptive summary statistics.
Chapter 3 Statistical Concepts.
Psychometrics.
Statistics. Question Tell whether the following statement is true or false: Nominal measurement is the ranking of objects based on their relative standing.
Copyright © 2012 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 16 Descriptive Statistics.
Chapter 3: Central Tendency. Central Tendency In general terms, central tendency is a statistical measure that determines a single value that accurately.
Measures of Central Tendency or Measures of Location or Measures of Averages.
Types of data and how to present them 47:269: Research Methods I Dr. Leonard March 31, :269: Research Methods I Dr. Leonard March 31, 2010.
Overview Summarizing Data – Central Tendency - revisited Summarizing Data – Central Tendency - revisited –Mean, Median, Mode Deviation scores Deviation.
Chapters 1 & 2 Displaying Order; Central Tendency & Variability Thurs. Aug 21, 2014.
PPA 501 – Analytical Methods in Administration Lecture 5a - Counting and Charting Responses.
Describing Behavior Chapter 4. Data Analysis Two basic types  Descriptive Summarizes and describes the nature and properties of the data  Inferential.
M07-Numerical Summaries 1 1  Department of ISM, University of Alabama, Lesson Objectives  Learn when each measure of a “typical value” is appropriate.
1 PUAF 610 TA Session 2. 2 Today Class Review- summary statistics STATA Introduction Reminder: HW this week.
© 2006 McGraw-Hill Higher Education. All rights reserved. Numbers Numbers mean different things in different situations. Consider three answers that appear.
Describing Data Lesson 3. Psychology & Statistics n Goals of Psychology l Describe, predict, influence behavior & cognitive processes n Role of statistics.
Skewness & Kurtosis: Reference
An Introduction to Statistics. Two Branches of Statistical Methods Descriptive statistics Techniques for describing data in abbreviated, symbolic fashion.
1 Univariate Descriptive Statistics Heibatollah Baghi, and Mastee Badii George Mason University.
Lecture 5 Dustin Lueker. 2 Mode - Most frequent value. Notation: Subscripted variables n = # of units in the sample N = # of units in the population x.
INVESTIGATION 1.
Dr. Serhat Eren 1 CHAPTER 6 NUMERICAL DESCRIPTORS OF DATA.
A way to organize data so that it has meaning!.  Descriptive - Allow us to make observations about the sample. Cannot make conclusions.  Inferential.
Measures of Central Tendency: The Mean, Median, and Mode
Chapter 2 Means to an End: Computing and Understanding Averages Part II  igma Freud & Descriptive Statistics.
Descriptive Statistics The goal of descriptive statistics is to summarize a collection of data in a clear and understandable way.
Central Tendency & Dispersion
Unit 2 (F): Statistics in Psychological Research: Measures of Central Tendency Mr. Debes A.P. Psychology.
IE(DS)1 Descriptive Statistics Data - Quantitative observation of Behavior What do numbers mean? If we call one thing 1 and another thing 2 what do we.
LIS 570 Summarising and presenting data - Univariate analysis.
Outline of Today’s Discussion 1.Displaying the Order in a Group of Numbers: 2.The Mean, Variance, Standard Deviation, & Z-Scores 3.SPSS: Data Entry, Definition,
Chapter 3: Central Tendency 1. Central Tendency In general terms, central tendency is a statistical measure that determines a single value that accurately.
Descriptive Statistics(Summary and Variability measures)
A way to organize data so that it has meaning!.  Descriptive - Allow us to make observations about the sample. Cannot make conclusions.  Inferential.
Data Description Chapter 3. The Focus of Chapter 3  Chapter 2 showed you how to organize and present data.  Chapter 3 will show you how to summarize.
Welcome to… The Exciting World of Descriptive Statistics in Educational Assessment!
Chapter 6: Descriptive Statistics. Learning Objectives Describe statistical measures used in descriptive statistics Compute measures of central tendency.
Chapter 4: Measures of Central Tendency. Measures of central tendency are important descriptive measures that summarize a distribution of different categories.
Statistical Methods Michael J. Watts
Statistical Methods Michael J. Watts
Descriptive measures Capture the main 4 basic Ch.Ch. of the sample distribution: Central tendency Variability (variance) Skewness kurtosis.
Measures of Central Tendency
Central Tendency and Variability
Description of Data (Summary and Variability measures)
Module 8 Statistical Reasoning in Everyday Life
Descriptive Statistics
Introduction to Statistics
Descriptive Statistics
MEASURES OF CENTRAL TENDENCY
Chapter 3: Central Tendency
Central Tendency & Variability
Presentation transcript:

Today: Central Tendency & Dispersion From frequency tables to distributions Types of Distributions: Normal, Skewed Level of Measurement: Nominal, Ordinal, Interval Central Tendency: Mode, Median, Mean Dispersion: Variance, Standard Deviation

Descriptive statistics are concerned with describing the characteristics of frequency distributions Where is the center? What is the range? What is the shape [of the distribution]?

Frequency Distributions Simple depiction of all the data Graphic — easy to understand Problems Not always precisely measured Not summarized in one number or datum

Frequency Table Test Scores Observation Frequency 65 1 70 2 75 3 80 4 85 3 90 2 95 1

Frequency Distributions 4 3 2 1 Frequency 65 70 75 80 85 90 95 Test Score

Voter Turnout in 50 States - 1980

Voter Turnout in 50 States - 1940

Normally Distributed Curve

Skewed Distributions

Characteristics of the Normal Distribution It is symmetrical -- Half the cases are to one side of the center; the other half is on the other side. The distribution is single peaked, not bimodal or multi-modal Most of the cases will fall in the center portion of the curve and as values of the variable become more extreme they become less frequent, with “outliers” at each of the “tails” of the distribution few in number. It is only one of many frequency distributions but the one we will focus on for most of this course. The Mean, Median, and Mode are the same. Percentage of cases in any range of the curve can be calculated.

Family of Normal Curves

Summarizing Distributions Two key characteristics of a frequency distribution are especially important when summarizing data or when making a prediction from one set of results to another: Central Tendency What is in the “Middle”? What is most common? What would we use to predict? Dispersion How Spread out is the distribution? What Shape is it?

Three measures of central tendency are commonly used in statistical analysis - the mode, the median, and the mean Each measure is designed to represent a typical score The choice of which measure to use depends on: the shape of the distribution (whether normal or skewed), and the variable’s “level of measurement” (data are nominal, ordinal or interval).

Appropriate Measures of Central Tendency Nominal variables Mode Ordinal variables Median Interval level variables Mean - If the distribution is normal (median is better with skewed distribution)

Mode Most Common Outcome Male Female

Median Middle-most Value 50% of observations are above the Median, 50% are below it The difference in magnitude between the observations does not matter Therefore, it is not sensitive to outliers Formula Median = n + 1 / 2

To compute the median · then count number of observations = 10. · first you rank order the values of X from low to high:  85, 94, 94, 96, 96, 96, 96, 97, 97, 98 · then count number of observations = 10. · add 1 = 11. · divide by 2 to get the middle score  the 5 ½ score here 96 is the middle score score

Median Find the Median 4 5 6 6 7 8 9 10 12 5 6 6 7 8 9 10 12 5 6 6 7 8 9 10 100,000

Mean - Average Most common measure of central tendency Best for making predictions Applicable under two conditions: scores are measured at the interval level, and distribution is more or less normal [symmetrical]. Symbolized as: for the mean of a sample μ for the mean of a population

Finding the Mean X = (Σ X) / N If X = {3, 5, 10, 4, 3} = 25 / 5 = 5

Find the Mean Q: 4, 5, 8, 7 A: 6 Median: 6 Q: 4, 5, 8, 1000 A: 254.25

IF THE DISTRIBUTION IS NORMAL Mean is the best measure of central tendency Most scores “bunched up” in middle Extreme scores less frequent  don’t move mean around.

Measures of Variability Central Tendency doesn’t tell us everything Dispersion/Deviation/Spread tells us a lot about how a variable is distributed. We are most interested in Standard Deviations (σ) and Variance (σ2)

Why can’t the mean tell us everything? Mean describes Central Tendency, what the average outcome is. We also want to know something about how accurate the mean is when making predictions. The question becomes how good a representation of the distribution is the mean? How good is the mean as a description of central tendency -- or how good is the mean as a predictor? Answer -- it depends on the shape of the distribution. Is the distribution normal or skewed?

Family of Normal Distribution Curves

Dispersion Once you determine that the variable of interest is normally distributed, ideally by producing a histogram of the scores, the next question to be asked about the NDC is its dispersion: how spread out are the scores around the mean. Dispersion is a key concept in statistical thinking. The basic question being asked is how much do the scores deviate around the Mean? The more “bunched up” around the mean the better your ability to make accurate predictions.

Means What is the difference? Consider these means for weekly candy bar consumption. X = {7, 8, 6, 7, 7, 6, 8, 7} X = (7+8+6+7+7+6+8+7)/8 X = 7 X = {12, 2, 0, 14, 10, 9, 5, 4} X = (12+2+0+14+10+9+5+4)/8 X = 7 What is the difference?

How well does the mean represent the scores in a distribution How well does the mean represent the scores in a distribution? The logic here is to determine how much spread is in the scores. How much do the scores "deviate" from the mean? Think of the mean as the true score or as your best guess. If every X were very close to the Mean, the mean would be a very good predictor. If the distribution is very sharply peaked then the mean is a good measure of central tendency and if you were to use the mean to make predictions you would be right or close much of the time.

What if scores are widely distributed? The mean is still your best measure and your best predictor, but your predictive power would be less. How do we describe this? Measures of variability Mean Deviation Variance Standard Deviation

Mean Deviation The key concept for describing normal distributions and making predictions from them is called deviation from the mean. We could just calculate the average distance between each observation and the mean. We must take the absolute value of the distance, otherwise they would just cancel out to zero! Formula:

Mean Deviation: An Example Data: X = {6, 10, 5, 4, 9, 8} X = 42 / 6 = 7 X – Xi Abs. Dev. 7 – 6 1 7 – 10 3 7 – 5 2 7 – 4 7 – 9 7 – 8 Compute X (Average) Compute X – X and take the Absolute Value to get Absolute Deviations Sum the Absolute Deviations Divide the sum of the absolute deviations by N Total: 12 12 / 6 = 2

What Does it Mean? Is it Really that Easy? On Average, each observation is two units away from the mean. Is it Really that Easy? No! Absolute values are difficult to manipulate algebraically Absolute values cause enormous problems for calculus (Discontinuity) We need something else…

Variance and Standard Deviation Instead of taking the absolute value, we square the deviations from the mean. This yields a positive value. This will result in measures we call the Variance and the Standard Deviation Sample- Population- s: Standard Deviation σ: Standard Deviation s2: Variance σ2: Variance

Calculating the Variance and/or Standard Deviation Formulae: Variance: Examples Follow . . . Standard Deviation:

Example: Mean: 6 10 5 4 9 8 -1 1 3 9 -2 4 -3 2 Variance: Data: X = {6, 10, 5, 4, 9, 8}; N = 6 Mean: 6 10 5 4 9 8 -1 1 3 9 -2 4 -3 2 Variance: Standard Deviation: Total: 42 Total: 28