Statistics for the Social Sciences

Presentation transcript:

Statistics for the Social Sciences Psychology 340 Fall 2006 Introductions

Outline (for week) Variables: IV, DV, scales of measurement Discuss each variable and its scale of measurement Characteristics of Distributions Using graphs Using numbers (center and variability) Descriptive statistics decision tree Locating scores: z-scores and other transformations

Describing distributions Distributions are typically described with three properties: Shape: unimodal, symmetric, skewed, etc. Center: mean, median, mode Spread (variability): standard deviation, variance

Which center when? Depends on a number of factors, like scale of measurement and shape. The mean is the preferred measure, and it is closely related to measures of variability. However, there are times when the mean isn’t the appropriate measure.

Which center when? Use the median if: The distribution is skewed The distribution is ‘open-ended’ (e.g. your top answer on your questionnaire is ‘5 or more’) Data are on an ordinal scale (rankings) Use the mode if the data are on a nominal scale

The Mean The most commonly used measure of center: the arithmetic average. Computing the mean: add up all of the X’s, then divide by the total number of scores. The formula for the population mean (a parameter) is $\mu = \frac{\sum X}{N}$: divide by the total number in the population. The formula for the sample mean (a statistic) is $\bar{X} = \frac{\sum X}{n}$: divide by the total number in the sample. Note: your book uses ‘M’ to denote the mean in formulas.

The Mean Number of shoes: Group 1: 5, 7, 5, 5, 5. Group 2: 30, 11, 12, 20, 14, 12, 15, 8, 6, 8, 10, 15, 25, 6, 35, 20, 20, 20, 25, 15. Suppose we want the mean of the entire group? Can we simply add the two means together and divide by 2? NO. Why not?

The Weighted Mean Number of shoes: Group 1: 5, 7, 5, 5, 5. Group 2: 30, 11, 12, 20, 14, 12, 15, 8, 6, 8, 10, 15, 25, 6, 35, 20, 20, 20, 25, 15. Suppose we want the mean of the entire group? Can we simply add the two means together and divide by 2? NO. Why not? We need to take into account the number of scores in each mean.

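The Weighted Mean Number of shoes: Group 1: 5, 7, 5, 5, 5 ($n_1 = 5$, $\bar{X}_1 = 5.4$). Group 2: 30, 11, 12, 20, 14, 12, 15, 8, 6, 8, 10, 15, 25, 6, 35, 20, 20, 20, 25, 15 ($n_2 = 20$, $\bar{X}_2 = 16.35$). Weighted mean: $\frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2} = \frac{5(5.4) + 20(16.35)}{25} = \frac{354}{25} = 14.16$. Let’s check by averaging all 25 scores directly: $\frac{\sum X}{n} = \frac{354}{25} = 14.16$. Both ways give the same answer.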
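The check above can be sketched in a few lines of Python. This is only an illustration: the split into two groups follows the way the scores are listed on the earlier slide, and the names `group1`, `group2`, `mean`, and `weighted_mean` are made up for this example.

```python
# Shoe counts from the slide, split into the two groups (assumed split).
group1 = [5, 7, 5, 5, 5]
group2 = [30, 11, 12, 20, 14, 12, 15, 8, 6, 8,
          10, 15, 25, 6, 35, 20, 20, 20, 25, 15]


def mean(scores):
    """Arithmetic average: sum of the scores divided by how many there are."""
    return sum(scores) / len(scores)


def weighted_mean(groups):
    """Combine group means, weighting each mean by its group's size."""
    total_n = sum(len(g) for g in groups)
    return sum(len(g) * mean(g) for g in groups) / total_n


naive = (mean(group1) + mean(group2)) / 2    # ignores group sizes
weighted = weighted_mean([group1, group2])   # accounts for group sizes
pooled = mean(group1 + group2)               # mean of all 25 scores

print(round(naive, 3), round(weighted, 3), round(pooled, 3))
```

The naive average of the two means comes out lower than the pooled mean because the small group of 5 scores is given as much weight as the group of 20.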

The median The median is the score that divides a distribution exactly in half. Exactly 50% of the individuals in a distribution have scores at or below the median. Case1: Odd number of scores in the distribution Step1: put the scores in order Step2: find the middle score Case2: Even number of scores in the distribution Step1: put the scores in order Step2: find the middle two scores Step3: find the arithmetic average of the two middle scores
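The two cases can be sketched as a small Python function. The function name `median` and the example scores are hypothetical and are not taken from the slides.

```python
def median(scores):
    """Return the score that splits the ordered distribution in half."""
    ordered = sorted(scores)           # Step 1: put the scores in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                     # Case 1: odd number of scores
        return ordered[mid]            # the single middle score
    # Case 2: even number of scores, average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_example = median([7, 3, 9])        # middle of 3, 7, 9
even_example = median([8, 2, 6, 4])    # average of 4 and 6
print(odd_example, even_example)
```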

The mode The mode is the score or category that has the greatest frequency. So look at your frequency table or graph and pick the score or category that has the highest frequency. In the first example graph there is a single peak, so the mode is 5. In the second example graph there are two equally high peaks, so the modes are 2 and 8. Note: if one peak were higher than the other, it would be called the major mode and the other would be the minor mode.
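Finding the mode from a frequency count might look like the following sketch. The function name `modes` and both score lists are invented for illustration; they are chosen only so that the results match the modes named on the slide (5, and then 2 and 8).

```python
from collections import Counter


def modes(scores):
    """Return every score that ties for the highest frequency."""
    counts = Counter(scores)           # frequency table: score -> count
    top = max(counts.values())         # the highest frequency
    return sorted(score for score, c in counts.items() if c == top)


one_peak = modes([1, 5, 5, 5, 7])      # hypothetical unimodal data
two_peaks = modes([2, 2, 5, 8, 8])     # hypothetical bimodal data
print(one_peak, two_peaks)
```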

Variability of a distribution Variability provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together. In other words, variability refers to the degree of “differentness” of the scores in the distribution. High variability means that the scores differ by a lot. Low variability means that the scores are all similar.

Standard deviation The standard deviation is the most commonly used measure of variability. The standard deviation measures how far the scores in the distribution are from the mean of the distribution ($\mu$). Essentially, it is the typical distance of a score from the mean.

Computing standard deviation (population) Step 1: Compute the deviation scores: subtract the population mean from every score in the distribution. Our population: 2, 4, 6, 8, with $\mu = 5$. Deviation scores ($X - \mu$): 2 - 5 = -3; 4 - 5 = -1; 6 - 5 = +1; 8 - 5 = +3. Notice that if you add up all of the deviations they must equal 0.

Computing standard deviation (population) Step 2: Get rid of the negative signs. Square the deviations and add them together to compute the sum of the squared deviations (SS). Deviation scores ($X - \mu$): 2 - 5 = -3; 4 - 5 = -1; 6 - 5 = +1; 8 - 5 = +3. $SS = \sum (X - \mu)^2 = (-3)^2 + (-1)^2 + (+1)^2 + (+3)^2 = 9 + 1 + 1 + 9 = 20$

Computing standard deviation (population) Step 3: Compute the variance (the average of the squared deviations). Divide the SS by the number of individuals in the population. Variance: $\sigma^2 = \frac{SS}{N}$

Computing standard deviation (population) Step 4: Compute the standard deviation. Take the square root of the population variance. Standard deviation: $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{SS}{N}}$

Computing standard deviation (population) To review: Step 1: compute deviation scores. Step 2: compute the SS, $SS = \sum (X - \mu)^2$. Step 3: determine the variance: take the average of the squared deviations by dividing the SS by N. Step 4: determine the standard deviation: take the square root of the variance.
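The four steps can be traced in Python using the population from the slides (2, 4, 6, 8). The variable names are illustrative only.

```python
import math

population = [2, 4, 6, 8]
N = len(population)
mu = sum(population) / N                       # population mean

# Step 1: deviation scores (X - mu)
deviations = [x - mu for x in population]

# Step 2: sum of the squared deviations (SS)
ss = sum(d ** 2 for d in deviations)

# Step 3: variance, the average squared deviation (divide SS by N)
variance = ss / N

# Step 4: standard deviation, the square root of the variance
sigma = math.sqrt(variance)

print(deviations, ss, variance, round(sigma, 2))
```

For this population the deviations sum to 0, SS is 20, the variance is 20 / 4 = 5, and the standard deviation is the square root of 5, about 2.24.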

Computing standard deviation (sample) The basic procedure is the same. Step 1: compute deviation scores. Step 2: compute the SS. Step 3: determine the variance (this step is different). Step 4: determine the standard deviation.

Computing standard deviation (sample) Step 1: Compute the deviation scores: subtract the sample mean from every score in the distribution. Our sample: 2, 4, 6, 8, with $\bar{X} = 5$. Deviation scores ($X - \bar{X}$): 2 - 5 = -3; 4 - 5 = -1; 6 - 5 = +1; 8 - 5 = +3.

Computing standard deviation (sample) Step 2: Determine the sum of the squared deviations (SS). Deviation scores ($X - \bar{X}$): 2 - 5 = -3; 4 - 5 = -1; 6 - 5 = +1; 8 - 5 = +3. $SS = \sum (X - \bar{X})^2 = (-3)^2 + (-1)^2 + (+1)^2 + (+3)^2 = 9 + 1 + 1 + 9 = 20$. Apart from notational differences, the procedure is the same as before.

Computing standard deviation (sample) Step 3: Determine the variance. Recall: population variance $\sigma^2 = \frac{SS}{N}$. The variability of a sample is typically smaller than the population’s variability. To correct for this we divide by (n - 1) instead of just n. Sample variance: $s^2 = \frac{SS}{n - 1}$

Computing standard deviation (sample) Step 4: Determine the standard deviation. Take the square root of the sample variance. Standard deviation: $s = \sqrt{s^2} = \sqrt{\frac{SS}{n - 1}}$
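A short Python sketch of the sample version, using the sample from the slides (2, 4, 6, 8). It is cross-checked against the standard library's `statistics.stdev` (which divides by n - 1) and `statistics.pstdev` (which divides by N). The variable names are illustrative.

```python
import math
import statistics

sample = [2, 4, 6, 8]
n = len(sample)
x_bar = sum(sample) / n                         # sample mean

ss = sum((x - x_bar) ** 2 for x in sample)      # Steps 1 and 2: SS
s2 = ss / (n - 1)                               # Step 3: divide by n - 1
s = math.sqrt(s2)                               # Step 4: square root

# Library cross-check: stdev uses n - 1, pstdev uses N.
lib_sample_sd = statistics.stdev(sample)
lib_population_sd = statistics.pstdev(sample)

print(round(s2, 2), round(s, 2), round(lib_population_sd, 2))
```

With SS = 20 and n = 4, the sample variance is 20 / 3, about 6.67, and the sample standard deviation is about 2.58, which is larger than the 2.24 obtained when the same four scores are treated as a whole population.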

Properties of means and standard deviations Change/add/delete a given score: the mean changes; the standard deviation changes. Doing this changes the total and possibly the number of scores, which changes the mean and the standard deviation.

Properties of means and standard deviations Change/add/delete a given score: the mean changes; the standard deviation changes. Add/subtract a constant to each score: the mean changes; the standard deviation does not change. All of the scores change by the same constant, but so does the mean. It is as if you just pick up the distribution and move it over, but the spread (variability) stays the same.

Properties of means and standard deviations Multiply/divide each score by a constant: the mean changes; the standard deviation changes. Original scores: 21 and 23, with $\bar{X} = 22$. Deviations: 21 - 22 = -1; 23 - 22 = +1. $s_{old} = \sqrt{\frac{(-1)^2 + (+1)^2}{2 - 1}} = \sqrt{2} \approx 1.41$

Properties of means and standard deviations Multiply scores by 2: new scores 42 and 46, with $\bar{X} = 44$. Deviations: 42 - 44 = -2; 46 - 44 = +2. $s_{new} = \sqrt{\frac{(-2)^2 + (+2)^2}{2 - 1}} = \sqrt{8} \approx 2.83$, which is 2 times $s_{old} = 1.41$. Summary: change/add/delete a given score: mean changes, standard deviation changes. Add/subtract a constant to each score: mean changes, standard deviation does not change. Multiply/divide each score by a constant: mean changes, standard deviation changes.
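These properties can be checked numerically. The sketch below assumes the slide's example consists of the two scores 21 and 23 (which is consistent with the reported standard deviation of 1.41); the added constant of 10 is an arbitrary choice for illustration.

```python
import statistics

scores = [21, 23]                       # assumed example scores
shifted = [x + 10 for x in scores]      # add a constant (10 is arbitrary)
scaled = [x * 2 for x in scores]        # multiply each score by 2

mean_old = statistics.mean(scores)
sd_old = statistics.stdev(scores)       # sample SD (divides by n - 1)

mean_shifted = statistics.mean(shifted)
sd_shifted = statistics.stdev(shifted)  # spread is unchanged by a shift

mean_scaled = statistics.mean(scaled)
sd_scaled = statistics.stdev(scaled)    # spread is multiplied by the constant

print(mean_old, round(sd_old, 2))
print(mean_shifted, round(sd_shifted, 2))
print(mean_scaled, round(sd_scaled, 2))
```

Adding the constant moves the mean from 22 to 32 and leaves the standard deviation at about 1.41. Multiplying by 2 moves the mean to 44 and doubles the standard deviation to about 2.83.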

Locating a score Where is our raw score within the distribution? The natural choice of reference is the mean (since it is usually easy to find). So we’ll subtract the mean from the score (find the deviation score). The direction is given by the negative or positive sign on the deviation score. The distance is the size (absolute value) of the deviation score.

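Locating a score Reference point: the mean, $\mu = 100$. Example: $X_1 = 162$, so the deviation score is $X_1 - 100 = +62$. Direction: the positive sign means the score is above the mean (a negative sign would mean below). Distance: 62 units from the mean.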

Transforming a score The distance is the value of the deviation score. However, this distance is measured in the units of measurement of the score. Convert the score to a standard (neutral) score, in this case a z-score: $z = \frac{X - \mu}{\sigma}$, where $X$ is the raw score, $\mu$ is the population mean, and $\sigma$ is the population standard deviation.

Transforming scores A z-score specifies the precise location of each X value within a distribution. Direction: the sign of the z-score (+ or -) signifies whether the score is above the mean or below the mean. Distance: the numerical value of the z-score specifies the distance from the mean by counting the number of standard deviations between X and $\mu$. Examples with $\mu = 100$ and $\sigma = 50$: $X_1 = 162$ gives $z = \frac{162 - 100}{50} = +1.24$; $X_2 = 57$ gives $z = \frac{57 - 100}{50} = -0.86$.
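As a minimal sketch, the z-score transformation is a one-line function. The function name `z_score` is illustrative; the mean of 100 and standard deviation of 50 are the values used in the slide's examples.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma


z1 = z_score(162, mu=100, sigma=50)   # (162 - 100) / 50
z2 = z_score(57, mu=100, sigma=50)    # (57 - 100) / 50

print(round(z1, 2), round(z2, 2))
```

Here (162 - 100) / 50 works out to 1.24 and (57 - 100) / 50 works out to -0.86, so the first score sits a little more than one standard deviation above the mean and the second a little less than one below it.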

Transforming a distribution We can transform all of the scores in a distribution. We can transform any and all observations to z-scores if we know the distribution’s mean and standard deviation. We call this transformed distribution a standardized distribution. Standardized distributions are used to make dissimilar distributions comparable (e.g., your height and weight). One of the most common standardized distributions is the z-distribution.

Properties of the z-score distribution Transformation from raw scores (with $\mu = 100$ and $\sigma = 50$) to z-scores: the mean, X = 100, becomes z = 0; one standard deviation above the mean, X = 150, becomes z = +1; one standard deviation below the mean, X = 50, becomes z = -1.

Properties of the z-score distribution Shape - the shape of the z-score distribution will be exactly the same as the original distribution of raw scores. Every score stays in the exact same position relative to every other score in the distribution. Mean - when raw scores are transformed into z-scores, the mean will always = 0. The standard deviation - when any distribution of raw scores is transformed into z-scores the standard deviation will always = 1.
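These three properties can be verified on a small example. The sketch below reuses the population 2, 4, 6, 8 from the earlier slides and standardizes with the population standard deviation (`statistics.pstdev`); the function name `standardize` is made up for this example.

```python
import statistics


def standardize(scores):
    """Convert every raw score to a z-score using the population mean and SD."""
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    return [(x - mu) / sigma for x in scores]


raw = [2, 4, 6, 8]
z = standardize(raw)

z_mean = statistics.fmean(z)      # should be 0 (up to rounding error)
z_sd = statistics.pstdev(z)       # should be 1 (up to rounding error)

print([round(v, 2) for v in z], round(z_mean, 2), round(z_sd, 2))
```

The z-scores keep the same order and relative spacing as the raw scores, which is what preserving the shape means here.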

From z to raw score We can also transform a z-score back into a raw score if we know the mean and standard deviation of the original distribution: $X = z\sigma + \mu$. Example with $\mu = 100$ and $\sigma = 50$: $z = -0.60$ gives $X = (-0.60)(50) + 100 = 70$.
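The reverse transformation can be sketched the same way. The function names `raw_from_z` and `z_from_raw` are illustrative; the mean of 100 and standard deviation of 50 come from the slide's example.

```python
def raw_from_z(z, mu, sigma):
    """Convert a z-score back to a raw score: X = z * sigma + mu."""
    return z * sigma + mu


def z_from_raw(x, mu, sigma):
    """Forward transformation, included to show the round trip."""
    return (x - mu) / sigma


x = raw_from_z(-0.60, mu=100, sigma=50)       # (-0.60)(50) + 100
round_trip = z_from_raw(x, mu=100, sigma=50)  # back to the original z

print(round(x, 2), round(round_trip, 2))
```

Converting z = -0.60 gives a raw score of 70, and converting 70 back gives -0.60 again, so the two transformations undo each other.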