Describing Distributions Statistics for the Social Sciences Psychology 340 Spring 2010.


PSY 340 Statistics for the Social Sciences
Announcements
- Homework #1: will be accepted on Th (Jan 21) without penalty
- Quiz problems
  - Quiz 1 is now posted; due date extended to Tu, Jan 26th (by 11:00)
- Don’t forget: Homework 2 is due Tu (Jan 26)

Outline (for week)
- Characteristics of distributions
  - Finishing up using graphs
  - Using numbers (center and variability)
- Descriptive statistics decision tree
- Locating scores: z-scores and other transformations

Distributions
- Three basic characteristics are used to describe distributions
  - Shape
    - Many different ways to display a distribution: frequency distribution table, graphs
  - Center
  - Variability

Shapes of Frequency Distributions
- Unimodal, bimodal, and rectangular

Shapes of Frequency Distributions
- Symmetrical and skewed distributions (positively skewed, negatively skewed)
- Normal and kurtotic distributions

Frequency Graphs
- Histogram
  - Plot the different values against the frequency of each value

Frequency Graphs
- Histogram by hand
  - Step 1: make a frequency distribution table (may use grouped frequency tables)
  - Step 2: put the values along the bottom, left to right, lowest to highest
  - Step 3: make a scale of frequencies along the left edge
  - Step 4: make a bar above each value with a height equal to the frequency of that value
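The by-hand steps above can be sketched in a few lines of Python. This is only an illustration: the scores are invented, the function names are hypothetical, and the bars are drawn sideways (values down the left, frequencies running to the right) because that is easier to print as text.

```python
from collections import Counter


def frequency_table(scores):
    """Step 1: count how often each value occurs, ordered lowest to highest."""
    counts = Counter(scores)
    return sorted(counts.items())


def text_histogram(scores):
    """Steps 2-4: one row per value, with a bar as long as its frequency."""
    rows = []
    for value, freq in frequency_table(scores):
        rows.append(f"{value:>3} | {'#' * freq} ({freq})")
    return "\n".join(rows)


# Hypothetical scores, invented for this illustration.
scores = [3, 5, 4, 5, 2, 5, 4, 3, 5, 4]
print(text_histogram(scores))
```

For many distinct values, a grouped frequency table (counting scores per interval rather than per value) would play the same role as Step 1.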

Frequency Graphs
- Histogram using SPSS (create one for class height)
  - Graphs -> Legacy -> histogram
  - Enter your variable into ‘variable’
  - To change interval width, double click the graph to get into the chart editor, and then double click the bottom axis. Click on ‘scale’ and change the intervals to the desired widths
  - Note: you can also get one from the descriptive statistics frequency menu under the ‘charts’ option

Frequency Graphs
- Frequency polygon: essentially the same as a histogram, but uses lines instead of bars

Displaying two variables
- Bar graphs
  - Can be used in a number of ways (including displaying one or more variables)
  - Best used for categorical variables
- Scatterplots
  - Best used for continuous variables

Bar graphs
- Plot a bar graph of men and women in the class
  - Graphs -> bar
  - Simple, click ‘define’
  - N-cases (the default)
  - Enter Gender into Category axis, click ‘okay’

Bar graphs
- Plot a bar graph of shoes in closet crossed with men and women
  - What should we plot? (and why?) Average number of shoes for each group?
  - Graphs -> bar
  - Simple, click ‘define’
  - Other statistic (default is ‘mean’): enter pairs of shoes
  - Enter Gender into Category axis, click ‘okay’

Scatterplot
- Useful for seeing the relationship between the variables
  - Graphs -> Legacy Dialogs
  - Scatter/Dot
  - Simple Scatter, click ‘define’
  - Enter your X & Y variables, click ‘okay’
- Can add a ‘fit line’ in the chart editor
- Plot a scatterplot of soda and bottled water drinking

Describing distributions
- Distributions are typically described with three properties:
  - Shape: unimodal, symmetric, skewed, etc.
  - Center: mean, median, mode
  - Spread (variability): standard deviation, variance

Which center when?
- Depends on a number of factors, like scale of measurement and shape.
  - The mean is the most preferred measure, and it is closely related to measures of variability
  - However, there are times when the mean isn’t the appropriate measure.

Which center when?
- Use the median if:
  - The distribution is skewed
  - The distribution is ‘open-ended’ (e.g., your top answer on your questionnaire is ‘5 or more’)
  - Data are on an ordinal scale (rankings)
- Use the mode if:
  - The data are on a nominal scale
  - The distribution is multi-modal

The Mean
- The most commonly used measure of center
- The arithmetic average
- Computing the mean
  - The formula for the population mean (a parameter) is: μ = ΣX / N
    - Add up all of the X’s
    - Divide by the total number in the population
  - The formula for the sample mean (a statistic) is: X̄ = ΣX / n
    - Add up all of the X’s
    - Divide by the total number in the sample
- Note: your book uses ‘M’ to denote the mean in formulas

The Mean
- Number of shoes (two groups):
  - 5, 7, 5, 5, 5
  - 30, 11, 12, 20, 14, 12, 15, 8, 6, 8, 10, 15, 25, 6, 35, 20, 20, 20, 25, 15
- Suppose we want the mean of the entire group. Can we simply add the two means together and divide by 2?
- NO. Why not?

The Weighted Mean
- Number of shoes:
  - 5, 7, 5, 5, 5, 30, 11, 12, 20, 14, 12, 15, 8, 6, 8, 10, 15, 25, 6, 35, 20, 20, 20, 25, 15
- Suppose we want the mean of the entire group. Can we simply add the two means together and divide by 2?
- NO. Why not? We need to take into account the number of scores in each mean:
  - weighted mean = (n₁M₁ + n₂M₂) / (n₁ + n₂)

The Weighted Mean
- Number of shoes:
  - 5, 7, 5, 5, 5, 30, 11, 12, 20, 14, 12, 15, 8, 6, 8, 10, 15, 25, 6, 35, 20, 20, 20, 25, 15
- Let’s check:
  - Weighted mean of the two group means: (5 × 5.4 + 20 × 16.35) / 25 = 354 / 25 = 14.16
  - Mean of all 25 scores: ΣX / n = 354 / 25 = 14.16
- Both ways give the same answer
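The same check can be run in Python. The two lists are the shoe counts from the slides, split into the two groups shown on the earlier slide; the function names are made up for this sketch.

```python
# Shoe counts from the slides, split into the two groups.
group_1 = [5, 7, 5, 5, 5]
group_2 = [30, 11, 12, 20, 14, 12, 15, 8, 6, 8,
           10, 15, 25, 6, 35, 20, 20, 20, 25, 15]


def mean(scores):
    """Arithmetic average: add up the scores and divide by how many there are."""
    return sum(scores) / len(scores)


def weighted_mean(groups):
    """Weight each group's mean by the number of scores in that group."""
    total_n = sum(len(g) for g in groups)
    return sum(len(g) * mean(g) for g in groups) / total_n


naive = (mean(group_1) + mean(group_2)) / 2    # ignores the group sizes
weighted = weighted_mean([group_1, group_2])   # takes the group sizes into account
pooled = mean(group_1 + group_2)               # mean of all 25 scores at once

print(f"naive average of means: {naive:.3f}")
print(f"weighted mean:          {weighted:.3f}")
print(f"mean of all scores:     {pooled:.3f}")
```

The naive average comes out lower because it gives the 5-score group as much say as the 20-score group.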

The median
- The median is the score that divides a distribution exactly in half. Exactly 50% of the individuals in a distribution have scores at or below the median.
  - Case 1: Odd number of scores in the distribution
    - Step 1: put the scores in order
    - Step 2: find the middle score
  - Case 2: Even number of scores in the distribution
    - Step 1: put the scores in order
    - Step 2: find the middle two scores
    - Step 3: find the arithmetic average of the two middle scores
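A minimal Python sketch of the two cases follows. The function name is hypothetical, and the example lists are just convenient illustrations (the first is the small shoe group from the earlier slide).

```python
def median(scores):
    """Return the median using the odd/even cases described above."""
    if not scores:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(scores)        # Step 1: put the scores in order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Case 1 (odd n): the single middle score
        return ordered[mid]
    # Case 2 (even n): average of the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 7, 5, 5, 5]))   # odd number of scores
print(median([2, 4, 6, 8]))      # even number of scores
```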

The mode
- The mode is the score or category that has the greatest frequency.
  - So look at your frequency table or graph and pick the value that has the highest frequency.
  - Example with one peak: the mode is 5
  - Example with two equally high peaks: the modes are 2 and 8
- Note: if one peak were bigger than the other, it would be called the major mode and the other would be the minor mode
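Picking the mode off a frequency table can be sketched as follows. The data are invented so that the answers match the slide's two examples (a single mode of 5, and modes of 2 and 8); the function name is an assumption of this sketch.

```python
from collections import Counter


def modes(scores):
    """Return every value that shares the highest frequency, in sorted order."""
    counts = Counter(scores)                 # the frequency table
    highest = max(counts.values())           # the tallest bar
    return sorted(value for value, freq in counts.items() if freq == highest)


# Hypothetical data, chosen to reproduce the slide's answers.
one_peak = [3, 4, 5, 5, 5, 6, 7]
two_peaks = [2, 2, 2, 5, 8, 8, 8]

print(modes(one_peak))    # a single mode
print(modes(two_peaks))   # two modes (bimodal)
```

This sketch only reports ties for the highest frequency, so it does not distinguish a major mode from a minor mode.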

Describing distributions
- Distributions are typically described with three properties:
  - Shape: unimodal, symmetric, skewed, etc.
  - Center: mean, median, mode
  - Spread (variability): standard deviation, variance

Variability of a distribution
- Variability provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together.
  - In other words, variability refers to the degree of “differentness” of the scores in the distribution.
- High variability means that the scores differ by a lot
- Low variability means that the scores are all similar

Standard deviation
- The standard deviation is the most commonly used measure of variability.
  - The standard deviation measures how far off all of the scores in the distribution are from the mean of the distribution.
  - Essentially, the average of the deviations.

Computing standard deviation (population)
- Step 1: Compute the deviation scores: subtract the population mean from every score in the distribution.
  - Our population: 2, 4, 6, 8 (μ = 5)
  - X - μ = deviation scores
    - 2 - 5 = -3
    - 4 - 5 = -1
    - 6 - 5 = +1
    - 8 - 5 = +3
  - Notice that if you add up all of the deviations they must equal 0.

Computing standard deviation (population)
- Step 2: Get rid of the negative signs. Square the deviations and add them together to compute the sum of the squared deviations (SS).
  - X - μ = deviation scores: -3, -1, +1, +3
  - SS = Σ(X - μ)² = (-3)² + (-1)² + (+1)² + (+3)² = 9 + 1 + 1 + 9 = 20

Computing standard deviation (population)
- Step 3: Compute the variance (the average of the squared deviations)
  - Divide by the number of individuals in the population.
  - variance = σ² = SS/N
- Note: your book uses ‘SD²’ to denote the variance in formulas

Computing standard deviation (population)
- Step 4: Compute the standard deviation.
  - Take the square root of the population variance.
  - standard deviation = σ = √σ² = √(SS/N)
- Note: your book uses ‘SD’ to denote the standard deviation in formulas

Computing standard deviation (population)
- To review:
  - Step 1: compute deviation scores
  - Step 2: compute the SS
    - SS = Σ(X - μ)²
  - Step 3: determine the variance
    - take the average of the squared deviations
    - divide the SS by the N
  - Step 4: determine the standard deviation
    - take the square root of the variance
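The four review steps map directly onto a short Python function. This is a sketch with a made-up function name; the example population 2, 4, 6, 8 is the one implied by the deviation scores -3, -1, +1, +3 on the slides.

```python
import math


def population_sd_steps(scores):
    """Walk through the four steps and return every intermediate result."""
    n = len(scores)
    mu = sum(scores) / n                       # population mean
    deviations = [x - mu for x in scores]      # Step 1: deviation scores
    ss = sum(d ** 2 for d in deviations)       # Step 2: sum of squared deviations
    variance = ss / n                          # Step 3: divide SS by N
    sd = math.sqrt(variance)                   # Step 4: square root of the variance
    return deviations, ss, variance, sd


deviations, ss, variance, sd = population_sd_steps([2, 4, 6, 8])
print(deviations)                   # [-3.0, -1.0, 1.0, 3.0]
print(ss, variance, round(sd, 2))   # 20.0 5.0 2.24
```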

Computing standard deviation (sample)
- The basic procedure is the same.
  - Step 1: compute deviation scores
  - Step 2: compute the SS
  - Step 3: determine the variance (this step is different)
  - Step 4: determine the standard deviation

Computing standard deviation (sample)
- Step 1: Compute the deviation scores: subtract the sample mean from every score in the distribution.
  - Our sample: 2, 4, 6, 8 (X̄ = 5)
  - X - X̄ = deviation scores
    - 2 - 5 = -3
    - 4 - 5 = -1
    - 6 - 5 = +1
    - 8 - 5 = +3

Computing standard deviation (sample)
- Step 2: Determine the sum of the squared deviations (SS).
  - X - X̄ = deviation scores: -3, -1, +1, +3
  - SS = Σ(X - X̄)² = (-3)² + (-1)² + (+1)² + (+3)² = 9 + 1 + 1 + 9 = 20
- Apart from notational differences, the procedure is the same as before

Computing standard deviation (sample)
- Step 3: Determine the variance
  - Recall: population variance = σ² = SS/N
  - The variability of the samples is typically smaller than the population’s variability
  - To correct for this we divide by (n - 1) instead of just n
  - Sample variance = s² = SS/(n - 1)

Computing standard deviation (sample)
- Step 4: Determine the standard deviation
  - Take the square root of the sample variance.
  - standard deviation = s = √s² = √(SS/(n - 1))
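As a rough check on the sample procedure, the sketch below computes s by hand and compares it with Python's standard `statistics` module, where `stdev` divides by n - 1 and `pstdev` divides by N. The function name `sample_sd` and the choice of example data (the scores 2, 4, 6, 8 implied by the slides' deviation scores) are assumptions of this sketch.

```python
import math
import statistics


def sample_sd(scores):
    """Sample standard deviation: same steps as before, but divide SS by n - 1."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to divide by n - 1")
    m = sum(scores) / n                        # sample mean
    ss = sum((x - m) ** 2 for x in scores)     # Steps 1-2: sum of squared deviations
    variance = ss / (n - 1)                    # Step 3: the step that is different
    return math.sqrt(variance)                 # Step 4


sample = [2, 4, 6, 8]
print(round(sample_sd(sample), 2))             # 2.58
print(round(statistics.stdev(sample), 2))      # 2.58 (library version, n - 1)
print(round(statistics.pstdev(sample), 2))     # 2.24 (population formula, N)
```

Dividing by n - 1 gives a slightly larger value than dividing by n, which is the correction described on the slide.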

Properties of means and standard deviations
- Change/add/delete a given score: the mean changes; the standard deviation changes
  - This changes the total and/or the number of scores, which will change the mean and the standard deviation

Properties of means and standard deviations
- Add/subtract a constant to each score: the mean changes; the standard deviation does not change
  - All of the scores change by the same constant.
  - But so does the mean
  - It is as if you just pick up the distribution and move it over (from X old to X new), but the spread (variability) stays the same

Properties of means and standard deviations
- Multiply/divide each score by a constant: the mean changes; the standard deviation changes
  - Example: two scores with deviations of -1 and +1 from the mean
    - SS = (-1)² + (+1)² = 2, so s = √(2/1) = 1.41
  - Multiply the scores by 2: the deviations become -2 and +2
    - SS = (-2)² + (+2)² = 8, so s = √(8/1) = 2.83, which is 2 × s old (1.41)
- Summary:
  - Change/add/delete a given score: mean changes; standard deviation changes
  - Add/subtract a constant to each score: mean changes; standard deviation does not change
  - Multiply/divide each score by a constant: mean changes; standard deviation changes
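These properties are easy to see numerically. The sketch below uses an illustrative set of scores and two arbitrary constants (add 10, multiply by 2), none of which come from the slides, and relies on `statistics.mean` and `statistics.stdev` from Python's standard library.

```python
import statistics

# Illustrative scores and constants, chosen only for this demonstration.
scores = [2, 4, 6, 8]
shifted = [x + 10 for x in scores]   # add a constant to each score
scaled = [x * 2 for x in scores]     # multiply each score by a constant


def describe(xs):
    """Return (mean, sample standard deviation) for a list of scores."""
    return statistics.mean(xs), statistics.stdev(xs)


for label, xs in [("original", scores), ("plus 10", shifted), ("times 2", scaled)]:
    m, s = describe(xs)
    print(f"{label:>8}: mean = {m:.2f}, s = {s:.2f}")
```

Adding 10 moves the mean by 10 and leaves s alone; multiplying by 2 doubles both the mean and s.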

Locating a score
- Where is our raw score within the distribution?
  - The natural choice of reference is the mean (since it is usually easy to find).
    - So we’ll subtract the mean from the score (find the deviation score).
  - The direction will be given to us by the negative or positive sign on the deviation score
  - The distance is the value of the deviation score

Locating a score
- Reference point: the mean, μ = 100
- Direction:
  - X₁ = 162: X₁ - μ = +62 (above the mean)
  - X₂ = 57: X₂ - μ = -43 (below the mean)

Transforming a score
- The distance is the value of the deviation score
  - However, this distance is measured with the units of measurement of the score.
  - Convert the score to a standard (neutral) score. In this case a z-score.
- z = (X - μ) / σ
  - X: raw score
  - μ: population mean
  - σ: population standard deviation

Transforming scores
- Scores: X₁ = 162, X₂ = 57
- A z-score specifies the precise location of each X value within a distribution.
  - Direction: The sign of the z-score (+ or -) signifies whether the score is above the mean or below the mean.
  - Distance: The numerical value of the z-score specifies the distance from the mean by counting the number of standard deviations between X and μ.
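A small Python sketch of the z-score transformation for the two scores above. The mean of 100 is the one implied by the deviation scores +62 and -43; the standard deviation of 50 is an assumption borrowed from the later slide where one standard deviation above a mean of 100 is 150. The function name is hypothetical.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mu) / sigma


MU = 100     # implied by the deviation scores on the slides
SIGMA = 50   # assumed here, taken from the later mean = 100, +1 SD = 150 example

for x in (162, 57):
    z = z_score(x, MU, SIGMA)
    direction = "above" if z > 0 else "below"
    print(f"X = {x}: z = {z:+.2f} ({abs(z):.2f} standard deviations {direction} the mean)")
```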

Transforming a distribution
- We can transform all of the scores in a distribution
  - We can transform any and all observations to z-scores if we know the distribution’s mean and standard deviation.
  - We call this transformed distribution a standardized distribution.
    - Standardized distributions are used to make dissimilar distributions comparable (e.g., your height and weight)
    - One of the most common standardized distributions is the z-distribution.

Properties of the z-score distribution
- Transformation from raw scores to z-scores:
  - The mean, X = 100, becomes z = 0
  - One standard deviation above the mean, X = 150, becomes z = +1
  - One standard deviation below the mean, X = 50, becomes z = -1

Properties of the z-score distribution
- Shape: the shape of the z-score distribution will be exactly the same as the original distribution of raw scores. Every score stays in the exact same position relative to every other score in the distribution.
- Mean: when raw scores are transformed into z-scores, the mean will always = 0.
- Standard deviation: when any distribution of raw scores is transformed into z-scores, the standard deviation will always = 1.
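The three properties can be checked on any set of scores. The sketch below standardizes an invented list using the population formulas (`statistics.fmean` and `statistics.pstdev` from the standard library); the data and the function name are assumptions made for illustration.

```python
import statistics


def standardize(scores):
    """Convert every raw score to a z-score using the population mean and SD."""
    mu = statistics.fmean(scores)
    sigma = statistics.pstdev(scores)
    if sigma == 0:
        raise ValueError("all scores are identical, so z-scores are undefined")
    return [(x - mu) / sigma for x in scores]


# Hypothetical raw scores, invented for this check.
raw = [57, 80, 100, 110, 162, 91]
z = standardize(raw)

print(round(statistics.fmean(z), 6))    # mean of the z-scores (0, up to rounding)
print(round(statistics.pstdev(z), 6))   # standard deviation of the z-scores (1)

# Shape: the scores keep their order relative to one another.
rank = lambda xs: sorted(range(len(xs)), key=xs.__getitem__)
print(rank(raw) == rank(z))
```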

From z to raw score
- We can also transform a z-score back into a raw score if we know the mean and standard deviation of the original distribution.
- X = μ + zσ
- Example: with μ = 100, σ = 50, and z = -0.60
  - X = 100 + (-0.60)(50)
  - X = 70
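The reverse transformation is a one-liner, sketched below with the slide's numbers (mean 100, standard deviation 50, z = -0.60). Both function names are hypothetical; the round trip at the end simply confirms that the two transformations undo each other.

```python
import math


def z_score(x, mu, sigma):
    """Raw score to z-score: z = (X - mu) / sigma."""
    return (x - mu) / sigma


def raw_score(z, mu, sigma):
    """z-score back to raw score: X = mu + z * sigma."""
    return mu + z * sigma


x = raw_score(-0.60, mu=100, sigma=50)
print(round(x, 2))   # 70.0

# Round trip: converting the raw score back should recover the original z.
print(math.isclose(z_score(x, 100, 50), -0.60))   # True
```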

Why transform distributions?
- Known properties
  - Shape: the shape of the z-score distribution will be exactly the same as the original distribution of raw scores. Every score stays in the exact same position relative to every other score in the distribution.
  - Mean: when raw scores are transformed into z-scores, the mean will always = 0.
  - Standard deviation: when any distribution of raw scores is transformed into z-scores, the standard deviation will always = 1.
- Can use these known properties to locate scores relative to the entire distribution
  - Area under the curve corresponds to proportions (or probabilities)