1 Descriptive Statistics Chapter 3 MSIS 111 Prof. Nick Dedeke.

Presentation transcript:

1 Descriptive Statistics Chapter 3 MSIS 111 Prof. Nick Dedeke

2 Objectives
- Define measures of central tendency, variability, shape and association
- Define statistical measures
- Compute statistical measures for ungrouped and grouped data
- Interpret statistical results

3 Introduction
In most competitive sports, one looks at the positions of the athletes, e.g. who came in first, second, and so on. In statistics, one is interested in the following measures:
- most frequent value in the data set
- summary of all values in the data set
- midpoint position of the data set
- positions of data in the data set
- distances to the midpoint of the data set

4 Exercise: Statistical Measure 1
We want to find out which of the following two students is the better one using the available data. The data shows the positions of the two competitors in several rounds of testing.
Kuli:  1st 2nd 1st 2nd 1st 4th 3rd 3rd 2nd 5th 1st
Marti: 3rd 2nd 3rd 2nd 2nd 1st 1st 1st 3rd 2nd 1st

5 Response: Commonsense Approach
We want to find out which of the following two students is the better one using the available data.
Kuli:  1st 2nd 1st 2nd 1st 4th 3rd 3rd 2nd 5th 1st
Marti: 3rd 2nd 3rd 2nd 2nd 1st 1st 1st 3rd 2nd 1st
- 3 times Kuli was 1st and Marti was behind
- 3 times Marti was 1st and Kuli was behind
- Marti had more 2nd places
- Marti had more 3rd places
Imagine that you had a data set with 500 values!

6 Mode
- The most frequently occurring value in a data set
- Applicable to all levels of data measurement (nominal, ordinal, interval, and ratio)
- Bimodal: data sets that have two modes
- Multimodal: data sets that contain more than two modes
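A minimal Python sketch of finding the mode(s), assuming the placements from the commonsense-approach slide are coded as integers (1 = 1st place, 2 = 2nd place, and so on). The helper name `modes` is illustrative and does not come from the slides; `statistics.multimode` is part of the Python standard library (3.8 and later).

```python
from statistics import multimode

# Placements coded as integers (assumption: 1 = 1st place, 2 = 2nd place, ...).
kuli = [1, 2, 1, 2, 1, 4, 3, 3, 2, 5, 1]
marti = [3, 2, 3, 2, 2, 1, 1, 1, 3, 2, 1]


def modes(values):
    """Return every most frequently occurring value, sorted ascending."""
    return sorted(multimode(values))


kuli_modes = modes(kuli)    # one mode
marti_modes = modes(marti)  # two modes, i.e. a bimodal data set
print("Kuli:", kuli_modes, "Marti:", marti_modes)
```

Returning a list rather than a single value keeps bimodal and multimodal data sets visible instead of silently picking one mode.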

7 Median
- Middle value in an ordered array of numbers
- Applicable for ordinal, interval, and ratio data
- Not applicable for nominal data
- Unaffected by extremely large and extremely small values

8 Median: Computational Procedure
First Procedure
- Arrange the observations in an ordered array.
- If there is an odd number of terms, the median is the middle term of the ordered array.
- If there is an even number of terms, the median is the average of the middle two terms.
Second Procedure
- The median's position in an ordered array is given by (n+1)/2.
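The two procedures can be combined in a short Python sketch: sort the data, compute the position (n+1)/2, and average the two neighbouring terms when the position ends in .5. The function name `median_by_position` and the small example lists are illustrative assumptions, not data from the slides.

```python
def median_by_position(values):
    """Median via the (n + 1) / 2 position rule on an ordered array."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    position = (n + 1) / 2      # 1-based position of the median
    lower = int(position)       # whole part of the position
    if position == lower:       # odd n: the median is the middle term
        return ordered[lower - 1]
    # even n: average the two middle terms
    return (ordered[lower - 1] + ordered[lower]) / 2


# Hypothetical data, chosen only to exercise both branches.
odd_example = [7, 3, 9]        # ordered: 3, 7, 9
even_example = [4, 1, 3, 2]    # ordered: 1, 2, 3, 4
print(median_by_position(odd_example), median_by_position(even_example))
```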

9 Median: Odd Number Example (Long method)
Ordered Array
- There are 17 terms in the ordered array.
- Position of median = (n+1)/2 = (17+1)/2 = 9
- The median is the 9th term, which is 15.
- If the 22 is replaced by 100, the median is still 15.
- If the 3 is replaced by -103, the median is still 15.

10 Median: Even Number Example (Long Method)
Ordered Array
- There are 16 terms in the ordered array.
- Position of median = (n+1)/2 = (16+1)/2 = 8.5
- The median is between the 8th and 9th terms: 14.5.
NOTE
- If the 21 is replaced by 100, the median is 14.5.
- If the 3 is replaced by -88, the median is 14.5.

11 Arithmetic Mean
- Commonly called 'the mean'
- Is the average of a group of numbers
- Applicable for interval and ratio data
- Not applicable for nominal or ordinal data
- Affected by each value in the data set, including extreme values
- Computed by summing all values in the data set and dividing the sum by the number of values in the data set

12 Population Mean (Long method) Data for total population: 57, 57, 86, 86, 42, 42, 43, 56, 57, 42, 42, 43

13 Computing Sample Mean (Long method)
The population mean is not the same thing as the sample mean! Our numbers (57, 86, 42) are a sample drawn from the population and hence only a small segment of it.
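A short Python sketch of the long method for both means, using the population data from slide 12 and the sample (57, 86, 42) from slide 13. The variable names are illustrative; the arithmetic is simply the sum divided by the count.

```python
# Population data from slide 12 and the sample drawn from it on slide 13.
population = [57, 57, 86, 86, 42, 42, 43, 56, 57, 42, 42, 43]
sample = [57, 86, 42]

# Long method: add up every value, then divide by the number of values.
population_mean = sum(population) / len(population)  # 653 / 12
sample_mean = sum(sample) / len(sample)              # 185 / 3

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```

The two results differ, which illustrates the point of the slide: a sample mean only estimates the population mean.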

14 Computing Central Tend. Measures using Frequency Tables (Compact method)
Data for total population: 55, 55, 60, 100, 100, 100, 125, 125, 125, 125, 125, 140, 140, 140, 140
X_i   F_i   F_i * X_i
55    2     110
60    1     60
100   3     300
125   5     625
140   4     560
Σ     15    1655
Mean = Σ(F_i * X_i) / Σ F_i = 1655/15 = 110.33
Mode = 125
Median position = (15+1)/2 = 8th
Median value = 125
THIS IS THE TYPE OF APPROACH YOU NEED TO MASTER FOR YOUR EXAM.
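The compact method can be sketched in Python by storing the frequency table as a dictionary that maps each value X_i to its frequency F_i. The function names below are illustrative assumptions; the median lookup walks the cumulative frequencies until it reaches the (n+1)/2 position.

```python
# Frequency table for the population on this slide: value X_i -> frequency F_i.
freq = {55: 2, 60: 1, 100: 3, 125: 5, 140: 4}

n = sum(freq.values())                       # sum of F_i
total = sum(x * f for x, f in freq.items())  # sum of F_i * X_i
mean = total / n
mode = max(freq, key=freq.get)               # assumes a single mode


def value_at_position(table, position):
    """Return the value at a 1-based position using cumulative frequencies."""
    cumulative = 0
    for x in sorted(table):
        cumulative += table[x]
        if position <= cumulative:
            return x
    raise ValueError("position is beyond the end of the data set")


def median_from_table(table):
    """Median via the (n + 1) / 2 position rule on a frequency table."""
    count = sum(table.values())
    position = (count + 1) / 2
    low = int(position)
    high = low if position == low else low + 1
    return (value_at_position(table, low) + value_at_position(table, high)) / 2


print(n, total, round(mean, 2), mode, median_from_table(freq))
```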

15 Exercise: Computing Central Tend. Measures using Frequency Tables
Table columns: X_i, F_i, F_i * X_i (n = 14)
Mean = Σ(F_i * X_i) / Σ F_i =
Mode =
Median position =
Median value =

16 Response: Computing Central Tend. Measures using Frequency Tables
Table columns: X_i, F_i, F_i * X_i (n = 14, Σ F_i * X_i = 82)
Mean = Σ(F_i * X_i) / Σ F_i = 82/14 = 5.86
Mode = 6 and 4
Median position = (14+1)/2 = 7.5 (between 7th and 8th)
Median value = (6+6)/2 = 6

17 Opening Exercise: Using Statistical Measures
Kuli:  1st 2nd 1st 2nd 1st 4th 3rd 3rd 2nd 5th 1st
Marti: 3rd 2nd 3rd 2nd 2nd 1st 1st 1st 3rd 2nd 1st
Mode: most frequently occurring value of a variable
- Mode for Kuli: 1st
- Mode for Marti: 1st and 2nd (four occurrences each)
Mean: average of the values of a variable
- Sample mean = Σ X_i / n
- Mean (average) score for Kuli: 25/11 = 2.27
- Mean (average) score for Marti: 21/11 = 1.91

18 Using Statistical Measures
Kuli:  1st 2nd 1st 2nd 1st 4th 3rd 3rd 2nd 5th 1st
Marti: 3rd 2nd 3rd 2nd 2nd 1st 1st 1st 3rd 2nd 1st
Median: the value in the middle of an ordered data set of n values.
Median point = (n + 1)/2 = (11 + 1)/2 = 6th position
Ordered data:
Kuli:  1st 1st 1st 1st 2nd 2nd 2nd 3rd 3rd 4th 5th
Marti: 1st 1st 1st 1st 2nd 2nd 2nd 2nd 3rd 3rd 3rd
- Median score for Kuli is 2nd
- Median score for Marti is 2nd
Notice that the median requires an ordered set.

19 Using Frequency Distribution Tables
Analysis of Kuli's performance
X_i   Frequency (F_i)   F_i * X_i   Cum. (CF_i)
1st   4                 4           4
2nd   3                 6           7
3rd   2                 6           9
4th   1                 4           10
5th   1                 5           11
Σ     11                25
Mean = Σ(F_i * X_i) / Σ F_i = 25/11 = 2.27
Mode = 1st
Median point = (11 + 1)/2 = 6th
Median value = 2nd (using the cumulative frequency column)
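As a rough illustration of how the cumulative frequency column locates the median, the Python sketch below builds the table from Kuli's raw placements (coded 1 to 5) and returns the first value whose cumulative frequency reaches the median point. The function names are assumptions made for this example, and the median helper only handles an odd number of observations, as in this case.

```python
from collections import Counter

# Kuli's placements coded as integers (assumption: 1 = 1st place, ...).
kuli = [1, 2, 1, 2, 1, 4, 3, 3, 2, 5, 1]


def frequency_table(values):
    """Return rows of (X_i, F_i, F_i * X_i, cumulative frequency)."""
    counts = Counter(values)
    rows, cumulative = [], 0
    for x in sorted(counts):
        cumulative += counts[x]
        rows.append((x, counts[x], counts[x] * x, cumulative))
    return rows


def median_from_rows(rows):
    """Median for an odd n: first X_i whose cumulative frequency reaches (n + 1) / 2."""
    n = rows[-1][3]
    if n % 2 == 0:
        raise ValueError("this sketch only handles an odd number of observations")
    median_point = (n + 1) // 2
    for x, _, _, cumulative in rows:
        if median_point <= cumulative:
            return x


rows = frequency_table(kuli)
for row in rows:
    print(row)
print("Mean:", round(sum(r[2] for r in rows) / rows[-1][3], 2))
print("Median:", median_from_rows(rows))
```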

20 Using Frequency Distribution Tables
Analysis of Marti's performance
X_i   Frequency (F_i)   F_i * X_i   Cum. (CF_i)
1st   4                 4           4
2nd   4                 8           8
3rd   3                 9           11
4th   0                 0           11
5th   0                 0           11
Σ     11                21
Mean = Σ(F_i * X_i) / Σ F_i = 21/11 = 1.91
Mode = 1st & 2nd
Median point = (11 + 1)/2 = 6th
Median value = 2nd (using the cumulative frequency column)

21 Using Frequency Distribution Tables
Who is the better student?
               Marti       Kuli
Mean           1.91        2.27
Median value   2nd         2nd
Mode           1st & 2nd   1st

22 New Case: Median measure
Analysis of Katie's performance
Table columns: X_i, Frequency (F_i), F_i * X_i, Cum. (CF_i); rows 1st to 4th, with Σ F_i = 12 and Σ F_i * X_i = 31
Mean = Σ(F_i * X_i) / Σ F_i = 31/12 = 2.58
Mode = 3rd
Median point = (12 + 1)/2 = 6.5th, so the median value is between the 6th and 7th positions
Median value = (2nd + 3rd)/2 = 2.5th, the average of the 6th and 7th positions

23 Examples

24 Percentiles
Sometimes we are not analyzing several values from one person, but one value for several persons or objects. For example, we have data on the performance of several fund managers for a year. We want to present the data in the form "manager XX is in the 10th percentile" or "manager XX is in the 25th percentile". The method used consists of three steps:
- organize the data in ascending order
- calculate the location of the percentile you want
- identify the object at the percentile location in the data set

25 Interpretation: Percentiles
If manager YY is in the tenth percentile of a group, this means that at least 10% of everyone in the data set scored below manager YY and at most 90% of everyone in the data set scored better than manager YY. If manager Pico is in the 95th percentile of a group, this means that at least 95% of everyone in the data set scored below manager Pico and at most 5% of everyone in the data set scored better than manager Pico.

26 Exercise: Percentiles for Known Values
First name   Fund performance
Bill         106%
Jane         109%
Sven         114%
Larry        116%
Dub          121%
Anna         122%
Cole         125%
Salome       129%
In which percentile is Sven?

27 Deriving Percentiles with Cumulative Relative Frequency Approach for Observed Values
In which percentile is Sven?
First name   Fund performance   F_i   Rel. f_i   Cum. rel. f_i   Percentile
Bill         106%               1     1/8        1/8 = 0.125     12.5th
Jane         109%               1     1/8        2/8 = 0.25      25th
Sven         114%               1     1/8        3/8 = 0.375     37.5th
Larry        116%               1     1/8        4/8 = 0.5       50th
Dub          121%               1     1/8        5/8 = 0.625     62.5th
Anna         122%               1     1/8        6/8 = 0.75      75th
Cole         125%               1     1/8        7/8 = 0.875     87.5th
Salome       129%               1     1/8        8/8 = 1         100th
N = 8
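A small Python sketch of the cumulative relative frequency approach for observed values: sort the managers by fund performance, then convert each cumulative relative frequency (rank / N) into a percentile. The function name `percentile_ranks` is an assumption for illustration, and the sketch assumes every manager has a distinct performance figure, as in this table.

```python
# Fund performance in percent, taken from the table on this slide.
performance = {
    "Bill": 106, "Jane": 109, "Sven": 114, "Larry": 116,
    "Dub": 121, "Anna": 122, "Cole": 125, "Salome": 129,
}


def percentile_ranks(scores):
    """Map each name to 100 * (cumulative relative frequency).

    Assumes all scores are distinct, so each name has frequency 1.
    """
    ordered = sorted(scores, key=scores.get)   # ascending order of performance
    n = len(ordered)
    return {name: 100 * rank / n for rank, name in enumerate(ordered, start=1)}


ranks = percentile_ranks(performance)
print("Sven is in percentile:", ranks["Sven"])
```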

28 Deriving Percentiles with Cumulative Relative Frequency Approach for Unobserved Values
What is the value of the 90th percentile?
First name   Fund performance   F_i   Rel. f_i   Cum. rel. f_i   Percentile
Bill         106%               1     1/8        1/8             12.5th
Jane         109%               1     1/8        2/8             25th
Sven         114%               1     1/8        3/8             37.5th
Larry        116%               1     1/8        4/8             50th
Dub          121%               1     1/8        5/8             62.5th
Anna         122%               1     1/8        6/8             75th
Cole         125%               1     1/8        7/8             87.5th
Salome       129%               1     1/8        8/8             100th
N = 8

29 Computing Data Values When Given Percentile Locations (Approximate method)
90th percentile
- Location i = (P/100) * N = 0.9 * 8 = 7.2th position
- The result is not an integer, so the percentile position (7.2) is rounded up to the 8th position.
- 90th percentile value from the table = 129%
This is an approximate method because the formula gives the same result for multiple percentiles: it gives 129% for the 91st, 92nd, 93rd, up to the 100th percentiles.
50th percentile
- Location i = (P/100) * N = 0.5 * 8 = 4th position
- 50th percentile = (4th value + 5th value)/2 = (116% + 121%)/2 = 118.5%
- (But from the table we see that 116% is also the 50th percentile.)
RECOMMENDATION: USE THIS APPROXIMATE FORMULA WHEN YOU ARE DEALING WITH UNOBSERVED VALUES. IF YOU USE THE APPROACH IN THE EXAM, YOU WILL NOT BE MARKED WRONG.
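The approximate method can be sketched in Python as follows. The function name `percentile_approx` is an assumption for illustration, and the percentile P is assumed to be a whole number between 1 and 100 so that the location i = (P/100) * N can be checked with exact integer arithmetic.

```python
# Fund performance values in ascending order (percent), from the slides.
values = [106, 109, 114, 116, 121, 122, 125, 129]


def percentile_approx(sorted_values, p):
    """Approximate method: round a fractional location up; average at a whole location."""
    n = len(sorted_values)
    if not 0 < p <= 100:
        raise ValueError("p must be a whole number between 1 and 100")
    # i = (p / 100) * n, split into its whole part k and a remainder out of 100.
    k, remainder = divmod(p * n, 100)
    if remainder:                      # i is not an integer: round up to position k + 1
        return sorted_values[k]        # index k is the 1-based position k + 1
    if k == n:                         # i equals N: there is no (i + 1)-th value
        return sorted_values[-1]
    # i is an integer: average the i-th and (i + 1)-th values
    return (sorted_values[k - 1] + sorted_values[k]) / 2


print(percentile_approx(values, 90))   # location 7.2, rounded up to the 8th value
print(percentile_approx(values, 50))   # location 4, average of 4th and 5th values
```

Working with `divmod(p * n, 100)` avoids floating-point surprises when deciding whether the location is a whole number.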

30 Computing Percentile Locations with Arithmetic Formula (More precise method)
90th percentile
- Location i = (P/100) * N = 0.9 * 8 = 7.2th position
- The 90th percentile is 0.2 (20%) of the way between the 7th and 8th values.
- Value = 7th position's value + (8th position's value - 7th position's value) * fractional part of i
- 125% + (129% - 125%) * 0.2 = 125.8% (~ 126%)
50th percentile
- Location i = (P/100) * N = 0.5 * 8 = 4th position
- 50th percentile = 116%
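A hedged Python sketch of the more precise method: it interpolates linearly between the two values on either side of the location i = (P/100) * N. The function name `percentile_interpolated` is illustrative, P is assumed to be a whole number between 1 and 100, and returning the first value when the location falls below position 1 is an assumption of this sketch, not something stated on the slide. Statistical libraries often use different interpolation conventions, so their results may not match this one.

```python
# Fund performance values in ascending order (percent), from the slides.
values = [106, 109, 114, 116, 121, 122, 125, 129]


def percentile_interpolated(sorted_values, p):
    """Interpolate between the values around location i = (p / 100) * n."""
    n = len(sorted_values)
    if not 0 < p <= 100:
        raise ValueError("p must be a whole number between 1 and 100")
    k, remainder = divmod(p * n, 100)   # whole part and fraction (out of 100) of i
    if k == 0:                          # location below the 1st position (assumption)
        return sorted_values[0]
    if remainder == 0 or k >= n:        # whole-number location: take that value
        return sorted_values[min(k, n) - 1]
    fraction = remainder / 100
    below = sorted_values[k - 1]        # value at the k-th position
    above = sorted_values[k]            # value at the (k + 1)-th position
    return below + (above - below) * fraction


print(percentile_interpolated(values, 90))   # 125 + (129 - 125) * 0.2
print(percentile_interpolated(values, 50))   # the 4th value
```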

31 Overview Measures and Summary of Conditions for Using Descriptive Measures
The use of statistical measures is conditioned on the level of measurement of the data. For specific levels, e.g. the nominal level, many statistical measures cannot be used.

32 Descriptive Measures for Grouped Data
Mean, median and mode can all be computed for quantitative data sets that were measured at the right level.

33 Exercise: Central Tendency Measures for Grouped Data
Class interval     Frequency (F_i)   Midpoint (M_i)
[1 – 3) inches     16                2
[3 – 5) inches     2                 4
[5 – 7) inches     4                 6
[7 – 9) inches     3                 8
[9 – 11) inches    9                 10
[11 – 13) inches   6                 12
Σ                  40
Modal class:
Median position:
Median class:

34 Response: Central Tendency Measures for Grouped Data
Class interval     Frequency (F_i)   Midpoint (M_i)
[1 – 3) inches     16                2
[3 – 5) inches     2                 4
[5 – 7) inches     4                 6
[7 – 9) inches     3                 8
[9 – 11) inches    9                 10
[11 – 13) inches   6                 12
Σ                  40
Modal class: [1 – 3) inches
Median position: (n+1)/2 = 41/2 = 20.5, between the 20th and 21st positions
Median class: [5 – 7) inches (this would be hard to derive if the median were between the 18th and 19th positions, i.e. if it crossed two classes)
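A minimal Python sketch of locating the modal class and the median class from the grouped table, assuming each class is stored as ((lower, upper), frequency). The function names are illustrative. The median class is found by walking the cumulative frequencies until they reach the positions on either side of (n+1)/2.

```python
import math

# Class intervals [lower, upper) in inches with their frequencies, from the slide.
classes = [((1, 3), 16), ((3, 5), 2), ((5, 7), 4),
           ((7, 9), 3), ((9, 11), 9), ((11, 13), 6)]


def modal_class(table):
    """Return the interval with the highest frequency."""
    return max(table, key=lambda entry: entry[1])[0]


def class_at_position(table, position):
    """Return the interval that contains the given 1-based position."""
    cumulative = 0
    for interval, frequency in table:
        cumulative += frequency
        if position <= cumulative:
            return interval
    raise ValueError("position is beyond the end of the data set")


n = sum(frequency for _, frequency in classes)
median_position = (n + 1) / 2
# The two positions around the median; they coincide when n is odd.
lower_class = class_at_position(classes, math.floor(median_position))
upper_class = class_at_position(classes, math.ceil(median_position))

print("Modal class:", modal_class(classes))
print("Median position:", median_position)
print("Median class:", lower_class if lower_class == upper_class else (lower_class, upper_class))
```

If the two neighbouring positions land in different classes, the sketch reports both intervals, which is the awkward case the slide warns about.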

35 Example: Central Tendency Measures for Grouped Data
Class interval     Frequency (F_i)   Midpoint (M_i)   F_i * M_i
[1 – 3) inches     16                2                32
[3 – 5) inches     2                 4                8
[5 – 7) inches     4                 6                24
[7 – 9) inches     3                 8                24
[9 – 11) inches    9                 10               90
[11 – 13) inches   6                 12               72
Σ                  40                                 250
Find the mean for the distribution:
Mean = Σ(F_i * M_i) / n = 250/40 = 6.25 inches
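The grouped mean can be checked with a few lines of Python. The sketch below assumes each class midpoint is the average of its lower and upper limits; the function name `grouped_mean` is illustrative. The products listed in the table (32, 8, 24, 24, 90, 72) add up to 250, so the sketch arrives at 250/40 = 6.25 inches.

```python
# Class intervals [lower, upper) in inches with their frequencies, from the slide.
classes = [((1, 3), 16), ((3, 5), 2), ((5, 7), 4),
           ((7, 9), 3), ((9, 11), 9), ((11, 13), 6)]


def grouped_mean(table):
    """Mean of grouped data: sum of F_i * M_i divided by n."""
    n = sum(frequency for _, frequency in table)
    # Midpoint M_i is taken as the average of the class limits (assumption).
    total = sum(frequency * (lower + upper) / 2
                for (lower, upper), frequency in table)
    return total / n


products = [frequency * (lower + upper) / 2 for (lower, upper), frequency in classes]
print("F_i * M_i column:", products)
print("Grouped mean:", grouped_mean(classes), "inches")
```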

36 Exercise: Central Tendency Measures for Grouped Data
Class interval   Frequency (F_i)   Midpoint (M_i)   F_i * M_i
[1 – 2) inches   2
[2 – 3) inches   2
[3 – 4) inches   4
[4 – 5) inches   2
[5 – 6) inches   1
Σ
Find the mean for the distribution:
Mean = Σ(F_i * M_i) / n =        inches

37 Response: Central Tendency Measures for Grouped Data
Class interval   Frequency (F_i)   Midpoint (M_i)   F_i * M_i
[1 – 2) inches   2                 1.5              3
[2 – 3) inches   2                 2.5              5
[3 – 4) inches   4                 3.5              14
[4 – 5) inches   2                 4.5              9
[5 – 6) inches   1                 5.5              5.5
Σ                11                                 36.5
Find the mean for the distribution:
Mean = Σ(F_i * M_i) / n = 36.5/11 = 3.32 inches

38 Excel Examples