BCOR 1020 Business Statistics Lecture 4 – January 29, 2008.

Presentation transcript:

BCOR 1020 Business Statistics Lecture 4 – January 29, 2008

Overview
Chapter 4 – Descriptive Statistics…
– Numerical Description
– Central Tendency
– Dispersion

Chapter 4 – Numerical Description
Population (size = N): characterized by parameters, e.g., $\mu$ = population mean, $\sigma$ = population standard deviation.
Sample (size = n): statistics are computed and estimate parameters, e.g., $\bar{x}$ = sample mean, S = sample standard deviation.
Recall: Statistics are descriptive measures derived from a sample (n items). Parameters are descriptive measures derived from a population (N items).

Chapter 4 – Numerical Description
There are three key characteristics of numerical data:
Characteristic | Interpretation
Central Tendency | Where are the data values concentrated? What seem to be typical or middle data values?
Dispersion | How much variation is there in the data? How spread out are the data values? Are there unusual values?
Shape | Are the data values distributed symmetrically? Skewed? Sharply peaked? Flat? Bimodal?

Chapter 4 – Numerical Description
Example: Vehicle Quality
Consider the data set of vehicle defect rates from J. D. Power and Associates. Numerical statistics can be used to summarize this random sample of brands. We must allow for sampling error because the analysis is based on a sample.
Defect rate = (total no. defects / no. inspected) × 100

Chapter 4 – Numerical Description Number of defects per 100 vehicles, 2004 models.

Chapter 4 – Numerical Description Sorted data provides insight into central tendency and dispersion.

Chapter 4 – Numerical Description Visual Displays: The dot plot offers a visual impression of the data. Histograms with 5 bins (suggested by Sturges’ Rule) and 10 bins are shown below. Both are symmetric with no extreme values and show a modal class toward the low end.

Chapter 4 – Numerical Description
We can compute descriptive statistics using Excel and discuss measures of central tendency and dispersion…
– Figures 4.4 and 4.5 in your text detail the Excel menus for computing descriptive statistics.
– Figure 4.7 in your text details the MegaStat menus for computing descriptive statistics.

Chapter 4 – Numerical Description MegaStat output…

Chapter 4 – Central Tendency
The central tendency is the middle or typical value of a distribution. Central tendency can be assessed using a dot plot or histogram, or more precisely with numerical statistics.
The text presents six measures of central tendency…
– Mean
– Median
– Mode
– Midrange
– Geometric Mean (G)
– Trimmed Mean
The mean and median are the most frequently used, but we will discuss the merits of all six.

Chapter 4 – Central Tendency
Mean – A familiar measure of central tendency.
Population formula: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
Sample formula: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
In Excel, use the function =AVERAGE(Data), where Data is an array of data values.
For the sample of n = 37 car brands:

Chapter 4 – Central Tendency Characteristics of the Mean: Arithmetic mean is the most familiar average. Affected by every sample item. The balancing point or fulcrum for the data. Regardless of the shape of the distribution, distances from the mean to the data points always sum to zero.
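The balancing-point property can be checked numerically. The following is a minimal sketch in Python; the data values are hypothetical and chosen only for illustration, not taken from the lecture.

```python
from statistics import mean

# Hypothetical sample values, chosen only for illustration.
data = [2, 4, 4, 10]

x_bar = mean(data)                        # arithmetic mean of the sample
deviations = [x - x_bar for x in data]    # signed distance of each value from the mean
total_deviation = sum(deviations)         # negative and positive distances cancel

print("mean:", x_bar)
print("deviations:", deviations)
print("sum of deviations:", total_deviation)
```

With non-integer data the sum may differ from zero by a tiny floating-point rounding error.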

Chapter 4 – Central Tendency
Median (M) – the 50th percentile or midpoint of the sorted sample data.
Use Excel’s function =MEDIAN(Data), where Data is an array of data values.
M separates the upper and lower halves of the sorted observations.
– If n is even, the median is the average of the middle two observations in the data array.
– If n is odd, the median is the middle observation in the data array.

Chapter 4 – Central Tendency
Median: To compute the median by hand, sort the n observations in the data: $x_1 \le x_2 \le \dots \le x_n$.
For even n, Median = $\frac{x_{n/2} + x_{n/2+1}}{2}$
For odd n, Median = $x_{(n+1)/2}$

Chapter 4 – Central Tendency
Example: Consider the following n = 6 data values: What is the median?
For even n, the two middle positions are n/2 = 6/2 = 3 and n/2 + 1 = 6/2 + 1 = 4, so
M = $(x_3 + x_4)/2$ = (15 + 17)/2 = 16
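The by-hand position rule for the median can be sketched in Python as follows. The function name `median_by_hand` and the sample values are assumptions for illustration; the six values are hypothetical, chosen only so that the third and fourth sorted values are 15 and 17.

```python
def median_by_hand(values):
    """Median using the 1-based position rule: sort, then pick the middle."""
    x = sorted(values)              # x[0] holds x_1, x[1] holds x_2, and so on
    n = len(x)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    if n % 2 == 0:
        # Even n: average the observations at positions n/2 and n/2 + 1.
        return (x[n // 2 - 1] + x[n // 2]) / 2
    # Odd n: take the observation at position (n + 1)/2.
    return x[(n + 1) // 2 - 1]


# Hypothetical data (not from the lecture).
even_sample = [32, 11, 17, 12, 21, 15]
odd_sample = [3, 1, 2]

print(median_by_hand(even_sample))
print(median_by_hand(odd_sample))
```

The index shifts by one because Python lists are 0-based while the slide's positions are 1-based.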

Clickers Consider the following n = 7 data values: What is the median? A = 24 B = 25 C = 26 D = 27

Chapter 4 – Central Tendency
Median: For the 37 vehicle quality ratings (odd n), the position of the median is (n+1)/2 = (37+1)/2 = 19. So, the median is $x_{19}$ = 121.
When there are several duplicate data values, the median does not provide a clean “50-50” split in the data.

Chapter 4 – Central Tendency
Characteristics of the Median
The median is insensitive to extreme data values. For example, consider the following quiz scores for 3 students:
Tom’s scores: 20, 40, 70, 75, 80 (Mean = 57, Median = 70, Total = 285)
Jake’s scores: 60, 65, 70, 90, 95 (Mean = 76, Median = 70, Total = 380)
Mary’s scores: 50, 65, 70, 75, 90 (Mean = 70, Median = 70, Total = 350)
What does the median for each student tell you?
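The quiz-score figures can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the dictionary layout and variable names are illustrative choices, while the scores are the ones listed on the slide.

```python
from statistics import mean, median

# Quiz scores as listed on the slide.
scores = {
    "Tom": [20, 40, 70, 75, 80],
    "Jake": [60, 65, 70, 90, 95],
    "Mary": [50, 65, 70, 75, 90],
}

# For each student: (mean, median, total).
summary = {name: (mean(s), median(s), sum(s)) for name, s in scores.items()}

for name, (avg, med, total) in summary.items():
    print(f"{name}: mean={avg}, median={med}, total={total}")
```

All three students share the same median even though their means and totals differ, which is the point of the example: the median ignores how far the extreme scores lie from the middle.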

Chapter 4 – Central Tendency
Mode – The most frequently occurring data value.
Similar to the mean and median if data values occur often near the center of the sorted data.
May have multiple modes or no mode.
Easy to define, not easy to calculate in large samples.
Use Excel’s function =MODE(Array)
– will return #N/A if there is no mode.
– will return the first mode found if multimodal.
May be far from the middle of the distribution and not at all typical.
Generally isn’t useful for continuous data since data values rarely repeat.
– Best for attribute data or a discrete variable with a small range (e.g., Likert scale).
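As a rough illustration of multiple modes and no mode, here is a minimal Python sketch. The helper name `modes` and the sample responses are hypothetical. Unlike the Excel function described above, this sketch returns every mode rather than only the first, and returns an empty list when no value repeats.

```python
from collections import Counter
from statistics import multimode  # available in Python 3.8 and later


def modes(values):
    """Return all most-frequent values, or [] when no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    if max(counts.values()) == 1:
        return []                 # every value occurs once: treat as "no mode"
    return multimode(values)      # all values tied for the highest frequency


# Hypothetical Likert-scale responses (1 to 5), for illustration only.
single_mode = modes([4, 5, 3, 4, 2, 4, 5])
two_modes = modes([1, 1, 2, 2, 3])
no_mode = modes([1, 2, 3])

print(single_mode, two_modes, no_mode)
```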

Chapter 4 – Central Tendency Mode: A bimodal distribution refers to the shape of the histogram rather than the mode of the raw data. Occurs when dissimilar populations are combined in one sample. For example,

Chapter 4 – Central Tendency
Skewness: Compare the mean and median, or look at the histogram, to determine the degree of skewness.
Mean, Median & Skewness:
– If median > mean, skewed left.
– If median = mean, symmetric.
– If median < mean, skewed right.
Mean, Mode & Skewness:
– If mode > mean, skewed left.
– If mode = mean, symmetric.
– If mode < mean, skewed right.
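The mean-versus-median rule of thumb can be sketched in Python as below. The function name `skew_direction` and the tolerance parameter `tol` are assumptions added for illustration; the rule only suggests a direction and does not replace looking at the histogram. The three samples are the quiz scores from the earlier median slide.

```python
from statistics import mean, median


def skew_direction(values, tol=0.0):
    """Suggest a skew direction by comparing the median with the mean."""
    avg, med = mean(values), median(values)
    if abs(avg - med) <= tol:      # tol is a hypothetical allowance for "about equal"
        return "symmetric"
    return "skewed right" if med < avg else "skewed left"


tom = skew_direction([20, 40, 70, 75, 80])    # median 70 > mean 57
jake = skew_direction([60, 65, 70, 90, 95])   # median 70 < mean 76
mary = skew_direction([50, 65, 70, 75, 90])   # median 70 = mean 70

print(tom, jake, mary, sep=" | ")
```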

Chapter 4 – Central Tendency
Midrange – the point halfway between the lowest and highest values of X. Easy to use but sensitive to extreme data values.
Midrange = $\frac{x_{\min} + x_{\max}}{2}$
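A short Python sketch shows why the midrange is sensitive to extreme values. The function name and the data are hypothetical, for illustration only.

```python
from statistics import median


def midrange(values):
    """Point halfway between the smallest and largest observation."""
    return (min(values) + max(values)) / 2


# Hypothetical data: the second sample replaces one value with an extreme one.
typical = [3, 5, 6, 8, 9]
with_outlier = [3, 5, 6, 8, 100]

print(midrange(typical), median(typical))
print(midrange(with_outlier), median(with_outlier))
```

A single extreme value moves the midrange a long way, while the median of the same data is unchanged.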

Clickers Consider the J. D. Power quality data (n=37): What is the midrange? A = 121 B = 122 C = 130 D = 173

Chapter 4 – Central Tendency
Trimmed Mean: To calculate the trimmed mean, first remove the highest and lowest k percent of the observations, then average the observations that remain.
To determine how many observations to trim, multiply k × n (with k expressed as a proportion):
– Remove the (k × n) highest and the (k × n) lowest observations.
Mitigates the effects of extreme values.
May exclude relevant data values.
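The trimming procedure can be sketched in Python as follows. This is an illustration under stated assumptions: `k` is passed as a proportion trimmed from each tail, and a non-integer k × n is rounded down, a choice the slide does not specify. The function name and data are hypothetical.

```python
def trimmed_mean(values, k):
    """Mean after dropping the k*n lowest and k*n highest observations.

    k is the proportion trimmed from EACH tail (e.g. 0.10 for 10 percent).
    """
    if not values:
        raise ValueError("the trimmed mean of an empty data set is undefined")
    if not 0 <= k < 0.5:
        raise ValueError("k must satisfy 0 <= k < 0.5")
    x = sorted(values)
    n = len(x)
    cut = int(k * n)            # assumption: round k*n down to a whole count
    kept = x[cut:n - cut]       # drop `cut` values from each end
    return sum(kept) / len(kept)


# Hypothetical data with one very low and one very high value.
data = [1, 50, 51, 52, 53, 54, 55, 56, 57, 500]

plain = trimmed_mean(data, 0)       # no trimming: the ordinary mean
trimmed = trimmed_mean(data, 0.10)  # drops 1 and 500

print(plain, trimmed)
```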

Chapter 4 – Dispersion
Variation is the “spread” of data points about the center of the distribution in a sample.
The text considers the following measures of dispersion:
– Range
– Variance ($S^2$)
– Standard Deviation (S)
– Coefficient of Variation (CV)
– Mean Absolute Deviation (MAD)
The variance and standard deviation are the most frequently used, but we will briefly discuss the merits of all five.

Chapter 4 – Dispersion
Range – The difference between the largest and smallest observations. Easy to calculate, but sensitive to extreme data values.
Range = $x_{\max} - x_{\min}$

Chapter 4 – Dispersion
Variance: The population variance ($\sigma^2$) is defined as the sum of squared deviations around the mean $\mu$ divided by the population size:
$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$
For the sample variance ($s^2$), we divide by n – 1 instead of n, because otherwise $s^2$ would tend to underestimate the unknown population variance $\sigma^2$:
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$

Chapter 4 – Dispersion
Standard Deviation – The square root of the variance. Describes how individual values in a data set vary from the mean. Units of measure are the same as X.
Population standard deviation: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$
Sample standard deviation: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
For the 37 vehicle quality ratings …

Chapter 4 – Dispersion
Calculating Standard Deviation:
The standard deviation is nonnegative because deviations around the mean are squared.
When every observation is exactly equal to the mean, the standard deviation is zero.
Standard deviations can be large or small, depending on the units of measure.
Compare standard deviations only for data sets measured in the same units and only if the means do not differ substantially.
Excel’s built-in functions are…
Statistic | Excel population formula | Excel sample formula
Variance | =VARP(Array) | =VAR(Array)
Standard deviation | =STDEVP(Array) | =STDEV(Array)
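The difference between dividing by N and dividing by n – 1 can be seen in a small Python sketch. The helper names and the data are hypothetical. Python's standard `statistics` module provides `pvariance`, `variance`, `pstdev` and `stdev`, which play roles similar to the four Excel functions listed above.

```python
from math import sqrt
from statistics import mean


def population_variance(values):
    """Sum of squared deviations around the mean, divided by N."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations around the mean, divided by n - 1."""
    if len(values) < 2:
        raise ValueError("the sample variance needs at least two observations")
    x_bar = mean(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = population_variance(data)
samp_var = sample_variance(data)
pop_sd = sqrt(pop_var)      # standard deviation is the square root of the variance
samp_sd = sqrt(samp_var)

print(pop_var, pop_sd)
print(samp_var, samp_sd)
```

The sample figures are slightly larger than the population figures because the same sum of squares is divided by a smaller number.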

Chapter 4 – Dispersion
Coefficient of Variation – A unit-free measure of dispersion. Expressed as a percent of the mean:
CV = $100 \times \frac{s}{\bar{x}}$
Useful for comparing variables measured in different units or with different means.
Only appropriate for nonnegative data. It is undefined if the mean is zero or negative.
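To illustrate that the coefficient of variation is unit-free, here is a minimal Python sketch using the sample standard deviation. The function name and the measurements are hypothetical, for illustration only.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percent of the mean."""
    x_bar = mean(values)
    if x_bar <= 0:
        raise ValueError("CV is undefined when the mean is zero or negative")
    return 100 * stdev(values) / x_bar


# Hypothetical lengths, recorded in centimeters and converted to inches.
lengths_cm = [150, 160, 170, 180, 190]
lengths_in = [x / 2.54 for x in lengths_cm]

cv_cm = coefficient_of_variation(lengths_cm)
cv_in = coefficient_of_variation(lengths_in)

print(round(cv_cm, 2), round(cv_in, 2))
```

Changing the unit rescales the mean and the standard deviation by the same factor, so the ratio stays the same.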

Clickers Recall from the J. D. Power quality data (n=37): What is the coefficient of variation? A = 5.48% B = 18.26% C = 22.89% D = %

Chapter 4 – Dispersion
Mean Absolute Deviation (MAD) – reveals the average distance from an individual data point to the mean (center of the distribution). Uses absolute values of the deviations around the mean:
MAD = $\frac{1}{n}\sum_{i=1}^{n}\lvert x_i - \bar{x}\rvert$
Excel’s function is =AVEDEV(Array).
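The mean absolute deviation can be sketched in a few lines of Python. The function name and the data are hypothetical, for illustration only.

```python
from statistics import mean


def mean_absolute_deviation(values):
    """Average absolute distance of the observations from their mean."""
    x_bar = mean(values)
    return sum(abs(x - x_bar) for x in values) / len(values)


# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mad = mean_absolute_deviation(data)
print(mad)
```

Taking absolute values keeps the negative and positive deviations from cancelling, which is why this average is not zero even though the signed deviations sum to zero.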

Chapter 4 – Dispersion
Central Tendency vs. Dispersion: Manufacturing
Consider the histograms of hole diameters drilled in a steel plate during manufacturing. The desired distribution is outlined in red.
Machine A: Desired mean (5 mm) but too much variation.
Machine B: Acceptable variation but mean is less than 5 mm.