Numerical Descriptive Techniques


Numerical Descriptive Techniques Statistics for Management and Economics Chapter 4

Objectives: Measures of Central Location; Measures of Variability; Measures of Relative Standing and Box Plots; Measures of Linear Relationship; Graphical vs. Numerical Techniques; Data Exploration

Measures of Central Location Numerical measures of the center, or middle, of the data: Arithmetic Mean, Median, Mode, Geometric Mean

Measure of Center: Arithmetic Mean The arithmetic mean, a.k.a. average, shortened to mean, is the most popular and useful measure of central location. Appropriate for describing interval data. The arithmetic mean for a population is denoted with the Greek letter “mu”: $\mu$. The arithmetic mean for a sample is denoted with an “x-bar”: $\bar{x}$. It is computed by simply adding up all the observations and dividing by the total number of observations: Mean = (Sum of the observations) / (Number of observations)

Arithmetic Mean Population Mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$ Sample Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

Measure of Center: Median The median is another useful measure of central location. Appropriate for describing interval or ordinal data. Best measure of central location when dealing with data that have extreme values. Computed the same for population and sample. Calculated by placing all the observations in order; the observation that falls in the middle is the median. HINT! The middle of the dataset falls at the location (n+1)/2. When n is even this location falls halfway between two observations, and the median is the average of those two.
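The effect of an extreme value on the mean and the median can be sketched in a few lines. The salary figures below are hypothetical and chosen only for illustration; the standard-library `statistics` module is one convenient way to compute both measures.

```python
from statistics import mean, median

# Hypothetical data: five salaries in $1000s; the last value is extreme.
salaries = [40, 42, 45, 47, 300]

sample_mean = mean(salaries)      # sum of observations / number of observations
sample_median = median(salaries)  # middle value after sorting

# 1-based location of the median in the sorted data, from (n + 1) / 2.
median_location = (len(salaries) + 1) / 2

print(sample_mean, sample_median, median_location)
```

The single extreme salary pulls the mean well above every other observation, while the median stays at the third sorted value.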

Measures of Center: Mode The mode of a set of observations is the value that occurs most frequently. A set of data may have one mode (or modal class), or two, or more modes. Mode is a useful measure of center for all data types, though mainly used to identify the group with the highest frequency for nominal data. For large data sets the modal class is much more relevant than a single-value mode. Computed the same for population and sample.
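As a small illustration of the mode on nominal data, the sketch below counts categories and reports every value tied for the highest frequency. The survey responses are made up for the example.

```python
from collections import Counter
from statistics import multimode

# Hypothetical nominal data: commuting method reported by six respondents.
responses = ["bus", "car", "car", "bike", "bus", "walk"]

# multimode returns every value tied for the highest frequency,
# in the order each value first appears in the data.
modes = multimode(responses)

# Frequency of each category, useful for identifying the modal class.
counts = Counter(responses)

print(modes, counts["car"])
```

Here the data have two modes, which matches the slide's point that a data set may have one, two, or more modes.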

Mean vs. Median vs. Mode [Figure: mean and median for a symmetric distribution; mean, median, and mode for left-skewed (negative skew) and right-skewed (positive skew) distributions.]

Mean vs. Median vs. Mode Symmetric distribution: the mean, median, and mode will be approximately the same. Multimodal distribution: report the mean, median and/or mode for each subgroup. Nominal data: Mode calculation is useful for determining highest frequency but not “central location”; the calculation of the mean is not valid. Ordinal data: Median is appropriate; the calculation of the mean is not valid. Interval data: Mean is appropriate; in the case of skewed data, report the median as well.

Measures of Center: Geometric Mean The geometric mean is used when the variable is a growth rate or rate of change, such as the value of an investment over periods of time. If $R_i$ denotes the rate of return in period i (i = 1, 2, …, n), then the geometric mean $R_g$ of the returns $R_1, R_2, \ldots, R_n$ is defined such that: $(1 + R_g)^n = (1 + R_1)(1 + R_2)\cdots(1 + R_n)$

Measures of Center: Geometric Mean Solving for $R_g$ we produce the following formula: $R_g = \left(\prod_{i=1}^{n}(1 + R_i)\right)^{1/n} - 1$ The upper case Greek letter “Pi” ($\prod$) represents a product of terms…
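A minimal sketch of the geometric mean of returns follows. The function name and the two-period example (a 100% gain followed by a 50% loss) are illustrative assumptions, not taken from the slides.

```python
import math

def geometric_mean_return(returns):
    """Return R_g such that (1 + R_g)^n equals the product of (1 + R_i)."""
    growth = math.prod(1 + r for r in returns)  # total growth factor
    return growth ** (1 / len(returns)) - 1

# Hypothetical investment: +100% in period 1, then -50% in period 2.
returns = [1.0, -0.5]

arithmetic = sum(returns) / len(returns)    # suggests 25% average growth
geometric = geometric_mean_return(returns)  # the investment ends where it began

print(arithmetic, geometric)
```

The investment doubles and then halves, so it finishes at its starting value; the geometric mean reports that as zero growth per period, whereas the arithmetic mean of the two returns does not.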

Measures of Center: Summary

Use the…          To describe…
Mean              the central location of a single set of interval data
Median            the central location of a single set of interval or ordinal data
Mode              a single set of nominal data
Geometric Mean    a single set of interval data based on growth rates

Measures of Variability Tell how variable, or spread out, the data are around the mean. Used in conjunction with measures of center to describe a distribution with numbers. Used primarily for interval data. Three measures: Range, Variance, Standard Deviation

Measures of Variability: Range Simplest measure of variability, easily computed Calculated as: largest observation – smallest observation Not very descriptive of the variability of the data – how?

Measures of Variability: Variance Widely used. Used to summarize data but also plays an important role in statistical inference. In general, explains how data are spread about the mean. For the population, denoted by the lower case Greek letter sigma (squared): $\sigma^2$ For the sample: $s^2$

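Measures of Variability: Variance The variance of a population is: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$, where $\mu$ is the population mean and $N$ is the population size. The variance of a sample is: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$, where $\bar{x}$ is the sample mean. Note! The denominator is the sample size (n) minus one!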

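Shortcut: Calculating Variance A short-cut formulation to calculate sample variance directly from the data without the intermediate step of calculating the mean: $s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$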

Measures of Variability: Standard Deviation Square root of the variance. Population: $\sigma = \sqrt{\sigma^2}$ Sample: $s = \sqrt{s^2}$
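The variance and standard deviation can be checked numerically with a short sketch. The function names and the five observations are illustrative assumptions; the shortcut form used is the standard one, $s^2 = \frac{1}{n-1}\left[\sum x_i^2 - (\sum x_i)^2 / n\right]$.

```python
import math
from statistics import variance

def sample_variance(data):
    """Definitional form: squared deviations from the sample mean over n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)

def sample_variance_shortcut(data):
    """Shortcut form: uses only the sum of x and the sum of x squared."""
    n = len(data)
    return (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

def population_variance(data):
    """Population form: divides by N rather than n - 1."""
    big_n = len(data)
    mu = sum(data) / big_n
    return sum((x - mu) ** 2 for x in data) / big_n

# Hypothetical observations.
observations = [8, 4, 9, 11, 3]

s2 = sample_variance(observations)
s = math.sqrt(s2)  # sample standard deviation

print(s2, sample_variance_shortcut(observations), variance(observations), s)
```

Both hand-written forms agree with `statistics.variance`, which also divides by n - 1. The population form divides by N and so gives a smaller value on the same data.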

Interpretation: Standard Deviation Together with the sample mean, the standard deviation can be used to “build” the picture of a distribution. It can also be used to compare the variability of different distributions. To do this, we can use the Empirical Rule or Chebysheff’s Theorem.

Interpretation: The Empirical Rule For distributions with bell shaped histograms. States that… Approximately 68% of all observations fall within one standard deviation of the mean. Approximately 95% of all observations fall within two standard deviations of the mean. Approximately 99.7% of all observations fall within three standard deviations of the mean. A.K.A. The 68% - 95% - 99.7% Rule

Interpretation: Chebysheff’s Theorem Applies to all shapes of histograms (not limited to bell shaped). The proportion of observations in any sample that lie within k standard deviations of the mean is at least: $1 - \frac{1}{k^2}$ for $k > 1$ Note: The Empirical Rule provides approximate proportions given the limits, whereas Chebysheff’s Theorem provides a lower bound on the proportions.
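To make the lower bound concrete, the sketch below compares Chebysheff's bound, 1 - 1/k² for k > 1, with the proportion actually observed in a small, deliberately skewed sample. The function names and the data are hypothetical.

```python
from statistics import mean, stdev

def chebysheff_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebysheff's bound is only informative for k > 1")
    return 1 - 1 / k ** 2

def proportion_within(data, k):
    """Observed proportion of data within k sample standard deviations of the mean."""
    m = mean(data)
    s = stdev(data)
    inside = sum(1 for x in data if abs(x - m) <= k * s)
    return inside / len(data)

# Hypothetical, clearly non-bell-shaped sample with one extreme value.
data = [2, 4, 4, 5, 5, 5, 6, 6, 8, 30]

bound = chebysheff_lower_bound(2)     # at least 75% within two standard deviations
observed = proportion_within(data, 2)

print(bound, observed)
```

The observed proportion exceeds the bound, as the theorem guarantees. For bell-shaped data the Empirical Rule's 95% would usually be the closer approximation.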

Measures of Variability: Coefficient of Variation The coefficient of variation of a set of observations is the standard deviation of the observations divided by their mean, that is: Population coefficient of variation: $CV = \frac{\sigma}{\mu}$ Sample coefficient of variation: $cv = \frac{s}{\bar{x}}$ This coefficient provides a proportionate measure of variation (and is thus useful for comparing variation between two datasets). For example, a standard deviation of 10 may be perceived as large when the mean value is 100, but only moderately large when the mean value is 500.
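The slide's example can be reproduced directly. The helper name below is an assumption for illustration; the numbers are those given in the text (a standard deviation of 10 against means of 100 and 500).

```python
def coefficient_of_variation(std_dev, mean_value):
    """Standard deviation expressed as a proportion of the mean."""
    if mean_value == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return std_dev / mean_value

cv_small_mean = coefficient_of_variation(10, 100)  # 10% of the mean
cv_large_mean = coefficient_of_variation(10, 500)  # 2% of the mean

print(cv_small_mean, cv_large_mean)
```

The same standard deviation is five times larger relative to the mean in the first case than in the second.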

Measures of Relative Standing Measures of relative standing are designed to provide information about the position of particular values relative to the entire data set. Percentile: the Pth percentile is the value for which P percent of the observations are less than that value and (100-P)% are greater than that value. Specifically, the 25th, 50th, and 75th percentiles are the quartiles. You may also see fifths (quintiles) or tenths (deciles).

Percentiles and Quartiles… First (lower) decile = 10th percentile; First (lower) quartile, Q1 = 25th percentile; Second (middle) quartile, Q2 = 50th percentile; Third quartile, Q3 = 75th percentile; Ninth (upper) decile = 90th percentile

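Locating Percentiles The following formula allows us to approximate the location of any percentile: $L_P = (n + 1)\frac{P}{100}$, where $L_P$ is the location of the Pth percentile in the ordered data.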
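A rough sketch of locating a percentile with $L_P = (n + 1)P/100$ follows. The slides give only the location; interpolating linearly between the two neighbouring observations when the location is fractional is an assumption here, as are the function names and the data.

```python
def percentile_location(n, p):
    """1-based location of the Pth percentile in n ordered observations."""
    return (n + 1) * p / 100

def percentile(data, p):
    """Value at the Pth percentile, interpolating between neighbours (assumed rule)."""
    ordered = sorted(data)
    loc = percentile_location(len(ordered), p)
    lower = int(loc)     # 1-based position at or just below the location
    frac = loc - lower   # how far to move toward the next observation
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + frac * (above - below)

# Hypothetical ordered sample of ten observations.
data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]

q1 = percentile(data, 25)  # location 2.75
q2 = percentile(data, 50)  # location 5.5
q3 = percentile(data, 75)  # location 8.25

print(q1, q2, q3)
```

Statistical packages use several slightly different percentile conventions, so values from other software may differ a little from this location rule.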

Measures of Variability: Interquartile Range Interquartile Range (IQR) = Q3 – Q1 The interquartile range measures the spread of the middle 50% of the observations. Large values of this statistic mean that the 1st and 3rd quartiles are far apart indicating a high level of variability. Usually reported with the Median (M)

Graphical Description of the Quartiles: The Boxplot Sometimes also called a box-and-whisker plot. Uses the Five Number Summary: Minimum, Q1, M, Q3, Maximum. The “box” shows the center of the data and the general shape; the “whiskers” show the spread of the data. If the data extend beyond the whiskers of the plot, there are outliers in the dataset, so the boxplot is a good summary of data with outliers. You can easily create side-by-side boxplots to compare multiple groups.

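The Boxplot [Figure: boxplot marking S (smallest observation), Q1, Q2, Q3, and L (largest observation), with a whisker of length 1.5(Q3 – Q1).] Each whisker extends from the box to the most extreme observation lying within 1.5(Q3 – Q1) of the box; observations beyond that distance are plotted as outliers. In the plotted example, there is an outlier at the largest value (L). Boxplots mimic the general shape of the distribution.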
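The 1.5(Q3 – Q1) rule can be sketched as follows. The data are hypothetical, and the quartiles come from `statistics.quantiles`, whose default "exclusive" method is assumed here because it follows the (n + 1)P/100 location rule.

```python
from statistics import quantiles

# Hypothetical sample with one unusually large observation.
data = [0, 0, 5, 7, 8, 9, 12, 14, 22, 60]

q1, q2, q3 = quantiles(data, n=4)  # quartiles, default "exclusive" method
iqr = q3 - q1

# Fences: the farthest a whisker may reach from the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]
inside = [x for x in data if lower_fence <= x <= upper_fence]

# Whiskers stop at the most extreme observations that are not outliers.
whisker_low, whisker_high = min(inside), max(inside)

print(q1, q3, iqr, outliers, whisker_low, whisker_high)
```

Only the largest observation lies beyond the upper fence, so it would be drawn as a separate point and the upper whisker would end at the next-largest value.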

Measures of Linear Relationship Numerical measures of linear relationship that provide information as to the strength & direction of a linear relationship (if any) between two variables. Covariance - is there any pattern to the way two variables move together? Coefficient of correlation - how strong is the linear relationship between two variables?

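Measures of Linear Relationship: Covariance Population covariance: $\sigma_{xy} = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu_x)(y_i - \mu_y)$, where $\mu_x$ and $\mu_y$ are the population means of variable X and variable Y. Sample covariance: $s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$, where $\bar{x}$ and $\bar{y}$ are the sample means of variable X and variable Y. Note: the divisor for the sample covariance is n-1, not n as you may expect.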

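Measures of Linear Relationship: Covariance There is also a shortcut for calculating sample covariance directly from the data: $s_{xy} = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i y_i - \frac{\sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n}\right]$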
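As a check that the definitional and shortcut forms of the sample covariance agree, here is a brief sketch. The function names and the three data pairs are illustrative; the shortcut used is $s_{xy} = \frac{1}{n-1}\left[\sum x_i y_i - \sum x_i \sum y_i / n\right]$.

```python
def sample_covariance(xs, ys):
    """Definitional form: products of deviations from the means over n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_covariance_shortcut(xs, ys):
    """Shortcut form: uses only the sums of x, y, and x*y."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)

# Hypothetical paired observations that rise together.
xs = [2, 6, 7]
ys = [13, 20, 27]

s_xy = sample_covariance(xs, ys)

print(s_xy, sample_covariance_shortcut(xs, ys))
```

Both forms give the same positive value, reflecting that the two variables move in the same direction.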

Interpretation: Covariance When two variables move in the same direction (both increase or both decrease), the covariance will be a large positive number. When two variables move in opposite directions, the covariance is a large negative number. When there is no particular pattern, the covariance is a small number.

Measures of Linear Relationship: Correlation The Coefficient of Correlation (a.k.a. the correlation) is the covariance divided by the standard deviations of the variables. Population (Greek letter “rho”): $\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$ Sample: $r = \frac{s_{xy}}{s_x s_y}$ From the correlation, we can determine the strength, direction, and linearity of the association between X and Y. The correlation is the “numerical scatterplot”.

Interpretation: Correlation The advantage of the coefficient of correlation over covariance is that it has fixed range from -1 to +1, thus: If the two variables are very strongly positively related, the coefficient value is close to +1 (strong positive linear relationship). If the two variables are very strongly negatively related, the coefficient value is close to -1 (strong negative linear relationship). No straight line relationship is indicated by a coefficient close to zero.
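The fixed range of the correlation can be seen in a short sketch that divides the sample covariance by the two sample standard deviations. The function names and data are hypothetical.

```python
from statistics import stdev

def sample_covariance(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

def sample_correlation(xs, ys):
    """Sample covariance divided by the product of the sample standard deviations."""
    return sample_covariance(xs, ys) / (stdev(xs) * stdev(ys))

# Hypothetical paired observations.
xs = [2, 6, 7]
ys = [13, 20, 27]

r = sample_correlation(xs, ys)                               # strong positive
r_reversed = sample_correlation(xs, [27, 20, 13])            # same strength, negative
r_exact_line = sample_correlation(xs, [2 * x + 1 for x in xs])  # points on one line

print(r, r_reversed, r_exact_line)
```

Unlike the covariance, these values do not depend on the units of X and Y, and points lying exactly on an upward-sloping line give a coefficient of +1 (up to rounding).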

Measures of Linear Relationship: Least Squares Method Recall, the slope-intercept equation for a line is expressed in these terms: y = mx + b Where: m is the slope of the line b is the y-intercept. If we’ve determined that a linear relationship exists, can we determine a linear function?

Measures of Linear Relationship: Least Squares Method …produces a straight line drawn through the points so that the sum of squared deviations between the points and the line is minimized. This line is represented by the equation: $\hat{y} = b_0 + b_1 x$, where $b_0$ is the y-intercept, $b_1$ is the slope, $\hat{y}$ is the estimated value of y determined by the line, and $x$ is the value of the x data (usually given).
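A minimal sketch of fitting the least squares line is given below. The coefficient formulas are not shown in this part of the transcript; the standard results are assumed: slope b1 = s_xy / s_x² and intercept b0 = ȳ - b1·x̄. The function name and data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1) for the line y_hat = b0 + b1 * x minimizing squared deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_xx = sum((x - x_bar) ** 2 for x in xs) / (n - 1)  # sample variance of x
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # y-intercept
    return b0, b1

# Hypothetical paired observations.
xs = [2, 6, 7]
ys = [13, 20, 27]

b0, b1 = least_squares_line(xs, ys)
y_hat_at_4 = b0 + b1 * 4  # estimated y for a given x of 4

print(b0, b1, y_hat_at_4)
```

In slope-intercept terms, b1 plays the role of m and b0 the role of b; the fitted line always passes through the point (x̄, ȳ).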