Methods for Describing Sets of Data


Chapter 2: Methods for Describing Sets of Data. Slides for optional sections: Section 2.8, Methods for Detecting Outliers (slides 31-34); Section 2.9, Graphing Bivariate Relationships (slide 35); Section 2.10, The Time Series Plot (slide 36).

Objectives: Describe data using graphs. Describe data using charts.

Describing Qualitative Data: Qualitative data are nonnumeric in nature and are best described by using classes. Descriptive measures: class frequency, the number of data points in a class; class relative frequency = (class frequency) / (total number of data points in the data set); class percentage = class relative frequency x 100.

Describing Qualitative Data – Displaying Descriptive Measures: A summary table lists each class with its class frequency and its class percentage (class relative frequency x 100).
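
The three descriptive measures for qualitative data can be sketched in a few lines of Python. This is only an illustration: the function name `summary_table` and the `blood_types` data are hypothetical and do not come from the slides.

```python
from collections import Counter

def summary_table(observations):
    """Return one (class, frequency, relative frequency, percentage) row per class."""
    counts = Counter(observations)      # class frequency for each class
    total = len(observations)           # total number of data points
    rows = []
    for cls, freq in counts.most_common():   # classes in descending frequency
        rel = freq / total                   # class relative frequency
        rows.append((cls, freq, rel, rel * 100))
    return rows

# Hypothetical qualitative data set (not from the slides)
blood_types = ["A", "B", "A", "O", "A", "B"]
table = summary_table(blood_types)
for cls, freq, rel, pct in table:
    print(f"{cls:<6}{freq:>4}{rel:>8.3f}{pct:>8.1f}%")
```

The class frequencies add up to the number of data points, and the class percentages add up to 100.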

Describing Qualitative Data – Qualitative Data Displays Bar Graph

Describing Qualitative Data – Qualitative Data Displays Pie chart

Describing Qualitative Data – Qualitative Data Displays Pareto Diagram

Graphical Methods for Describing Quantitative Data The Data

Graphical Methods for Describing Quantitative Data: For describing, summarizing, and detecting patterns in such data, we can use three graphical methods: dot plots, stem-and-leaf displays, and histograms.

Graphical Methods for Describing Quantitative Data Dot Plot

Graphical Methods for Describing Quantitative Data Stem-and-Leaf Display

Graphical Methods for Describing Quantitative Data Histogram

Graphical Methods for Describing Quantitative Data – More on Histograms
Number of Observations in Data Set    Number of Classes
Less than 25                          5-6
25-50                                 7-14
More than 50                          15-20
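
The rule of thumb above can be turned into a small helper, followed by a count of how many observations fall in each class. This is a minimal sketch: the function names are hypothetical, equal-width classes are an assumption, and the nine-point data set is borrowed from the central tendency example later in the deck.

```python
def suggested_class_count(n_observations):
    """Return the (low, high) range of classes suggested by the rule of thumb."""
    if n_observations < 25:
        return (5, 6)
    if n_observations <= 50:
        return (7, 14)
    return (15, 20)

def class_frequencies(data, n_classes):
    """Count observations in n_classes equal-width classes (assumes max > min)."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_classes
    counts = [0] * n_classes
    for x in data:
        # The largest value would land one class too far, so clamp it into the last class.
        idx = min(int((x - lo) / width), n_classes - 1)
        counts[idx] += 1
    return counts

data = [1, 3, 5, 6, 8, 8, 9, 11, 12]
low, high = suggested_class_count(len(data))
counts = class_frequencies(data, low)
print(low, high, counts)
```

The class with the largest count is the modal class of the resulting histogram.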

Summation Notation: Used to simplify summation instructions. Each observation in a data set is identified by a subscript: $x_1, x_2, x_3, x_4, x_5, \ldots, x_n$. The notation used to sum the above numbers together is $\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$.

Summation Notation Data set of 1, 2, 3, 4 Are these the same? and

Numerical Measures of Central Tendency: Central tendency is the tendency of data to center about certain numerical values. Three commonly used measures of central tendency: mean, median, mode.

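Numerical Measures of Central Tendency – The Mean: The arithmetic average of the elements of the data set. The sample mean is denoted by $\bar{x}$ and the population mean by $\mu$. Calculated as $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ and $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$, where $n$ is the sample size and $N$ is the population size.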

Numerical Measures of Central Tendency – The Median: The middle number when the observations are arranged in order. The median is denoted by $m$. Identified as the $\frac{n+1}{2}$th observation if $n$ is odd, and the mean of the $\frac{n}{2}$th and $\left(\frac{n}{2}+1\right)$th observations if $n$ is even.

Numerical Measures of Central Tendency – The Mode: The most frequently occurring value in the data set. A data set can be multi-modal, that is, have more than one mode. Data displayed in a histogram will have a modal class: the class with the largest frequency.

Numerical Measures of Central Tendency – The data set: 1 3 5 6 8 8 9 11 12. Mean: $\bar{x} = \frac{\sum x_i}{n} = \frac{63}{9} = 7$. Median is the $\frac{9+1}{2}$, or 5th, observation: 8. Mode is 8.
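
The three measures for this data set can be checked with Python's standard `statistics` module. This is a quick sketch rather than part of the original slides; the variable names are illustrative.

```python
import statistics

# Data set from the slide
data = [1, 3, 5, 6, 8, 8, 9, 11, 12]

mean = statistics.mean(data)        # sum of the observations divided by n
median = statistics.median(data)    # middle observation of the ordered data
modes = statistics.multimode(data)  # list of all most frequent values

print(mean, median, modes)
```

`multimode` returns a list, so a multi-modal data set yields more than one value.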

Numerical Measures of Variability: Variability is the spread of the data across possible values. Three commonly used measures of variability: range, variance, standard deviation.

Numerical Measures of Variability – The Range: The largest measurement minus the smallest measurement. Loses sensitivity when data sets are large. These 2 distributions have the same range. How much does the range tell you about the data variability?

Numerical Measures of Variability – The Sample Variance ($s^2$): The sum of the squared deviations from the mean divided by $(n-1)$: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$. Expressed in units squared. Why square the deviations? Because the sum of the deviations from the mean is always zero.

Numerical Measures of Variability – The Sample Standard Deviation ($s$): The positive square root of the sample variance, $s = \sqrt{s^2}$. Expressed in the original units of measurement.
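
The range, sample variance, and sample standard deviation can be computed step by step as follows. This is a sketch that reuses the nine-point data set from the central tendency slide; the variable names are illustrative.

```python
import math

# Data set borrowed from the central tendency example
data = [1, 3, 5, 6, 8, 8, 9, 11, 12]
n = len(data)
mean = sum(data) / n

data_range = max(data) - min(data)        # largest minus smallest measurement

deviations = [x - mean for x in data]     # deviations from the mean
sum_of_deviations = sum(deviations)       # zero, which is why deviations are squared

sample_variance = sum(d ** 2 for d in deviations) / (n - 1)
sample_sd = math.sqrt(sample_variance)    # back in the original units

print(data_range, sum_of_deviations, sample_variance, sample_sd)
```

For this data set the squared deviations sum to 104, so the sample variance is 104 / 8 = 13.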

Numerical Measures of Variability – Samples and Populations: Notation
                      Sample    Population
Variance              $s^2$     $\sigma^2$
Standard Deviation    $s$       $\sigma$

Numerical Measures of Relative Standing: Descriptive measures of the relationship of a measurement to the rest of the data. Common measures: percentile ranking, z-score.

Numerical Measures of Relative Standing: Percentile rankings make use of the pth percentile. The median is an example of a percentile: it is the 50th percentile, so 50% of observations lie above it and 50% lie below it. For any p, the pth percentile has p% of the measurements lying below it and (100-p)% above it.
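
A percentile can be computed with `statistics.quantiles` from the Python standard library. This is a hedged sketch: the helper name `pth_percentile` is hypothetical, and the interpolation between neighboring observations is the library's default convention, which may differ slightly from a given textbook's.

```python
import statistics

def pth_percentile(data, p):
    """Return the pth percentile of data for an integer p from 1 to 99."""
    if not 1 <= p <= 99:
        raise ValueError("p must be between 1 and 99")
    # quantiles(..., n=100) returns the 99 cut points that split the data into 100 groups
    return statistics.quantiles(data, n=100)[p - 1]

# Data set borrowed from the central tendency example
data = [1, 3, 5, 6, 8, 8, 9, 11, 12]
p50 = pth_percentile(data, 50)
print(p50, statistics.median(data))
```

For this data set the 50th percentile and the median agree.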

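Numerical Measures of Relative Standing: z-score – the distance between a measurement x and the mean, expressed in standard deviation units: $z = \frac{x - \bar{x}}{s}$ for a sample, $z = \frac{x - \mu}{\sigma}$ for a population. Use of standard units allows comparison across data sets.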

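Numerical Measures of Relative Standing – More on z-scores: z-scores follow the empirical rule for mounded distributions: approximately 68% of measurements have a z-score between -1 and 1, approximately 95% between -2 and 2, and approximately 99.7% between -3 and 3.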
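
Sample z-scores can be sketched as below, again on the nine-point data set from the central tendency slide. The helper names `z_scores` and `fraction_within` are hypothetical; `fraction_within` simply reports what share of the z-scores lies within k standard deviations of the mean, which can be compared against the empirical rule when the data are mound-shaped.

```python
import statistics

def z_scores(data):
    """Return the sample z-score (x - mean) / s for every observation."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)   # sample standard deviation (divides by n - 1)
    return [(x - mean) / s for x in data]

def fraction_within(zs, k):
    """Fraction of z-scores whose absolute value is at most k."""
    return sum(1 for z in zs if abs(z) <= k) / len(zs)

data = [1, 3, 5, 6, 8, 8, 9, 11, 12]
zs = z_scores(data)
print([round(z, 2) for z in zs])
print(fraction_within(zs, 1), fraction_within(zs, 2))
```

With only nine observations the observed fractions are a rough check at best; the empirical rule describes large, mound-shaped data sets.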

Methods for Detecting Outliers: An outlier is an observation that is unusually large or small relative to the data values being described. Causes: an invalid measurement, a misclassified measurement, or a rare (chance) event. Two detection methods: box plots and z-scores.

Methods for Detecting Outliers: Box plots are based on quartiles, values that divide the data set into 4 groups. Lower quartile $Q_L$: the 25th percentile. Middle quartile: the median. Upper quartile $Q_U$: the 75th percentile. Interquartile range: $\text{IQR} = Q_U - Q_L$.

Methods for Detecting Outliers – Box Plots: [Figure: box plot labeled with $Q_U$ (hinge), $Q_L$ (hinge), the median, the whiskers, and a potential outlier.] Not shown on the plot: the inner and outer fences, which determine potential outliers.

Methods for Detecting Outliers – Rules of thumb. Box plots: measurements between the inner and outer fences are suspect; measurements beyond the outer fences are highly suspect. z-scores: measurements with z-scores greater than 3 in absolute value in mounded distributions (greater than 2 in highly skewed distributions) are considered outliers.
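
The box plot rule of thumb can be sketched as follows. The slides do not state where the fences sit, so this example assumes the common convention of inner fences at 1.5 x IQR and outer fences at 3 x IQR beyond the hinges; both multipliers are parameters. The function name and the data set (the earlier nine-point example plus one added value of 40) are hypothetical, and the quartiles use the default interpolation of `statistics.quantiles`.

```python
import statistics

def classify_with_fences(data, inner=1.5, outer=3.0):
    """Return (value, label) pairs for measurements outside the inner fences.

    Assumes inner fences at inner * IQR and outer fences at outer * IQR
    beyond the lower and upper quartiles.
    """
    q_l, _median, q_u = statistics.quantiles(data, n=4)
    iqr = q_u - q_l
    inner_lo, inner_hi = q_l - inner * iqr, q_u + inner * iqr
    outer_lo, outer_hi = q_l - outer * iqr, q_u + outer * iqr

    flagged = []
    for x in data:
        if x < outer_lo or x > outer_hi:
            flagged.append((x, "highly suspect"))   # beyond the outer fences
        elif x < inner_lo or x > inner_hi:
            flagged.append((x, "suspect"))          # between inner and outer fences
    return flagged

# Hypothetical data: the earlier example with one unusually large value added
data = [1, 3, 5, 6, 8, 8, 9, 11, 12, 40]
flagged = classify_with_fences(data)
print(flagged)
```

Measurements inside the inner fences are not flagged at all.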

Graphing Bivariate Relationships: A bivariate relationship is the relationship between two quantitative variables. Graphically represented with the scattergram.

The Time Series Plot: Time series data are data produced and monitored over time. Graphically represented with the time series plot, with time (or the order of the observations) on the x-axis.

Summary: Graphical methods for qualitative data: pie chart, bar graph, Pareto diagram. Graphical methods for quantitative data: dot plot, stem-and-leaf display, histogram.

Summary: Numerical measures of central tendency: mean, median, mode. Numerical measures of variation: range, variance, standard deviation.

Summary: Measures of relative standing: percentile ranking, z-scores. Methods for detecting outliers: box plots, z-scores. Method for graphing the relationship between two quantitative variables: scattergram (scatterplot).