Descriptive Statistics


Descriptive Statistics
Tabular and Graphical Displays:
Frequency Distribution - List of intervals of values for a variable, and the number of occurrences per interval
Relative Frequency - Proportion (often reported as a percentage) of observations falling in the interval
Histogram/Bar Chart - Graphical representation of a relative frequency distribution
Stem and Leaf Plot - Horizontal tabular display of data, based on 2 digits (stem/leaf)
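
A frequency and relative frequency distribution can be sketched in a few lines of Python. The scores, the interval width of 10, and the helper name frequency_distribution are all made up for illustration; this is one possible way to tabulate the counts, not a prescribed method.

```python
from collections import Counter


def frequency_distribution(values, start, width, n_bins):
    """Count observations in each interval [start + k*width, start + (k+1)*width)."""
    counts = Counter()
    for v in values:
        k = int((v - start) // width)  # index of the interval that contains v
        if 0 <= k < n_bins:
            counts[k] += 1
    total = len(values)
    rows = []
    for k in range(n_bins):
        lo = start + k * width
        hi = lo + width
        freq = counts[k]
        rows.append((lo, hi, freq, freq / total))  # last entry: relative frequency
    return rows


# Hypothetical exam scores (assumed data, not from the slides).
scores = [52, 58, 61, 64, 67, 70, 71, 73, 75, 78, 82, 85, 88, 91, 97]
table = frequency_distribution(scores, start=50, width=10, n_bins=5)

for lo, hi, freq, rel in table:
    # The row of asterisks is a crude text histogram of the frequencies.
    print(f"[{lo}, {hi}): {freq:2d}  {rel:6.1%}  {'*' * freq}")
```

The relative frequencies sum to 1 as long as every observation falls inside one of the intervals.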

Comparing Groups
Goal: Compare 2 (or more) groups with respect to the variable(s) being measured. Do measurements tend to differ among groups?
Side-by-side bar charts
3-dimensional histograms
Back-to-back stem and leaf plots

Sample & Population Distributions
Distributions of Samples and Populations - As samples get larger, the sample distribution gets smoother and looks more like the population distribution
U-shaped - Measurements tend to be large or small, with fewer in the middle range of values
Bell-shaped - Measurements tend to cluster around the middle with few extremes (symmetric)
Skewed Right - Few extreme large values
Skewed Left - Few extreme small values

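Measures of Central Tendency
Mean - Sum of all measurements divided by the number of observations (the value each case would have if the total were distributed evenly among cases). Can be highly influenced by extreme values.
Notation: Sample measurements are labeled Y1,...,Yn, and the sample mean is $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \frac{Y_1 + \cdots + Y_n}{n}$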
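
The sensitivity of the mean to extreme values is easy to see numerically. The sketch below uses made-up incomes in $1000s; the data and the function name sample_mean are assumptions for illustration only.

```python
def sample_mean(ys):
    """Sum of all measurements divided by the number of observations."""
    return sum(ys) / len(ys)


# Hypothetical incomes in $1000s (assumed data, not from the slides).
incomes = [30, 32, 35, 38, 40]
with_extreme = incomes + [400]  # add one extreme value

mean_base = sample_mean(incomes)          # 175 / 5
mean_extreme = sample_mean(with_extreme)  # 575 / 6

print(f"mean without extreme value: {mean_base:.2f}")
print(f"mean with extreme value:    {mean_extreme:.2f}")
```

A single large observation pulls the mean well above every other value in the sample.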

Median, Percentiles, Mode
Median - Middle measurement after data have been ordered from smallest to largest. Appropriate for interval and ordinal scales.
Pth percentile - Value where P% of measurements fall below and (100-P)% lie above. Lower quartile (25th), median (50th), and upper quartile (75th) are often reported.
Mode - Most frequently occurring outcome. Typically reported for ordinal and nominal data.
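
These three summaries can be computed with Python's standard statistics module. The data are made up, and the "exclusive" quantile method is only one of several percentile conventions; other software may return slightly different quartiles for the same data.

```python
import statistics

# Hypothetical measurements (assumed data), already ordered for readability.
ys = [2, 3, 3, 5, 7, 8, 9, 9, 9, 12, 15]

median = statistics.median(ys)

# Cut points that split the ordered data into four parts:
# lower quartile (25th), median (50th), upper quartile (75th).
q1, q2, q3 = statistics.quantiles(ys, n=4, method="exclusive")

# multimode returns every most-frequent value, so ties are not hidden.
modes = statistics.multimode(ys)

print("median:", median)
print("quartiles:", q1, q2, q3)
print("mode(s):", modes)
```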

Measures of Variation
Measures of how similar or different individuals' measurements are
Range - Largest minus smallest observation
Deviation - Difference between the ith individual's outcome and the sample mean: $Y_i - \bar{Y}$
Variance - The variance of n observations Y1,...,Yn is the "average" squared deviation: $s^2 = \frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1}$
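
As a small worked sketch, the range, deviations, and variance can be computed step by step. The five observations are made up for illustration, and the variance divides by n-1, the usual sample-variance convention.

```python
# Hypothetical observations Y1,...,Yn (assumed data, not from the slides).
ys = [4, 8, 6, 2, 10]

n = len(ys)
mean = sum(ys) / n                    # sample mean
data_range = max(ys) - min(ys)        # largest minus smallest observation
deviations = [y - mean for y in ys]   # each outcome minus the sample mean

# "Average" squared deviation, dividing by n - 1 rather than n.
s2 = sum(d ** 2 for d in deviations) / (n - 1)

print("range:", data_range)
print("deviations:", deviations)
print("variance:", s2)
```

The deviations always sum to zero, which is why they are squared before averaging.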

Measures of Variation
Standard Deviation - Positive square root of the variance (measured in the original units): $s = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}{n-1}}$, where $\bar{Y}$ is the sample mean
Properties of the standard deviation:
s ≥ 0, and s equals 0 only if all observations are equal
s increases with the amount of variation around the mean
Division by n-1 (not n) is due to technical reasons (later)
s depends on the units of the data (e.g. $1000s vs $)
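
Two of these properties can be checked directly with statistics.stdev, which divides by n-1. The salary figures are made up for illustration.

```python
import math
import statistics

# Hypothetical salaries in $1000s (assumed data, not from the slides).
salaries_k = [40, 45, 50, 55, 60]
salaries_dollars = [1000 * y for y in salaries_k]  # same data, in dollars

s_k = statistics.stdev(salaries_k)              # standard deviation in $1000s
s_dollars = statistics.stdev(salaries_dollars)  # standard deviation in $

# Changing the units rescales s by the same factor.
print(f"s in $1000s: {s_k:.3f}")
print(f"s in $:      {s_dollars:.1f}")

# s is 0 when every observation is equal.
s_constant = statistics.stdev([5, 5, 5])
print("s for identical observations:", s_constant)
```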

Empirical Rule
If the histogram of the data is approximately bell-shaped, then:
Approximately 68% of measurements lie within 1 standard deviation of the mean.
Approximately 95% of measurements lie within 2 standard deviations of the mean.
Virtually all of the measurements lie within 3 standard deviations of the mean.
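
The rule can be checked on simulated bell-shaped data. The sketch below draws 10,000 values from a normal distribution with an assumed mean of 170 and standard deviation of 10 (arbitrary choices), then counts the fraction within 1, 2, and 3 sample standard deviations of the sample mean. The exact fractions depend on the random seed.

```python
import random
import statistics


def fraction_within(ys, k):
    """Fraction of measurements within k standard deviations of the mean."""
    mean = statistics.mean(ys)
    s = statistics.stdev(ys)
    inside = sum(1 for y in ys if abs(y - mean) <= k * s)
    return inside / len(ys)


rng = random.Random(0)  # fixed seed so the run is repeatable
heights = [rng.gauss(170, 10) for _ in range(10_000)]  # simulated data

fractions = {k: fraction_within(heights, k) for k in (1, 2, 3)}
for k, frac in fractions.items():
    print(f"within {k} standard deviation(s): {frac:.1%}")
```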

Other Measures and Plots
Interquartile Range (IQR) - 75th percentile minus 25th percentile (measures the spread in the middle 50% of the data)
Box Plots - Display a box containing the middle 50% of measurements, with a line at the median and lines extending from the box. Breaks the data into four quarters.
Outliers - Observations falling more than 1.5 × IQR above the upper quartile or below the lower quartile
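
The IQR and the 1.5 × IQR outlier rule can be sketched as follows. The data are made up, and the quartiles use the "exclusive" method of statistics.quantiles, which is one of several conventions, so other software may place the fences slightly differently.

```python
import statistics

# Hypothetical measurements with one unusually large value (assumed data).
ys = [10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 45]

q1, median, q3 = statistics.quantiles(ys, n=4, method="exclusive")
iqr = q3 - q1  # spread of the middle 50% of the data

# Fences: 1.5 * IQR below the lower quartile and above the upper quartile.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [y for y in ys if y < lower_fence or y > upper_fence]

print("quartiles:", q1, median, q3)
print("IQR:", iqr)
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

The five values a box plot is typically drawn from (minimum, lower quartile, median, upper quartile, maximum) are all available from the same quantities.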

Sample Statistics/Population Parameters
The sample mean and standard deviation are the most commonly reported summaries of sample data. They are random variables, since they change from one sample to another.
The population mean (μ) and standard deviation (σ), computed from a population of measurements, are fixed (though unknown in practice) values called parameters.
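
The distinction can be illustrated with a simulated population. Everything here is an assumption for illustration: a population of 5,000 uniform values and five samples of size 25. The population mean and standard deviation are computed once and stay fixed, while the sample mean changes with every sample.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Simulated population (in practice the full population is rarely observed).
population = [rng.uniform(0, 100) for _ in range(5000)]
mu = statistics.mean(population)       # population mean: a fixed parameter
sigma = statistics.pstdev(population)  # population SD (divides by N)

# Draw several samples; each gives a different sample mean and SD.
sample_means = []
for _ in range(5):
    sample = rng.sample(population, 25)
    sample_means.append(statistics.mean(sample))
    s = statistics.stdev(sample)  # sample SD (divides by n - 1)
    print(f"sample mean = {sample_means[-1]:.2f}, sample SD = {s:.2f}")

print(f"population mean = {mu:.2f}, population SD = {sigma:.2f}")
```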