Descriptive statistics (Part I)


Lecture 2: Descriptive statistics (Part I)

Lecture 2: Descriptive statistics
Data in raw form are usually not easy to use for decision making. Some type of organization is needed:
- Table
- Graph
Techniques reviewed here:
- Bar charts and pie charts
- Ordered array
- Stem-and-leaf display
- Frequency distributions, histograms
- Cumulative distributions
- Contingency tables

Tabulating and Graphing Univariate Categorical Data
- Tabulating data: summary table
- Graphing data: bar charts, pie charts

Summary Table (for an Investor's Portfolio)

Investment Category    Amount (in thousands $)    Percentage
Stocks                  46.5                       42.27
Bonds                   32                         29.09
CD                      15.5                       14.09
Savings                 16                         14.55
Total                  110                        100

Variables are categorical.
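The percentage column is each amount divided by the total. A minimal Python sketch of that arithmetic follows; the dictionary name `portfolio` is an assumption made for illustration, and the amounts are the ones in the table.

```python
# Amounts in thousands of dollars, copied from the summary table.
# The variable names are illustrative, not part of the original slides.
portfolio = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}

total = sum(portfolio.values())

# Percentage of the total for each category, rounded to two decimals.
percentages = {name: round(100 * amount / total, 2)
               for name, amount in portfolio.items()}

for name, amount in portfolio.items():
    print(f"{name:<8} {amount:>6.1f} {percentages[name]:>7.2f}")
print(f"{'Total':<8} {total:>6.1f}")
```

Rounding these values to whole percents gives the labels used on the pie chart.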

Bar Chart (for an Investor’s Portfolio)

Pie Chart (for an Investor's Portfolio)
Amount invested in K$: Stocks 42%, Bonds 29%, Savings 15%, CD 14%
Percentages are rounded to the nearest percent.

Organizing Numerical Data
Raw data: 41, 24, 32, 26, 27, 27, 30, 24, 38, 21
- Ordered array: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
- Stem-and-leaf display:
    2 | 1 4 4 6 7 7
    3 | 0 2 8
    4 | 1
- Frequency distributions and cumulative distributions: tables, histograms

The Ordered Array
Data in raw form (as collected): 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Data in ordered array from smallest to largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
- Shows the range (min to max)
- May help identify outliers (unusual observations)
- If the data set is large, the ordered array is less useful

Stem-and-Leaf Display
A simple way to see distribution details in a data set.
Method: separate each value in the sorted data series into its leading digits (the stem) and its trailing digit (the leaf).

Example
Data in raw form (as collected): 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
Data in ordered array from smallest to largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
Stem-and-leaf display:
    2 | 1 4 4 6 7 7
    3 | 0 2 8
    4 | 1
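The method can be sketched in a few lines of Python. This is an illustration that assumes non-negative integers with the units digit as the leaf; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group integers by tens (stem) and collect units digits (leaves).

    Assumes non-negative integers; the leaf is always the last digit.
    """
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    return dict(groups)


raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
display = stem_and_leaf(raw)

for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem} | {leaves}")
```

Sorting before grouping keeps the leaves in ascending order within each stem, which is what makes the display readable as a sideways histogram.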

Tabulating Numerical Data: Frequency Distributions
What is a frequency distribution?
- A frequency distribution is a list or a table containing class groupings (ranges within which the data fall) and the corresponding frequencies with which data fall within each grouping or category
- It allows for a quick visual interpretation of the data

Tabulating Numerical Data: Frequency Distributions
Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27

Steps for building the frequency distribution:
1. Sort the raw data in ascending order: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
2. Find the range: 58 - 12 = 46
3. Select the number of classes: 5 (usually between 5 and 15)
4. Compute the class interval (width): 10 (46/5 = 9.2, then round up)
5. Determine the class boundaries (limits): 10, 20, 30, 40, 50, 60
6. Count the observations and assign them to classes

Frequency Distributions, Relative Frequency Distributions and Percentage Distributions
Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class       Frequency    Relative Frequency    Percentage
[10, 20)     3            .15                   15
[20, 30)     6            .30                   30
[30, 40)     5            .25                   25
[40, 50)     4            .20                   20
[50, 60)     2            .10                   10
Total       20           1                     100
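The tabulation can be reproduced with a short Python sketch. Two choices here are assumptions rather than something the slides state: the width is rounded up with `math.ceil`, and the first boundary is the largest multiple of the width that does not exceed the minimum.

```python
import math

temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

data_range = max(temps) - min(temps)           # 58 - 12 = 46
num_classes = 5
width = math.ceil(data_range / num_classes)    # 9.2 rounded up to 10

# Assumption: align the first boundary to a multiple of the width.
start = (min(temps) // width) * width          # 10
edges = [start + i * width for i in range(num_classes + 1)]

# Classes are half-open intervals [lower, upper), as in the table.
classes = list(zip(edges, edges[1:]))
counts = [sum(lower <= t < upper for t in temps) for lower, upper in classes]

n = len(temps)
for (lower, upper), count in zip(classes, counts):
    print(f"[{lower}, {upper})  {count:>2}  {count / n:.2f}  {100 * count / n:.0f}")
```

Half-open classes guarantee that a value sitting exactly on a boundary, such as 30, is counted once, in the class that starts at that boundary.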

Graphing Numerical Data: The Histogram
- A graph of the data in a frequency distribution is called a histogram
- The class boundaries (or class midpoints) are shown on the horizontal axis
- The vertical axis is either frequency, relative frequency, or percentage
- Bars of the appropriate heights are used to represent the number of observations within each class

Histogram Example

Class       Midpoint    Frequency
[10, 20)    15          3
[20, 30)    25          6
[30, 40)    35          5
[40, 50)    45          4
[50, 60)    55          2

[Figure: histogram of frequency against class midpoints, with no gaps between bars]

Tabulating Numerical Data: Cumulative Frequency
Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Upper Limit    Cumulative Frequency    Cumulative % Frequency
10              0                        0
20              3                       15
30              9                       45
40             14                       70
50             18                       90
60             20                      100
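Each cumulative frequency is the running total of the class frequencies below that upper limit. A minimal sketch using `itertools.accumulate`, with the class counts [3, 6, 5, 4, 2] for the classes [10, 20) through [50, 60); the variable names are illustrative.

```python
from itertools import accumulate

upper_limits = [10, 20, 30, 40, 50, 60]
class_counts = [3, 6, 5, 4, 2]   # frequencies of [10, 20) ... [50, 60)

# No observation lies below the first limit, so the running total starts at 0.
cumulative = [0] + list(accumulate(class_counts))
n = cumulative[-1]
cumulative_pct = [100 * c / n for c in cumulative]

for limit, c, pct in zip(upper_limits, cumulative, cumulative_pct):
    print(f"{limit:>3}  {c:>2}  {pct:>5.0f}")
```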

Two categorical variables (contingency table)
The following data represent the responses to two questions asked in a survey of 20 college students majoring in business:
- What is your gender? (Male = M; Female = F)
- What is your major? (Accountancy = A; Information Systems = I; Marketing = M)
Gender: M M M F M F F M F M F M M M M F F M F F
Major:  A I I M A I A A I I A A A M I M A A A I

Contingency table (cont'd)
Raw data set:
Gender: M M M F M F F M F M F M M M M F F M F F
Major:  A I I M A I A A I I A A A M I M A A A I

           A     I     M    Total
Male       6     4     1     11
Female     4     3     2      9
Total     10     7     3     20
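The cell counts can be checked by pairing each student's gender with their major and counting the pairs. A sketch with `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter

gender = "M M M F M F F M F M F M M M M F F M F F".split()
major = "A I I M A I A A I I A A A M I M A A A I".split()

# One (gender, major) pair per student; Counter tallies each combination.
cells = Counter(zip(gender, major))
row_totals = Counter(gender)
col_totals = Counter(major)

majors = ["A", "I", "M"]
print("        " + "  ".join(f"{m:>3}" for m in majors) + "  Total")
for g, label in [("M", "Male"), ("F", "Female")]:
    row = "  ".join(f"{cells[(g, m)]:>3}" for m in majors)
    print(f"{label:<8}{row}  {row_totals[g]:>5}")
totals = "  ".join(f"{col_totals[m]:>3}" for m in majors)
print(f"{'Total':<8}{totals}  {len(gender):>5}")
```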

Graphical methods are:
- Good for presenting data
- Not easy to use for comparison
- Difficult to use for statistical inference

Numerical description
Summary measures:
- Central tendency (location measures): mean, median, mode
- Quartiles
- Variation: range, interquartile range, variance, standard deviation

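Mean
Mean (arithmetic mean) of data values:
- Sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$, where $n$ is the sample size
- Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$, where $N$ is the population size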

An example
TV watching hours/week: 5, 7, 3, 38, 7
Mean = (5 + 7 + 3 + 38 + 7)/5 = 60/5 = 12
If the correct time for the 4th subject is 8 (not 38):
Mean = (5 + 7 + 3 + 8 + 7)/5 = 30/5 = 6
[Figure: the two data sets on a number line, with Mean = 12 and Mean = 6 marked]
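The same comparison in Python, using `statistics.mean` from the standard library; the list names are illustrative.

```python
from statistics import mean

recorded = [5, 7, 3, 38, 7]    # 4th subject recorded as 38
corrected = [5, 7, 3, 8, 7]    # 4th subject corrected to 8

mean_recorded = mean(recorded)
mean_corrected = mean(corrected)

print(mean_recorded)    # 60 / 5
print(mean_corrected)   # 30 / 5
```

A single mistyped observation doubles the mean here, which is the sensitivity to extreme values that the example illustrates.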

Mean (cont'd)
- The most common measure of central tendency, especially when n is large, due to its good theoretical properties
- Affected by extreme values (outliers)

Median
- Robust measure of central tendency
- Not affected by extreme values
- In an ordered array, the median is the "middle" number
- If n is odd, the median is the middle number (i.e., the (n+1)/2-th measurement)
- If n is even, the median is the average of the n/2-th and (n/2 + 1)-th measurements
Example: for the TV watching data in order, 3, 5, 7, 7, 38, the median is 7; with the corrected value, 3, 5, 7, 7, 8, the median is still 7.
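The odd/even rule translates directly into code. A minimal sketch; the function name `median` is chosen here for illustration, and the positions in the comments are 1-based to match the rule.

```python
def median(values):
    """Return the median using the odd/even rule for an ordered array."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the (n + 1)/2-th measurement (index mid in 0-based terms).
        return ordered[mid]
    # n even: average of the n/2-th and (n/2 + 1)-th measurements.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 7, 3, 38, 7]))   # recorded TV hours
print(median([5, 7, 3, 8, 7]))    # corrected TV hours
```

Both calls return 7, while the mean of the same two data sets moves from 12 to 6.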

Mode
- A measure of central tendency
- Value that occurs most often
- Not affected by extreme values
- There may not be a mode
- There may be several modes
- Used for either numerical or categorical data
[Figure: two number-line plots, one with no mode (axis 0 to 6) and one with Mode = 9 (axis 0 to 14)]
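A small sketch of these properties in Python. The function name `modes` is hypothetical, and treating a data set in which no value repeats as having no mode is an assumption made here. The data behind the slide's figures are not given, so the lists below are illustrative, apart from the TV watching hours.

```python
from collections import Counter


def modes(values):
    """Return all most frequent values, or [] when no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # Assumption: every value occurring once means there is no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)


print(modes([0, 1, 2, 3, 4, 5, 6]))        # no value repeats
print(modes([5, 7, 3, 38, 7]))             # TV watching hours
print(modes([1, 1, 2, 2, 3]))              # two values tie
print(modes(["A", "I", "A", "M", "A"]))    # categorical data
```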

Quartiles
- Split ordered data into 4 quarters, each containing 25% of the observations
- Position of the i-th quartile: $Q_i$ is at position $\frac{i(n+1)}{4}$ in the ordered array
- $Q_1$ (1st quartile) and $Q_3$ (3rd quartile) are measures of noncentral location
- $Q_1$, $Q_2$ (the median), and $Q_3$ are called the 25th, 50th, and 75th percentiles, respectively
- A pth percentile is the value of X such that p% of the measurements are less than X and (100-p)% are greater than X

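Quartiles (example)
Data in ordered array: 3 6 6 12 12 12 15 15 18 21 (n = 10)
Position of first quartile is $1(10+1)/4 = 2.75$, so $Q_1 = 6$ (the 2nd and 3rd values are both 6)
Position of third quartile is $3(10+1)/4 = 8.25$, so $Q_3 = 15 + 0.25(18 - 15) = 15.75$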

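5-number summary
$X_{smallest}$, $Q_1$, Median ($Q_2$), $Q_3$, $X_{largest}$
Data in ordered array: 3 6 6 12 12 12 15 15 18 21
5-number summary: 3, 6, 12, 15.75, 21

Box-and-Whisker Plot
Graphical display of the data using the 5-number summary.
[Figure: box-and-whisker plot running from $X_{smallest}$ = 3 to $X_{largest}$ = 21, with the box from 6 to 15.75 and the median at 12]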
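The 5-number summary can be reproduced with a short Python sketch. It assumes the quartile position i(n+1)/4 with linear interpolation between neighbouring values when the position is fractional, which is the convention consistent with the value 15.75 above; some textbooks round the position instead. The function names are illustrative.

```python
def quartile(ordered, i):
    """i-th quartile of a sorted list, at 1-based position i * (n + 1) / 4.

    Assumption: fractional positions are linearly interpolated.
    """
    n = len(ordered)
    position = i * (n + 1) / 4
    lower = int(position)            # 1-based index of the value just below
    fraction = position - lower
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest)."""
    ordered = sorted(values)
    return (ordered[0], quartile(ordered, 1), quartile(ordered, 2),
            quartile(ordered, 3), ordered[-1])


data = [3, 6, 6, 12, 12, 12, 15, 15, 18, 21]
summary = five_number_summary(data)
print(summary)
```

The interquartile range follows as Q3 minus Q1, here 15.75 - 6 = 9.75, which is the length of the box in the box-and-whisker plot.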