Introduction to Descriptive Statistics 17.871 Spring 2007.


Introduction to Descriptive Statistics Spring 2007

First, Some Words about Graphical Presentation
Aspects of graphical integrity (following Edward Tufte, The Visual Display of Quantitative Information):
- Represent numbers in direct proportion to the numerical quantities presented
- Write clear labels on the graph
- Show data variation, not design variation
- Deflate and standardize money in time series

Population vs. Sample Notation
Population: Greek letters (e.g., σ, β)
Sample: Roman letters (s, b)

Types of Variables
Nominal (qualitative), “categorical”
~Nominal (quantitative)
Ordinal
Interval or ratio

Describing data

          Moment                          Non-mean-based measure
Center    Mean                            Mode, median
Spread    Variance (standard deviation)   Range, interquartile range
Skew      Skewness                        --
Peaked    Kurtosis                        --

Mean
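The formula for this slide is not in the transcript. As a sketch under the Greek/Roman convention from the notation slide, the standard definitions would read as follows; the symbols $\mu$, $\bar{x}$, $N$, $n$, and $x_i$ are assumptions rather than the lecture's own notation.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad\text{(population)}
\qquad\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad\text{(sample)}
```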

Variance, Standard Deviation

Variance, S.D. of a Sample Degrees of freedom
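The formulas for the two variance slides are not in the transcript. The standard textbook forms are sketched below, with assumed symbols ($N$ and $n$ for population and sample size, $\mu$ and $\bar{x}$ for the means). The sample version divides by $n-1$ because one degree of freedom is used up in estimating $\bar{x}$ from the same data.

```latex
\begin{align*}
\sigma^2 &= \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2,
  & \sigma &= \sqrt{\sigma^2}
  && \text{(population)} \\
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2,
  & s &= \sqrt{s^2}
  && \text{(sample, } n-1 \text{ degrees of freedom)}
\end{align*}
```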

The z-score or the “standardized score”
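The usual definition of the standardized score, written with assumed sample symbols $\bar{x}$ and $s$, is sketched below. It expresses each observation as a number of standard deviations above or below the mean, so the standardized variable has mean 0 and standard deviation 1.

```latex
z_i = \frac{x_i - \bar{x}}{s}
```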

Skewness: Symmetrical distribution
Examples: IQ, SAT
“No skew”, “zero skew”, symmetrical

Skewness: Asymmetrical distribution
Example: GPA of MIT students
“Negative skew”, “left skew”

Skewness: Asymmetrical distribution
Examples: income, contributions to candidates, populations of countries, “residual vote” rates
“Positive skew”, “right skew”

Skewness
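The formula on this slide is not in the transcript. One common moment-based definition is sketched here; the symbols $m_r$ and $g_1$ are assumptions, and other textbooks apply small-sample corrections. A positive value corresponds to a long right tail and a negative value to a long left tail.

```latex
m_r = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^r,
\qquad
g_1 = \frac{m_3}{m_2^{3/2}}
```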

Kurtosis: leptokurtic, platykurtic, mesokurtic
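A moment-based definition of kurtosis, sketched with the assumed symbols $m_r$ for the $r$-th central moment and $g_2$ for the kurtosis itself. On this scale the normal distribution scores 3, which gives the three labels their usual meaning.

```latex
m_r = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^r,
\qquad
g_2 = \frac{m_4}{m_2^{2}}
\qquad
\begin{cases}
g_2 > 3 & \text{leptokurtic (heavier tails, sharper peak)} \\
g_2 = 3 & \text{mesokurtic (normal)} \\
g_2 < 3 & \text{platykurtic (lighter tails, flatter peak)}
\end{cases}
```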

A few words about the normal curve
Skewness = 0
Kurtosis = 3

More words about the normal curve
Area under the curve between the mean and 1, 2, and 3 standard deviations on one side: 34%, 47%, 49%
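The three percentages appear to be the areas under the normal curve between the mean and 1, 2, and 3 standard deviations on one side. Writing $\Phi$ for the standard normal cumulative distribution function (an assumed symbol), the values are approximately:

```latex
\begin{align*}
\Phi(1)-\Phi(0) &\approx 0.3413 \\
\Phi(2)-\Phi(0) &\approx 0.4772 \\
\Phi(3)-\Phi(0) &\approx 0.4987
\end{align*}
```

Doubling each gives the familiar two-sided figures of roughly 68%, 95%, and 99.7%.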

Binary data
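The content of this slide is not in the transcript. A standard result that fits here, sketched with the assumed symbol $p$ for the share of observations coded 1: when $x_i \in \{0, 1\}$, the mean and variance depend only on $p$ (the variance below uses the divisor $n$).

```latex
\bar{x} = p,
\qquad
\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2 = p(1-p),
\qquad
\text{s.d.} = \sqrt{p(1-p)}
```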

Commands in STATA for getting univariate statistics
summarize varname
summarize varname, detail
histogram varname, bin() start() width() density/fraction/frequency normal
graph box varnames
tabulate [NB: compare to table]
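A minimal sketch of these commands in use. It assumes the auto dataset that ships with Stata rather than the lecture's data, so the variables mpg, trunk, and foreign are stand-ins. The last line uses tabstat for group means as an assumption of convenience, because the syntax of table differs across Stata versions.

```stata
* Load the example dataset that ships with Stata (not the lecture's data)
sysuse auto, clear

* Number of observations, mean, s.d., min, max
summarize mpg

* Adds percentiles, variance, skewness, and kurtosis
summarize mpg, detail

* Histogram on the fraction scale, bins 2 mpg wide, normal curve overlaid
histogram mpg, width(2) fraction normal

* Box plots of two variables side by side
graph box mpg trunk

* One-way frequency table of a categorical variable
tabulate foreign

* Mean of mpg within each category of foreign
tabstat mpg, by(foreign) statistics(mean)
```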

Example of Sophomore Test Scores
High School and Beyond, 1980: A Longitudinal Survey of Students in the United States (ICPSR Study 7896)
totalscore = % of questions answered correctly minus penalty for guessing
recodedtype = (1 = public school, 2 = religious private, 3 = non-sectarian private)

Explore totalscore some more
. table recodedtype, c(mean totalscore)
(output: mean(totalscore) by recodedtype; values not preserved)

Graph totalscore
. hist totalscore

Divide into “bins” so that each bar represents 1% correct
. hist totalscore, width(.01)
(bin=124, start= , width=.01)

Add ticks at each 10% mark
. histogram totalscore, width(.01) xlabel(-.2 (.1) 1)
(bin=124, start= , width=.01)

Superimpose the normal curve (with the same mean and s.d. as the empirical distribution)
. histogram totalscore, width(.01) xlabel(-.2 (.1) 1) normal
(bin=124, start= , width=.01)

Do the previous graph by school types
. histogram totalscore, width(.01) xlabel(-.2 (.1) 1) by(recodedtype)
(bin=124, start= , width=.01)

Main issues with histograms
- Proper level of aggregation
- Non-regular data categories

A note about histograms with unnatural categories
From the Current Population Survey (2000), Voter and Registration Survey
How long (have you/has name) lived at this address?
-9 No Response
-3 Refused
-2 Don't know
-1 Not in universe
1 Less than 1 month
… months
… months
… years
… years
6 5 years or longer

Solution, Step 1
Map each artificial category onto its “natural” midpoint (in years)
-9 No Response -> missing
-3 Refused -> missing
-2 Don't know -> missing
-1 Not in universe -> missing
1 Less than 1 month -> 1/24 ≈ .042
… months -> 3.5/12 ≈ .29
… months -> 9/12 = .75
… years -> …
… years -> …
6 5 years or longer -> 10 (arbitrary)
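The mapping could be sketched in a Stata do-file as follows. The name rawlongevity is hypothetical (the transcript does not give the raw variable's name), and assigning codes 2 and 3 to the two month categories is an assumption based on their order in the list. The midpoints for the two intermediate year categories are not legible in the transcript, so they are left out here rather than guessed.

```stata
* rawlongevity is a hypothetical name for the raw CPS item.
* Start with everything missing, so the negative codes
* (-9, -3, -2, -1) stay missing without further work.
generate longevity = .

* Less than 1 month: midpoint of half a month, in years
replace longevity = 1/24 if rawlongevity == 1

* First "months" category (code 2 assumed)
replace longevity = 3.5/12 if rawlongevity == 2

* Second "months" category (code 3 assumed)
replace longevity = 9/12 if rawlongevity == 3

* The two intermediate "years" categories would be mapped to
* their midpoints here; those values are not in the transcript.

* 5 years or longer: arbitrary value
replace longevity = 10 if rawlongevity == 6

* Plot the recoded variable on the fraction scale
histogram longevity, fraction
```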

Graph of recoded data
. histogram longevity, fraction

Density plot of data
Total area of last bar = .557
Width of bar = 11 (arbitrary)
Solve for h: a = w h, or .557 = 11h => h = .051
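The same step in general form, as a sketch: on a density scale each bar's area equals the fraction of observations in its category, so the height is that fraction divided by the bar's width. The symbols $a_k$, $w_k$, and $h_k$ for bar $k$ are assumed notation.

```latex
h_k = \frac{a_k}{w_k},
\qquad
\sum_k a_k = \sum_k w_k h_k = 1,
\qquad
\text{last bar: } h = \frac{.557}{11} \approx .051
```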

Density plot template
Columns: Category, Fraction, X-min, X-max, X-length, Height (density)
Rows include < 1 mo (height marked *) and 1-6 mo; the remaining cell values are not preserved
* = .0156/.082

Draw the previous graph with a box plot
. graph box totalscore
Labels on the plot: upper quartile, median, lower quartile (box = inter-quartile range); whiskers = 1.5 x IQR
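The labels on the box plot correspond to the following quantities, sketched with assumed symbols $Q_1$ and $Q_3$ for the lower and upper quartiles. Each whisker runs to the most extreme observation that still lies inside its fence, and points beyond the fences are drawn individually.

```latex
\mathrm{IQR} = Q_3 - Q_1,
\qquad
\text{upper fence} = Q_3 + 1.5\,\mathrm{IQR},
\qquad
\text{lower fence} = Q_1 - 1.5\,\mathrm{IQR}
```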

Draw the box plots for the different types of schools
. graph box totalscore, by(recodedtype)

Draw the box plots for the different types of schools using the “over” option
. graph box totalscore, over(recodedtype)

Three words about pie charts: don’t use them

So, what’s wrong with them?
- For non-time-series data, it is hard to compare groups; the eye is very bad at judging the relative size of circle slices
- For time-series data, it is hard to grasp cross-time comparisons