Part 1 – Data Presentation Statistics and Data Analysis.


1 Part 1 – Data Presentation Statistics and Data Analysis

2 Part 1 – Data Presentation

3 Data Presentation Agenda
  Data and Data Types
  Representing Data: pie chart, bar chart
  Summarizing Data: box plot, histogram
  Central tendency
  Spread
  Distribution (shape)

4 Data = A Set of Facts. A picture of some aspect of the world. [Figure: Pizza Sales by Type] What do the data tell you? How can you use the information? What additional information would make these data more informative?

5 A More Complicated Set of Facts: What story do the data tell?

6 Data Types and Measurement
  Univariate vs. Multivariate
  Quantitative
    Discrete = count: Number of shootings by city by time
    Continuous = measurement: Housing prices
  Qualitative
    Categorical: Shopping mall, car brand, trip mode
    Ordinal: Survey data on attitudes; “How do you feel about…?” Strongly disagree, Disagree, Neutral, Agree, Strongly agree. Moody’s bond ratings: Aaa, Aa, A, Baa, Ba, B, and so on.
  Frameworks
    Cross section
    Time series
    Longitudinal

7 Univariate vs. Multivariate. Univariate: Count of pizzas is the single variable. Multivariate: Numerous variables.

8 Discrete Data – US Crime Statistics; Counts of Occurrences.

9 Continuous Data: Housing Prices and Incomes

10 Unordered Qualitative Data: Travel Mode by 210 Travelers* (* Note: Not computed with Minitab)

11 Ordered Qualitative Data: German Health Satisfaction Survey; 5,831 Women. On a scale from 0 to 10, how do you feel about your health?*

HEALTH SATISFACTION, N = 5831
Response  Frequency
===================
       0         97
       1         52
       2        147
       3        287
       4        346
       5        935
       6        631
       7        924
       8       1329
       9        626
      10        457

* Note: Not computed with Minitab
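Because the responses are ordered, the median and the mode are meaningful summaries of this table. The sketch below uses the frequencies from the slide; the function name `ordinal_median` is illustrative and not part of the original deck.

```python
# Frequencies copied from the health satisfaction table (response -> count).
health_counts = {0: 97, 1: 52, 2: 147, 3: 287, 4: 346, 5: 935,
                 6: 631, 7: 924, 8: 1329, 9: 626, 10: 457}


def ordinal_median(counts):
    """Return the response category that contains the middle observation.

    When the total is even, this picks the lower of the two middle observations.
    """
    total = sum(counts.values())
    middle = (total + 1) // 2
    running = 0
    for response in sorted(counts):
        running += counts[response]
        if running >= middle:
            return response


total_responses = sum(health_counts.values())
median_response = ordinal_median(health_counts)
modal_response = max(health_counts, key=health_counts.get)

print("N =", total_responses)
print("Median response:", median_response)
print("Most frequent response:", modal_response)
```

The median is found by walking up the ordered categories until the running count reaches the middle observation, which only makes sense because the categories have an order.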

12 Problems with Ordered Survey Response Data
Stern Students’ Ranking of Subway Safety (1994)*
Response scale: Very Unsatisfactory, Unsatisfactory, OK, Satisfactory, Very Satisfactory

Safety  Count  Percent  Cum Pct
     1     17    27.87    27.87
     2     15    24.59    52.46
     3     17    27.87    80.33
     4     10    16.39    96.72
     5      2     3.28   100.00
N = 61

* Jeff Simonoff: Data Presentation and Summary, pp. 3-4
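The Percent and Cum Pct columns follow directly from the counts. A minimal sketch of that arithmetic, using the counts from the subway safety table (variable names are illustrative):

```python
from itertools import accumulate

# Counts for safety ratings 1 through 5, taken from the table.
safety_counts = [17, 15, 17, 10, 2]
n_students = sum(safety_counts)

# Percent: each count as a share of the total.
percent = [round(100 * c / n_students, 2) for c in safety_counts]

# Cumulative percent: running total of counts as a share of the total.
cum_percent = [round(100 * c / n_students, 2) for c in accumulate(safety_counts)]

for rating, (count, pct, cum) in enumerate(zip(safety_counts, percent, cum_percent), start=1):
    print(f"{rating:>6} {count:>6} {pct:>8.2f} {cum:>8.2f}")
print("N =", n_students)
```

Computing the cumulative column from cumulative counts, rather than by summing rounded percentages, avoids accumulating rounding error.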

13 Quantitative vs. Qualitative Data
  Qualitative Data:
    No units of measurement.
    Arithmetic manipulation is usually meaningless. The average of Air and Bus is not Train.
  Quantitative Data:
    Units of measurement make sense.
    Arithmetic computations make sense.
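For qualitative data, counting is the operation that does make sense. The sketch below tallies a small hypothetical sample of travel modes (invented for illustration, not the 210-traveler data set) and reports the mode, the most frequent category.

```python
from collections import Counter

# Hypothetical sample of travel modes; the values are made up for illustration.
trips = ["Air", "Bus", "Train", "Car", "Car", "Train", "Car", "Air"]

# Counting categories is meaningful even though averaging them is not.
mode_counts = Counter(trips)
most_common_mode, most_common_count = mode_counts.most_common(1)[0]

print(dict(mode_counts))
print("Most frequent travel mode:", most_common_mode)
```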

14 Cross Section Data: Housing Prices and Incomes

15 Time Series Data: Car Thefts

16 Longitudinal Data: 3 Year Survey: Satisfaction on a scale from 0 to 5.

17 Representing Data
  In raw form
  Transformed to a visual form
  Summarized graphically
  Summarized statistically

18 Housing Prices and Incomes

19 Housing Price Data: Visual Representation. www.trulia.com/home_prices/

20 Pie Chart: Pizza Pies Sold, by Type

21 Data Representation. Same data, shown as a bar chart and as a pie chart. Which is easier to understand?

22 A Box Plot Describes the Distribution of Values in a Set of Data. [Figure: Box and Whisker Plot for House Price Listings; Hawaii is labeled] What is an outlier? Why do we believe a particular point is an outlier?

23 Making a Box Plot for Per Capita Income
  Maximum = 31136
  3rd Quartile = 24933 (approx.)
  Median = 22610
  1st Quartile = 21677 (approx.)
  Minimum = 17043
  Interquartile Range = IQR = 24933 - 21677 = 3256
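One common convention for flagging outliers in a box plot, assumed here rather than stated on the slide, is to mark any point more than 1.5 × IQR beyond the quartiles. A short sketch applying that rule to the per capita income summary above:

```python
# Five-number summary from the slide (quartiles are approximate).
minimum, q1, median_income, q3, maximum = 17043, 21677, 22610, 24933, 31136

iqr = q3 - q1

# Assumed convention: fences sit 1.5 * IQR outside the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

max_is_outlier = maximum > upper_fence
min_is_outlier = minimum < lower_fence

print("IQR =", iqr)
print("Fences:", lower_fence, upper_fence)
print("Maximum flagged as outlier:", max_is_outlier)
print("Minimum flagged as outlier:", min_is_outlier)
```

Under this rule the maximum lies above the upper fence and would be drawn as a separate point, while the minimum lies inside the lower fence.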

24 A Frequency Distribution

25 Histogram for House Price Listings. A histogram describes the sample data and suggests the nature of the underlying data generating process. Note the “skewness” of the distribution of listings. (HOG, pp. 16-18)
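A histogram is a frequency distribution over equal-width intervals. The sketch below bins a small set of hypothetical listing prices (in thousands of dollars, invented for illustration) and prints a text histogram; the function name and bin settings are assumptions, not taken from the slides.

```python
def frequency_distribution(values, start, width, n_bins):
    """Count how many values fall in each interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:  # values outside the covered range are ignored
            counts[index] += 1
    return counts


# Hypothetical listing prices in $ thousands; not the data behind the slide.
listings = [120, 135, 150, 155, 160, 180, 210, 240, 310, 450, 720, 990]

bin_start, bin_width, bin_count = 100, 200, 5
bin_counts = frequency_distribution(listings, bin_start, bin_width, bin_count)

for i, count in enumerate(bin_counts):
    low = bin_start + i * bin_width
    print(f"{low:>5} to {low + bin_width:<5} {'#' * count}")
```

Most of the hypothetical listings pile up in the lowest interval with a few far to the right, which is the right-skewed shape the slide points out.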

26 Distribution of House Price Listings. The asymmetry (skewness) in the histogram of listing prices also shows up in the box and whisker plot. Note the long whisker at the top of the figure.
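Skewness can also be checked numerically. Two rough indicators are a mean that sits above the median and a positive moment coefficient of skewness. The sketch below uses hypothetical prices (in thousands of dollars, invented for illustration) and a simple population-moment formula, which is one of several definitions in use.

```python
from statistics import mean, median


def skewness(values):
    """Moment coefficient of skewness: third central moment over the 1.5 power of the second."""
    center = mean(values)
    n = len(values)
    m2 = sum((v - center) ** 2 for v in values) / n
    m3 = sum((v - center) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical listing prices in $ thousands; not the data behind the slide.
prices = [120, 135, 150, 155, 160, 180, 210, 240, 310, 450, 720, 990]

print("Mean:", round(mean(prices), 2))
print("Median:", median(prices))
print("Skewness:", round(skewness(prices), 2))
```

A few very expensive listings pull the mean well above the median, the same feature that produces the long upper whisker in a box plot.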

27 More than One Group in a Histogram* N_F = 14243, N_M = 13083 (* Note: Not computed with Minitab)

28 Summary
  What story does the data presentation tell?
    Data in raw form tell no story.
    Visual representation of data tells something about the data.
  Data reduction and summary representation: What do we learn?
    Location
    Spread
    Shape of the distribution
  What tool is most informative?
    Reduction to a small number of features
    Visual displays of data
      Pie chart
      Box and whisker plots
      Histograms
      Time series plots
  “There are lies, damned lies and statistics.” (Benjamin Disraeli)

