Statistics - MAT 150
Chapter 2: Descriptive Statistics
Prof. Felix Apfaltrer
fapfaltrer@bmcc.cuny.edu
Office: N518  Phone: 7421  Office Hours: Tue/Thu 1:30-3:00pm
Characteristics of data
  Center: middle or average value
  Variation: measures the amount that data values vary
  Distribution: nature or shape of the distribution of the data
  Outliers: values that are very far out
  Time: changing characteristics of the data over time
Mnemonic: CVDOT ("computer viruses destroy or terminate")
Observe the data set and give intuitive examples of these characteristics!
Organizing Qualitative Data
  Qualitative data values can be organized by a frequency distribution
  A frequency distribution lists
    Each of the categories
    The frequency for each category
  Good practices in constructing bar graphs
    The horizontal scale
      The categories should be spaced equally apart
      The rectangles should have the same widths
    The vertical scale
      Should begin with 0
      Should be incremented in reasonable steps
      Should go somewhat, but not significantly, beyond the largest frequency or relative frequency
A simple data set is
  blue, blue, green, red, red, blue, red, blue
A frequency table for this qualitative data is
  Color   Frequency
  Blue    4
  Green   1
  Red     3
The most commonly occurring color is blue
The relative frequencies are the proportions (or percents) of the observations out of the total
  Color   Relative Frequency
  Blue    .500
  Green   .125
  Red     .375
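The two tables for the color data can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library; the variable names are chosen here for illustration and are not part of the slides.

```python
from collections import Counter

# The simple data set from the slide.
colors = ["blue", "blue", "green", "red", "red", "blue", "red", "blue"]

# Frequency: how many times each category occurs.
frequency = Counter(colors)

# Relative frequency: each count divided by the total number of observations.
total = len(colors)
relative_frequency = {color: count / total for color, count in frequency.items()}

# Print one row per category: name, frequency, relative frequency.
for color in sorted(frequency):
    print(f"{color:<6} {frequency[color]:>2} {relative_frequency[color]:.3f}")
```

The printed rows give the same numbers as the tables: 4, 1 and 3 for the frequencies and .500, .125 and .375 for the relative frequencies.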
Bar graphs for our simple data (using Excel)
  1) Frequency bar graph
  2) Relative frequency bar graph
An example side-by-side bar graph comparing educational attainment in 1990 versus 2003
A pie chart is a circle divided into sections, one for each category.
The area (angle) of each sector is proportional to the frequency of that category.
Pie charts are useful to show the relative proportions of each category, compared to the whole.
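As a small worked example of "angle proportional to frequency", the sketch below computes the sector angles for the color data from the earlier slide (Blue 4, Green 1, Red 3). The variable names are illustrative only.

```python
# Frequencies from the color example (Blue 4, Green 1, Red 3).
frequency = {"Blue": 4, "Green": 1, "Red": 3}
total = sum(frequency.values())

# Each sector's central angle is its share of the full 360 degrees.
sector_angle = {
    category: 360 * count / total for category, count in frequency.items()
}

for category, angle in sector_angle.items():
    print(f"{category}: {angle} degrees")
```

Blue takes half of the circle (180 degrees), Green 45 degrees and Red 135 degrees, and the three angles add up to 360.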
Data
Organizing Quantitative Data: The Popular Displays
Learning objectives
  Organize discrete data in tables
  Construct histograms of discrete data
  Organize continuous data in tables
  Construct histograms of continuous data
  Draw stem-and-leaf plots
  Draw dot plots
  Identify the shape of a distribution
Frequency distributions
  Lower class limits: 0, 100, …
  Upper class limits: 99, …
  Class boundaries: numbers used to separate classes without gaps: 99.5, 199.5, …
  Class midpoints: center of each class: 49.5, 149.5, …
  Class width: difference between two consecutive lower (or upper) class limits: 100
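The numbers on this slide can be checked with a short calculation. The sketch below uses the classes 0-99 and 100-199 implied by the slide and assumes a third class 200-299 purely for illustration.

```python
# Lower and upper class limits; the third class (200-299) is an assumption
# added here so that two boundaries can be computed.
lower_limits = [0, 100, 200]
upper_limits = [99, 199, 299]

# Class width: difference between two consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Class midpoint: halfway between the lower and upper limit of a class.
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]

# Class boundary: halfway between one class's upper limit and the
# next class's lower limit, so the classes meet without gaps.
boundaries = [
    (hi + next_lo) / 2 for hi, next_lo in zip(upper_limits, lower_limits[1:])
]

print("width:", class_width)
print("midpoints:", midpoints)
print("boundaries:", boundaries)
```

This gives a width of 100, midpoints 49.5 and 149.5 for the first two classes, and boundaries 99.5 and 199.5, matching the values listed above.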
Constructing a frequency distribution
  Decide on the number of classes n: between 5 and 20
  Class width = (highest value - lowest value)/n, rounded up to a convenient number
  Starting point: the lowest data value or a convenient smaller value
  List the lower class limits
  List the upper class limits
  Tally the data: count the data values falling in each class
Q. How do you go from one to the other?
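The steps above can be sketched in Python as follows. The data values are hypothetical (made up for this example), the data are assumed to be whole numbers, and the width is rounded up to the next whole number so that the highest value always falls inside the last class; other rounding choices are equally valid.

```python
# Hypothetical data values, made up for illustration only.
data = [12, 47, 3, 58, 25, 31, 44, 9, 19, 36, 52, 27]
num_classes = 5  # the slide suggests between 5 and 20 classes

# Class width = (highest value - lowest value) / n, rounded up to the
# next whole number (an assumption of this sketch, for integer data).
lowest, highest = min(data), max(data)
class_width = (highest - lowest) // num_classes + 1

# Starting point: here simply the lowest data value.
lower_limits = [lowest + i * class_width for i in range(num_classes)]
upper_limits = [lo + class_width - 1 for lo in lower_limits]

# Tally: count the data values falling in each class.
frequencies = [
    sum(1 for value in data if lo <= value <= hi)
    for lo, hi in zip(lower_limits, upper_limits)
]

for lo, hi, count in zip(lower_limits, upper_limits, frequencies):
    print(f"{lo}-{hi}: {count}")
```

With these made-up values the class width is 12, the classes are 3-14, 15-26, 27-38, 39-50 and 51-62, and the frequencies 3, 2, 3, 2, 2 add up to the 12 data values.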
Cumulative frequency distribution
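A cumulative frequency is the running total of the class frequencies. The sketch below shows the idea with hypothetical class limits and frequencies (they are not taken from the class data sheet).

```python
from itertools import accumulate

# Hypothetical upper class limits and frequencies, for illustration only.
upper_limits = [99, 199, 299, 399]
frequencies = [5, 8, 4, 3]

# Cumulative frequency: the count for a class plus all classes before it.
cumulative = list(accumulate(frequencies))

for hi, running_total in zip(upper_limits, cumulative):
    print(f"Less than {hi + 1}: {running_total}")
```

Here the cumulative frequencies are 5, 13, 17 and 20; the last one always equals the total number of data values.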
Visualizing data
Histogram
  A histogram is a bar graph in which the horizontal scale represents classes of data values and the vertical scale represents frequencies.
(Relative) frequency histograms, polygons and ogives
Other ways of representing data
  Dot plot: find out what this is!
  Stem-and-leaf plot
    keeps track of all your data
    only works in certain specific cases
    condensed stem-and-leaf plot
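To see how a stem-and-leaf plot keeps every data value visible, here is a minimal text-only sketch. It assumes non-negative two-digit whole numbers (one of the "specific cases" where the plot works well), and the data are made up for illustration.

```python
from collections import defaultdict

# Hypothetical two-digit data values, made up for illustration only.
data = [23, 31, 25, 47, 38, 31, 29, 44, 36, 52]

# Stem = tens digit, leaf = ones digit; sorting first keeps leaves in order.
stems = defaultdict(list)
for value in sorted(data):
    stems[value // 10].append(value % 10)

# Print every stem in the range, even one that has no leaves.
for stem in range(min(stems), max(stems) + 1):
    leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")
```

For these values the plot reads `2 | 359`, `3 | 1168`, `4 | 47`, `5 | 2`, so each original number can be read back from its stem and leaf.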
…and more ways of representing data
Which of the following are represented in the data sheet given in class?
  Pareto charts
  Pie charts
  Time charts
  Napoleon’s chart
  Scatter plot
Napoleon’s campaign chart 1812
Class sheet page 1
Organizing and Summarizing Data: Summary
  Summaries of qualitative data
    Frequency tables
    Bar graphs
  Summaries of quantitative data
    Histograms
    Pie graphs, time-series graphs, etc.
    Cumulative frequencies, ogives, etc.