# Organizing Data: Looking for Patterns and Departures from Them

Exploring Data: Introduction, Displays, Descriptions

What is Statistics? A science, and not math. An examination with clear explanations, not just number crunching. A whodunit: who, what, and why. Delving into the 411 of groups of individuals according to variables.

Definitions: Individuals are the objects described by a set of data. A variable is any characteristic of an individual. Categorical: non-numeric groups or classes (name, sex, city). Quantitative: numerical values (age, scores, mileage).

Whodunit: Who? The individuals, and how many there are. What? The number of variables, their names, and the units associated with them. Why? The reason the data were gathered and their intended purpose. Are the data meant to support or refute something?

Distribution: The distribution of a variable tells us what values the variable takes and how often it takes them.
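
As an illustration of "what values and how often", here is a minimal Python sketch that tallies a variable. The function name `distribution` and the cylinder counts are hypothetical, made up for this example; they are not the Exercise 1.1 data.

```python
from collections import Counter

def distribution(values):
    """Return (value, count) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())

# Hypothetical data: number of cylinders for eight made-up vehicles.
cylinders = [4, 6, 4, 8, 4, 6, 4, 6]

for value, count in distribution(cylinders):
    print(value, count)
```

Each printed row pairs one value the variable takes with the number of individuals that have it.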

Exercise 1.1: Fuel-Efficient Cars. A) Individuals: make and model of 1998 motor vehicles. B) Variables: vehicle type (categorical), transmission type (categorical), number of cylinders (quantitative), city mileage in mpg (quantitative), highway mileage in mpg (quantitative).

Assignment Page 7: 1.2 – 1.4

1.1 Displaying Distributions: Displays and graphs put what the written text says into a more visible form. All good displays have a title and labeled axes, and use equal intervals when appropriate.

Categorical Variables: Categorical variables are best displayed with bar graphs or pie charts. Bar graphs: quick comparisons. Pie charts (pie graphs): show parts of the whole, using percentages.
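
A pie chart needs each category's share of the whole. The sketch below shows one way to compute those percentages in Python; `category_percentages` and the transmission labels are hypothetical and serve only as an illustration.

```python
from collections import Counter

def category_percentages(labels):
    """Map each category to its percentage of all observations."""
    counts = Counter(labels)
    total = len(labels)
    return {label: 100 * n / total for label, n in counts.items()}

# Hypothetical data: transmission type for four made-up vehicles.
transmissions = ["manual", "automatic", "automatic", "automatic"]
print(category_percentages(transmissions))
```

The raw counts would be the bar heights in a bar graph, while the percentages are the slice sizes in a pie chart.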

Quantitative Variables: Quantitative variables are best displayed by dotplots and stemplots (double stemplots). Keep these features in mind. Shape: mound, skewed left, or skewed right. Center: median. Spread: smallest and largest values. Outliers: unusual features.
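
The center and spread described above can be computed directly. This is a small sketch using Python's standard `statistics` module; the helper name `summarize` and the sample values are assumptions for illustration only.

```python
from statistics import median

def summarize(values):
    """Center as the median; spread as the smallest and largest values."""
    return {"median": median(values), "min": min(values), "max": max(values)}

# Hypothetical data, not taken from the textbook.
print(summarize([3, 9, 4, 7, 5]))
```

Shape and outliers still have to be judged from a display; this summary covers only center and spread.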

Dotplots and Stemplots: Read how these are constructed. Look quickly at our choices of soft drinks in Table 1.1. Construct a dotplot and a stemplot for the caffeine content. Complete Exercises 1.5 – 1.8.
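
For checking a hand-drawn stemplot, here is a minimal text-based sketch in Python. It assumes non-negative whole numbers, with the tens digit as the stem and the ones digit as the leaf. The function name `stemplot` and the values are hypothetical; they are not the caffeine data from Table 1.1.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines for non-negative integers (stem = tens, leaf = ones)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    first, last = min(leaves), max(leaves)
    lines = []
    # Include stems with no leaves so gaps in the data stay visible.
    for stem in range(first, last + 1):
        row = "".join(str(d) for d in leaves[stem])
        lines.append(f"{stem} | {row}")
    return lines

# Hypothetical values for illustration.
for line in stemplot([23, 35, 21, 47, 35]):
    print(line)
```

Stems with no leaves are kept so that gaps, and therefore possible outliers, remain visible.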

Histograms: Use a histogram when a quantitative variable has many values and those values can be grouped together to get a clearer picture of the distribution. Steps: divide the data into classes of equal width, count the observations in each class (the count is the height of the bar), then label and scale the axes. Activity (tech): Presidential Ages at Inauguration.
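
The counting step can be sketched in Python as follows. The function `class_counts`, its parameters, and the ages are all hypothetical; the ages are made up and are not the presidential data. Each class is assumed to include its left endpoint and exclude its right endpoint.

```python
def class_counts(values, start, width, n_classes):
    """Count observations in equal-width classes [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:  # ignore values outside the classes
            counts[index] += 1
    return counts

# Hypothetical ages, grouped into six classes of width 5 starting at 40.
ages = [42, 46, 51, 54, 55, 57, 61, 64, 69]
print(class_counts(ages, start=40, width=5, n_classes=6))
```

Each count becomes the height of the bar drawn over its class.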

Activity: Getting to Know You. Due: Friday, August 27, 2010. Select a display for your data: bar graph, circle graph, line graph, histogram, stemplot, or dotplot. Be as creative as you can, whether drawing by hand or using Excel or any other type of display technology.

Activity: Getting to Know You. Write a report on the data collected, following the guidelines. Quantitative data: discuss the shape, center, and spread of your distribution, and any unusual features (outliers). What inferences can you make about your class? Categorical data: write about any observations you can draw about your class.

Frequencies and Percentiles: Sometimes it is interesting to describe the relative position of an individual within a distribution. The pth percentile is the value such that p percent of the observations fall at or below it. An ogive, or relative cumulative frequency graph, allows us to see the distribution as a whole.
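
The "at or below" idea can be checked with a short Python sketch. The function `percent_at_or_below` and the scores are hypothetical, chosen only to illustrate the definition.

```python
def percent_at_or_below(values, x):
    """Percent of observations that are less than or equal to x."""
    at_or_below = sum(1 for v in values if v <= x)
    return 100 * at_or_below / len(values)

# Hypothetical test scores for ten students.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percent_at_or_below(scores, 70))
```

Here 4 of the 10 scores are at or below 70, so a score of 70 sits at the 40th percentile of this made-up class.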

Presidents: Look at the middle of page 29. Frequency: the count. Relative frequency: the percentage of observations falling within a certain group (class). An ogive lets us compare individuals with the whole (percentiles relative to all observations).
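
The columns behind an ogive can be built from class counts, as in this Python sketch. The function name `ogive_table` and the counts are hypothetical; the real counts come from the table in the textbook.

```python
from itertools import accumulate

def ogive_table(counts):
    """Rows of (frequency, relative %, cumulative frequency, cumulative relative %)."""
    total = sum(counts)
    rows = []
    for count, cumulative in zip(counts, accumulate(counts)):
        rows.append((count, 100 * count / total,
                     cumulative, 100 * cumulative / total))
    return rows

# Hypothetical class counts for three classes.
for row in ogive_table([2, 3, 5]):
    print(row)
```

Plotting the last column against the upper end of each class gives the ogive; the final class always reaches 100 percent.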

Assignment Exercises 1.12

Time Plots: A time plot shows a variable plotted against the time at which it was measured. Time is always marked on the horizontal axis and the variable of interest on the vertical axis. Connecting the points helps us see trends.

Assignment Exercises: 1.21, 1.23, 1.25 – 1.29