Presentation on theme: "Chapter 15: Statistics Section 15.1: Formulating Statistical Questions, Gathering Data, and Using Samples."— Presentation transcript:
1 Chapter 15: Statistics
Section 15.1: Formulating Statistical Questions, Gathering Data, and Using Samples
2 Statistical Questions
Statistical questions: ones that can be answered by collecting and analyzing data (pieces of information, which can be numerical or categorical).
Ex’s:
(a) What is the height of each student in our class?
(b) How many home runs has each player in a baseball team’s starting lineup hit?
(c) Which on-campus restaurant is the favorite among UK students?
(d) What is the average weight of watermelons sold at your local grocery store?
3 Types of Statistical Studies
Observational studies: observe characteristics or quantities without influencing them.
Experiments: try to determine the factors that influence characteristics or quantities.
4 Gathering Data
The population is the full set of people or things that the study is designed to investigate.
Ex’s: (a) the students in our class, (b) the players in the starting lineup, (c) the UK student body, (d) all of the watermelons
A sample of a population consists of some collection of members of the population.
Samples need to be representative, i.e., the characteristics of the sample reflect those of the population.
Ideal representative sample: a random sample, in which every member of the population has an equal chance of being in the sample.
5 Gathering Data
Ex: Asking whichever students happen to be in the Student Center at lunchtime about their favorite on-campus restaurant does not give a random or representative sample, since students who are not there have no chance of being asked.
6 Using Samples
Ex: If a random sample of 250 students showed that 60 students said Panda Express was their favorite on-campus restaurant, about how many of UK’s 20,000 students would be expected to say that Panda Express is their favorite?
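The estimate asked for here is the sample proportion scaled up to the population. A minimal sketch of that arithmetic, assuming the sample is representative; the function name `estimate_population_count` is a hypothetical helper, not something from the slides:

```python
def estimate_population_count(sample_hits, sample_size, population_size):
    """Scale the count seen in a sample up to the whole population."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Same as (sample_hits / sample_size) * population_size, but multiplying
    # first keeps the arithmetic exact for whole-number inputs like these.
    return sample_hits * population_size / sample_size


# Numbers from the slide: 60 of 250 sampled students, 20,000 students in all.
estimate = estimate_population_count(60, 250, 20_000)
print(round(estimate))  # 4800
```

The sample proportion is 60/250 = 24%, and 24% of 20,000 is about 4,800 students.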
8 Displaying Categorical Data
Real graph: displays data using real objects arranged in graph form.
Ex: using Starburst candy or wrappers to display how many pieces there are of each color
9 Displaying Categorical Data
Pictograph: uses icons or pictures to display the data.
Ex: each small, colored rectangle represents one piece of candy of that flavor in one package of candy
10 Displaying Categorical Data
Bar graph: uses a single rectangle for each category to display the data.
Ex: the height of each bar represents the number of pieces of candy of that flavor
11 Displaying Categorical Data
Double bar graph: each category is subdivided into 2 smaller categories.
Ex: displays median weekly earnings broken down by race and further subdivided by gender
12 Displaying Categorical Data
Pie graph: uses a subdivided circle to show how data is partitioned into categories.
Ex: shows the percentage of UK students who prefer each on-campus restaurant
13 Displaying Numerical Data
Dot plot: a pictograph in which the categories are numbers or intervals and the icons are dots.
Ex: The following displays the home runs hit by the hitters in a baseball lineup, where the home runs hit were 12, 17, 25, 32, 24, 17, 12, 12, 8.
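Since the plot itself did not survive in the transcript, a rough text version can be sketched as follows. This is an illustration only: it draws one row per distinct value (rather than dots stacked above a number line), and the helper name `dot_plot_rows` is hypothetical.

```python
from collections import Counter

# Home run totals from the slide.
home_runs = [12, 17, 25, 32, 24, 17, 12, 12, 8]


def dot_plot_rows(values):
    """Return one text row per distinct value, with one 'o' per occurrence."""
    counts = Counter(values)
    return [f"{value:>3} | {'o' * counts[value]}" for value in sorted(counts)]


for row in dot_plot_rows(home_runs):
    print(row)
```

The row for 12 carries three dots and the row for 17 carries two, matching how often each total appears in the list.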
14 Displaying Numerical Data
Ex: The following displays the home runs hit by the hitters in a baseball lineup, where the home runs hit were 12, 17, 25, 32, 24, 17, 12, 12, 8.
15 Displaying Numerical Data
Histogram: a bar graph in which the categories are numbers or intervals.
Ex: the number of students earning each letter grade on an exam is displayed
16 Displaying Numerical Data
Stem and leaf plot: 2 columns in which the stem (left column) and leaf (right column) together form a data value.
2 | 0572 means the data include 20, 25, 27, and 22.
Ex: displays the exam scores from the previous histogram
[Stem and leaf plot; digits as extracted, rows not recoverable: 526669371878278958263]
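The stem and leaf idea can be sketched in a few lines, assuming non-negative whole-number data where the stem is everything but the last digit. The function name `stem_and_leaf` is hypothetical, and sorting the leaves is a choice made here (the slide's own example lists them unsorted).

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (all but the last digit) to a string of sorted leaves."""
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 27 -> stem 2, leaf 7
        leaves_by_stem[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in leaves_by_stem.items()}


# The four values from the slide's "2 | 0572" illustration.
for stem, leaves in stem_and_leaf([20, 25, 27, 22]).items():
    print(f"{stem} | {leaves}")  # 2 | 0257
```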
17 Displaying Numerical Data
Line graph: data points are plotted and adjacent points are connected by line segments; used for continuously varying data.
Ex: displays the U.S. population each decade over the 20th century
18 Displaying Numerical Data
Scatterplot: a collection of data points in a plane that shows how 2 kinds of data are related.
Ex: compares homework and exam averages for my MA 201 students from last semester
19 Reading Graphs
1. Reading the data: lift facts directly from the graph.
2. Reading between the data: use mathematical concepts and skills to compare or combine quantities and identify relationships between data.
3. Reading beyond the data: predict or infer from the data.
20 Section 15.3: The Center of Data: Mean, Median, and Mode
21 Motivating Example
Ex: If your 4th grade math class had the following scores on their adding fractions quiz, what score did the “average” student get?
7, 4, 9, 10, 7, 9, 9, 6
A single-number summary of a set of numerical data is called a measure of center. Ex’s are the mean, median, and mode.
22 The Mean
Def: The mean (or arithmetic mean or average) of a list of numbers is calculated by adding all of the numbers and dividing that sum by the length of the list (the number of numbers).
Ex 1: The mean of the quiz scores 7, 4, 9, 10, 7, 9, 9, 6 is …
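The definition translates directly into code. A minimal sketch using the quiz scores from the slide; the function name `mean` is chosen here for illustration:

```python
quiz_scores = [7, 4, 9, 10, 7, 9, 9, 6]


def mean(values):
    """Add all of the numbers and divide the sum by how many there are."""
    if not values:
        raise ValueError("mean of an empty list is undefined")
    return sum(values) / len(values)


print(sum(quiz_scores), len(quiz_scores))  # 61 8
print(mean(quiz_scores))                   # 7.625
```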
23 Visualizations of the Mean
Consider a dot plot of your numerical data as if the axis were a seesaw. The mean is the location of the fulcrum at which the seesaw is perfectly balanced.
See Activity 15K for another visual approach to finding the mean.
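One way to check the seesaw picture numerically: the total distance of the data below the mean should equal the total distance of the data above it. A small sketch with the quiz scores from the motivating example, using exact fractions so that no rounding is involved (the variable names are illustrative):

```python
from fractions import Fraction

quiz_scores = [7, 4, 9, 10, 7, 9, 9, 6]
fulcrum = Fraction(sum(quiz_scores), len(quiz_scores))  # the mean, 61/8

# Signed distance of each dot from the fulcrum.
deviations = [score - fulcrum for score in quiz_scores]

left_pull = sum(-d for d in deviations if d < 0)   # dots left of the fulcrum
right_pull = sum(d for d in deviations if d > 0)   # dots right of the fulcrum

print(left_pull, right_pull)  # 13/2 13/2
```

Both sides total 6.5, so the seesaw balances at 7.625.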
24 The Median
Def: The median of a list of numbers is the middle number of the list once it is ordered from smallest to largest. If the length of the list is even, take the average of the middle two numbers. If the length is odd, no average is needed.
Ex 2: The median of the quiz scores 7, 4, 9, 10, 7, 9, 9, 6 is …
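The even and odd cases can be sketched as follows; the function name `median` is chosen here for illustration, and Python's standard `statistics.median` follows the same rule.

```python
def median(values):
    """Middle value of the sorted list; average the middle two if the length is even."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# Sorted quiz scores: 4, 6, 7, 7, 9, 9, 9, 10 (middle two are 7 and 9).
print(median([7, 4, 9, 10, 7, 9, 9, 6]))  # 8.0
```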
26 The Mode
Def: The mode of a list of numbers is the number that occurs most frequently.
Ex 3: The mode of the quiz scores 7, 4, 9, 10, 7, 9, 9, 6 is …
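Counting occurrences gives the mode. In this sketch the hypothetical helper `modes` returns a list, because a data set can have several values tied for most frequent; that tie handling is an assumption added here, not something the slide addresses.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs the maximum number of times."""
    if not values:
        raise ValueError("mode of an empty list is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([7, 4, 9, 10, 7, 9, 9, 6]))  # [9]
```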
27 Revisiting Our Motivating Example
What score did the average or typical student get if the quiz scores were 7, 4, 9, 10, 7, 9, 9, 6?
7.625 (the mean), 8 (the median), and 9 (the mode) are all reasonable answers.
28 Categorical Data
Def: The modal category is the category listed most frequently in your categorical data.
Note that the modal category is not necessarily the “favorite” category in statistical studies like the favorite on-campus restaurant question. Study voting theory for other possibilities.
30 Definition and Types
Def: A data distribution is numerical data displayed in a dot plot or histogram.
Data distributions with a long tail that extends to the right (left) of the majority of the data are skewed to the right (left).
31 Types of Data Distributions
Data distributions that have two peaks are bimodal; those with more than two peaks are multimodal.
32 Types of Data Distributions
Data distributions that have, or nearly have, reflectional symmetry are symmetric.
33 Other Statistical Measures
Def: The Pth percentile of a set of numerical data is the number such that P% of the data is ≤ that number.
Def: The 1st quartile is the 25th percentile and the 3rd quartile is the 75th percentile.
What is the 50th percentile called? The median.
Ex 4: What are the 1st quartile, 3rd quartile, and 90th percentile of the following quiz scores?
3, 4, 5, 5, 5, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10
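With repeated values there is often no number for which exactly P% of the data is ≤ it, so a convention is needed. The sketch below assumes the nearest-rank convention (the smallest data value with at least P% of the data ≤ it). That convention is an assumption made here, not something the slide specifies, and other conventions can give different answers; for example, taking the median of the lower half of these scores gives 5.5 for the 1st quartile. The function name `percentile_nearest_rank` is hypothetical.

```python
import math

quiz_scores = [3, 4, 5, 5, 5, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10]


def percentile_nearest_rank(values, p):
    """Smallest data value such that at least p% of the data is <= it."""
    if not values or not 0 < p <= 100:
        raise ValueError("need non-empty data and 0 < p <= 100")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based position in the sorted list
    return ordered[rank - 1]


print(percentile_nearest_rank(quiz_scores, 25))  # 5  (1st quartile)
print(percentile_nearest_rank(quiz_scores, 75))  # 9  (3rd quartile)
print(percentile_nearest_rank(quiz_scores, 90))  # 10 (90th percentile)
```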
34 Another Graphical Display
Def: A box plot (box and whiskers plot) is a display of the lowest value, highest value, 1st and 3rd quartiles, and the median, as shown in the example below.
Ex: Draw a box plot for the quiz scores from Ex 4:
3, 4, 5, 5, 5, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10
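The five numbers a box plot needs can be computed before drawing it. This sketch assumes the nearest-rank convention for quartiles (the smallest data value with at least P% of the data ≤ it), which is one of several conventions in use and may not match the textbook's; the median follows the average-of-the-middle-two rule. The names `nearest_rank` and `five_number_summary` are hypothetical.

```python
import math
import statistics

quiz_scores = [3, 4, 5, 5, 5, 6, 7, 7, 7, 7, 7, 8, 8, 9, 9, 9, 9, 10, 10, 10]


def nearest_rank(ordered, p):
    """Nearest-rank percentile of an already sorted list (assumed convention)."""
    return ordered[math.ceil(p * len(ordered) / 100) - 1]


def five_number_summary(values):
    """Lowest value, 1st quartile, median, 3rd quartile, highest value."""
    ordered = sorted(values)
    return {
        "min": ordered[0],
        "q1": nearest_rank(ordered, 25),
        "median": statistics.median(ordered),
        "q3": nearest_rank(ordered, 75),
        "max": ordered[-1],
    }


print(five_number_summary(quiz_scores))
# {'min': 3, 'q1': 5, 'median': 7.0, 'q3': 9, 'max': 10}
```

Under these assumptions the box runs from 5 to 9 with the median line at 7, and the whiskers extend to 3 and 10.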