AP Statistics Overview

Presentation transcript:

AP Statistics Overview Text: Mind On Statistics, by Jessica Utts and Robert Heckard Pg. 1—Stress definition of Statistics.

What is Statistics?
Statistics is the science of learning from data.
Ex. Take a sample of 50 seniors and record the number of AP classes they are taking. Use this to make a prediction, or educated guess, about how many AP classes ALL seniors are taking.
Parameter: summary measurement (ex: p, µ) that describes the population
Statistic: summary measurement (ex: p̂, x̄) that describes the sample
Thanks to Texas A&M University at College Station, TX for giving me a wonderful opportunity to advance my teaching of Statistics. A special thanks to Dr. Jim Matis and Dr. Julie H. Carroll for their inspiration and dedication to improving the field of teaching statistics at the undergraduate level.
Ask students why they would want to learn statistics. Besides the requirement for graduation, ask them if they have ever read the sports summary statistics after games, watched analysts predict stock market movements, or watched the weather news forecast tomorrow's climatic changes, ALL of which involve statistics. Have they ever participated in a survey or experiment (e.g. phone surveys, internet surveys, medical experiments)? So what is Statistics? See if any volunteers will attempt to answer.

AP Statistics – At a Glance
Exploring Data (Chapters 1 – 4)
  Create Distributions (graph of data)
  Describe / Compare Distributions
Observational Studies and Experiments (Ch 5)
Anticipating Patterns (Chapters 6 – 9)
Statistical Inference (Chapters 10 – 15)

The key to AP Stats: THINK, SHOW, TELL
Think first! Know where you’re headed and why. It will save you a lot of work.
Show is what most people think Statistics is about. The mechanics of calculating statistics and making displays are important, but they are not the most important part of Statistics.
Tell what you’ve learned. Until you’ve explained your results so that someone else can understand your conclusions, the job is not done.
Humpty Dumpty sat on the wall, Humpty Dumpty had a great fall. All the king’s horses and all the king’s men couldn’t put Humpty Dumpty together again.
We could all make a moral for this story, such as "Stay focused" or "The higher you get, the greater the fall." Ch 1 contains 7 case studies that will be referred to continuously in the textbook. Read the moral first, then the case study. STAY FOCUSED!

WHO is being described? How many?
Individuals are the objects described by a set of data. These individuals go by different names depending on the situation.
Respondents: individuals who answer a survey.
Subjects / Participants: people on whom we experiment.
Experimental Units: animals, plants, Web sites, and other inanimate subjects on which we experiment.

WHAT are the variables? Units?
Variables: characteristics recorded about each individual
Categorical: group of category names with no order. Ex: Eye Color (brown, blue, green)
Quantitative: numerical values. Ex: Weight (117 lbs, 170 oz)
Univariate Data: one variable. Ex: Final Exam Scores
Bivariate Data: 2 paired variables. Ex: Homework % vs. Final Exam Scores
Discrete: numbers have specific values. Ex: # of desks, money
Continuous: estimated (measured) numbers. Ex: time, height, age
Keep this simple. May want to discuss Discrete vs. Continuous quantitative variables.

CHAPTER 1 Exploring Data

Summarize Categorical Data using a Bar Chart or Pie Chart
[Figure: bar chart titled "AP Scores"; horizontal axis: AP score (1 to 5); vertical axis: frequency (2 to 30)]

Dotplot for Univariate Quantitative Data

Stemplot for Quantitative Data
Ages of Death of U.S. First Ladies
Stem | Leaf
3 | 4, 6
4 | 3
5 | 2, 4, 5, 7, 8
6 | 0, 0, 1, 2, 4, 4, 4, 5, 6, 9
7 | 0, 1, 3, 4, 6, 7, 8, 8
8 | 1, 1, 2, 3, 3, 6, 7, 8, 9, 9
9 | 7
Key: 3 | 4 indicates 34 years old
Each leaf must be a single digit. Do not skip stems. List leaves in order from smallest to largest.
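The stemplot rules can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `stemplot` is made up for this example, it assumes non-negative whole-number data with the tens digit as the stem, and the ages are read back from the stemplot.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: tens digit is the stem, ones digit is the leaf."""
    leaves = defaultdict(list)
    # Sorting first puts the leaves on each stem in order, smallest to largest.
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    lines = []
    # Walk every stem from smallest to largest so empty stems are not skipped.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem} | " + ", ".join(str(d) for d in leaves[stem]))
    return lines

# Ages read back from the First Ladies stemplot.
ages = [34, 36, 43, 52, 54, 55, 57, 58, 60, 60, 61, 62, 64, 64, 64, 65, 66, 69,
        70, 71, 73, 74, 76, 77, 78, 78, 81, 81, 82, 83, 83, 86, 87, 88, 89, 89, 97]

for line in stemplot(ages):
    print(line)
```

Looping over every stem in the range, rather than only the stems that have data, is what enforces the "do not skip stems" rule.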

Split Stemplot
Age of 27 students randomly selected from Stat 303 at A&M
1 | 7
1 | 8, 9, 9, 9, 9, 9
2 | 0, 0, 0, 0, 1, 1, 1, 1, 1, 1
2 | 2, 2, 2, 3, 3
2 | 4, 5
2 |
2 | 8
3 | 0, 1
Each stem is split into five lines, one for every 2 leaf digits: (0, 1), (2, 3), (4, 5), (6, 7), and (8, 9)

Split Stemplot
Age of 27 students randomly selected from Stat 303 at A&M
1 |
1 | 7, 8, 9, 9, 9, 9, 9
2 | 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4
2 | 5, 8
3 | 0, 1
3 |
Each stem is split into two lines, one for every 5 leaf digits: (0 thru 4) and (5 thru 9)

Back-to-back Stemplot
Number of home runs in a season
           Babe Ruth |   | Roger Maris
                     | 0 | 8
                     | 1 | 3, 4, 6
                5, 2 | 2 | 3, 6, 8
                5, 4 | 3 | 3, 9
 9, 7, 6, 6, 6, 1, 1 | 4 |
             9, 4, 4 | 5 |
                   0 | 6 | 1

Cumulative Relative Frequency
Frequency: # of times something occurs
Cumulative Frequency: keep adding
Relative Frequency: percents
Cumulative Relative Frequency: add percents (AKA ogive)
See graphs on page 62
Letter Grade   Frequency   Cumulative Frequency   Relative Frequency   Cumulative Relative Frequency
A
B
C
D
F
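The four columns of the table can be filled in mechanically once the frequencies are known. The sketch below uses made-up letter-grade counts, since the slide leaves the table blank; only the column definitions come from the slide.

```python
from itertools import accumulate

# Hypothetical counts for illustration only; the slide's table has no values.
counts = {"A": 4, "B": 8, "C": 5, "D": 2, "F": 1}

total = sum(counts.values())
# Cumulative frequency: keep adding the frequencies.
cumulative = list(accumulate(counts.values()))
# Relative frequency: each count as a share of the total.
relative = [c / total for c in counts.values()]
# Cumulative relative frequency: running total as a share of the total.
cumulative_relative = [c / total for c in cumulative]

print("Grade  Freq  CumFreq  RelFreq  CumRelFreq")
for grade, freq, cum, rel, cum_rel in zip(
    counts, counts.values(), cumulative, relative, cumulative_relative
):
    print(f"{grade:<5}  {freq:>4}  {cum:>7}  {rel:>7.1%}  {cum_rel:>10.1%}")
```

The last cumulative relative frequency is always 100%, which is a quick check on the arithmetic.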

Histogram: Univariate Quantitative Data
[Figure: histogram of Age (a univariate variable); vertical axis: frequency (count)]
Classes should be equal width
Reasonable width
Reasonable starting point
Roughly 7 bars
Bars should touch
This is not a bar graph!
A histogram is used to graphically display univariate quantitative data, as the example shows. Smaller data sets can be sketched by hand with 5 to 7 equal-width intervals. (Note: In Stat 303, we will be using the computer to generate graphs.) The vertical axis represents count (frequency), or it could represent percent (relative frequency).

Histograms Discrete vs. Continuous

Location: pth Percentile
The pth percentile of a distribution (set of data) is the value such that p percent of the observations fall at or below it.
Suppose your Math SAT score is at the 80th percentile of all Math SAT scores. This means that 80% of all Math SAT scores fall at or below your score.
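The "at or below" definition translates directly into a count. The sketch below is illustrative only: `percentile_rank` is a name chosen for this example and the scores are made up.

```python
def percentile_rank(scores, value):
    """Percent of observations that fall at or below `value`."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

# Hypothetical Math SAT scores, for illustration only.
scores = [480, 510, 530, 550, 560, 590, 610, 640, 680, 720]

# 8 of the 10 scores are at or below 640, so 640 sits at the 80th percentile.
print(percentile_rank(scores, 640))
```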

5 Number Summary
Minimum, Q1, Median, Q3, Maximum
Q1 (Quartile 1) is the 25th percentile of ordered data, or the median of the lower half of ordered data
Median (Q2) is the 50th percentile of ordered data
Q3 (Quartile 3) is the 75th percentile of ordered data, or the median of the upper half of ordered data
Range = Maximum – Minimum
IQR = Interquartile Range (Q3 – Q1), the spread of the middle 50%
The percentile concept is implied here, but stress to students that other percentiles exist, such as the 56th or the 98th percentile, and what they mean. The kth percentile is the value such that k% of the ordered data values are at or below it. For example, if the median is 100, then 50% of the ordered data values fall at or below 100. Also, (100 – k)% is the share of the ordered data that falls above the percentile value.

Calculating OUTLIERS
“1.5 IQR above Q3 or below Q1”
IQR (Interquartile Range) = Q3 – Q1
Any point that falls outside the interval from Q1 – 1.5(IQR) to Q3 + 1.5(IQR) is considered an outlier. We often use this rule together with boxplots.

Calculate the 5 Number Summary and Check for Outliers
Data set 1: 121, 132, 134, 154, 164, 175, 188, 192, 201, 203, 203
Data set 2: 3, 4, 4, 5, 10, 12, 13, 24

Boxplot - Using 5 Number Summary
5# Summary of Computers: 250, 1000, 2950, 5400, 8600
[Figure: boxplot labeled min = 250, Q1 = 1000, median = 2950, Q3 = 5400, max = 8600]
Data from The Presence of Computers in American Schools, by Ronald E. Anderson and Amy Ronnkvist, Teaching, Learning, and Computing: 1998 Survey, Report #2, Center for Research on Information Technology and Organizations, The University of California, Irvine and The University of Minnesota.
Stress that 25% of the ordered data falls within each interval: min to Q1, Q1 to median, median to Q3, and Q3 to max. Although the intervals appear different in length, the amount of data within each is the same. The difference in length indicates a difference in data variation within each interval (not amount of data). Remember that the IQR contains the middle 50% of the ordered data set.

Boxplot and Modified Boxplot
Modified: shows outliers
25% of data in each section

Comparative Parallel (Side by Side) Boxplots
[Figure: side-by-side boxplots by gender, with outliers marked]
A boxplot can be used to graphically display univariate data. As in the example here, a quick comparison can be made by separating the data by a categorical variable, gender. The five-number summary values (minimum, quartile 1, median, quartile 3, and maximum) are the breaking points in a boxplot. If outliers exist, the modified boxplot plots them as separate points rather than including them in the whiskers.

Mean or Median?

Robust (Resistant) Statistic Median is resistant to extreme values (outliers) in data set. Mean is NOT robust against extreme values. Mean is pulled away from the center of the distribution toward the extreme value (“tails of graph”).

Of the 2 segments, where is the Mean with respect to the Median? Remember the mean is pulled toward extreme values.

Where’s the Mean with respect to the Median?

Describing Spread: Standard Deviation
Roughly speaking, standard deviation is the average distance values fall from the mean (center of graph).
Let the arrow mark the center, the mean. Each ring measures an average distance from the center. Stress that the rings are of equal width (standard).

Population and Sample Standard Deviation
Population: $\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{n}}$, where $\sigma^2$ is the population variance
Sample: $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$, where $s^2$ is the sample variance
Be sure to go back over what each letter stands for in both formulas. Remind students what operation is performed by the summation symbol AND that a calculator or computer software will calculate these for them. Variance is another measure of spread and is calculated by squaring the standard deviation value. Students may ask why one formula divides by n and the other by n – 1. Later in the course it will hopefully become clearer.

What is Variance?
Variance = (Standard deviation)²

Calculated Standard Deviation is a measure of Variation in data
Sample Data Set            Mean   Standard Deviation
100, 100, 100, 100, 100    100    0
90, 90, 100, 110, 110      100    10
30, 90, 100, 110, 170      100    50
90, 90, 100, 110, 320      142    99.85
The first data set contains all the same value, so the mean is obvious and hopefully the standard deviation is too: since there is no variation in the data set, the standard deviation is zero. The second data set has a mean that is simple to calculate mentally, and its standard deviation can be calculated using the formula on a side board. The third data set has values that are more spread out, so its standard deviation should be higher. The fourth data set contains an outlier, which dramatically affects the mean and standard deviation.
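The table values can be checked with Python's standard `statistics` module. This is a quick verification sketch, not part of the original slides. It uses `stdev`, the sample standard deviation (dividing by n – 1), which is the version that reproduces the values 10, 50, and 99.85 shown in the table.

```python
from statistics import mean, stdev

data_sets = [
    [100, 100, 100, 100, 100],  # no variation at all
    [90, 90, 100, 110, 110],    # small spread around 100
    [30, 90, 100, 110, 170],    # larger spread around 100
    [90, 90, 100, 110, 320],    # one outlier pulls the mean and inflates the spread
]

for xs in data_sets:
    # stdev() divides by n - 1 (sample standard deviation).
    print(xs, "mean =", mean(xs), "sd =", round(stdev(xs), 2))
```

Swapping `stdev` for `pstdev` would divide by n instead and give smaller values, which is one way to show students the difference between the two formulas.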

LET’S CUSS!

To describe a distribution: LET’S CUSS!
Center
Unusual Features
Spread
Shape

Center: Mean, Median
Unusual Features: Gaps, Outliers, Clusters
Spread: Standard Deviation, Range, IQR
Shape: Normal, Symmetric, Skewed Right (left)
CSS: Center, Spread, Shape. You need to be able to eyeball this information from a graph. Dotplot has 500 temperatures recorded at the South Pole for 379 months.
Approximately where is the center? Median –54.6 or Mean –49.4
Approximately how spread out is the data? Overall from –69 to –21, or 48 degrees of variability
Approximately what shape does the data show? Trimodal, representing the “3” seasons (short summer, short fall & spring, and long winter)
Any outliers? Potentially, but not positive without further investigation.

CENTER
Mean (µ, x̄): add up data values and divide by number of data values
Median: list data values in order, locate middle data value
Data Set: 19, 20, 20, 21, 22
Mean is 20.4. Median is 20, since it is the middle number of the ranked (ordered) data values.
The mean and median of a data set may or may NOT be values in the data set. Inform students that the mu symbol (µ) represents the population mean and x bar (x̄) represents the sample mean. Remind students that the data must be ranked prior to finding the median by hand. Ask them to refer to their textbook for step-by-step instructions.

UNUSUAL FEATURES
Clusters, Gaps, Potential Outliers

SHAPE
Skewed Right: “tail” points to right
Normal: bell-shaped
The shape can also be skewed left, symmetric, or uniform.

SPREAD The spread can be described using: Standard Deviation (about 10) or Range (80 – 150 or 70) or IQR (about 100 – 130)

Summary Features of Quantitative Variables
Center – Location
Unusual Features – Outliers, Gaps, Clusters
Spread – Variability
Shape – Distribution Pattern

How to Choose Measures of Center and Spread?
NON-SKEWED DISTRIBUTIONS: use mean and standard deviation
SKEWED DISTRIBUTIONS: use 5# Summary

Comparing Distributions
CUSS
COMPARE in CONTEXT
GENERAL CONCLUSION

Linear Transformations using the height of all LHS Seniors (in inches)
What happens to the center and spread if everyone is put in 3 inch heels (add 3 inches)?
What happens to the center and spread if we change everyone’s height to feet (divide by 12)?

Summary of Linear Transformations Multiplying each observation by a positive number b multiplies both measures of center (mean and median) and measures of spread (IQR and standard deviation) by b. Adding the same number a (either positive, negative, or zero) to each observation adds a to measures of center and to quartiles but does not change measures of spread. NOTE: The shape NEVER changes!
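Both rules can be checked numerically with the heels and inches-to-feet questions. The heights below are made up for illustration, and `summary` is a helper name chosen for this sketch. The IQR here comes from `statistics.quantiles`, whose default quartile method may differ slightly from the median-of-halves rule, though the transformation rules hold either way.

```python
from statistics import mean, median, stdev, quantiles

def summary(xs):
    """Return measures of center and spread for a list of numbers."""
    q1, _, q3 = quantiles(xs, n=4)
    return {"mean": mean(xs), "median": median(xs),
            "stdev": stdev(xs), "iqr": q3 - q1}

# Hypothetical senior heights in inches, for illustration only.
heights_in = [62, 64, 65, 67, 68, 70, 72]

in_heels = [h + 3 for h in heights_in]   # add a = 3 inches to everyone
in_feet = [h / 12 for h in heights_in]   # multiply by b = 1/12

for label, data in (("inches", heights_in), ("heels", in_heels), ("feet", in_feet)):
    print(label, {k: round(v, 3) for k, v in summary(data).items()})
```

Adding 3 shifts the mean and median by 3 and leaves the standard deviation and IQR alone. Dividing by 12 divides all four measures by 12.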