Lesson 1 - R Summary to Exploring Data

Objectives
Use a variety of graphical techniques to display a distribution. These should include bar graphs, pie charts, stemplots, histograms, ogives, time plots, and boxplots
Interpret graphical displays in terms of the shape, center, and spread of the distribution, as well as gaps and outliers
Use a variety of numerical techniques to describe a distribution. These should include mean, median, quartiles, five-number summary, interquartile range, standard deviation, range, and variance

Objectives
Interpret numerical measures in the context of the situation in which they occur
Learn to identify outliers in a data set
Explore the effects of a linear transformation of a data set

Vocabulary: none new

Statistical Plots
Stemplot
–stem and leaf from Algebra
–remember back-to-back for comparisons
Boxplot
–know how to use (will use it a lot in course)
Histogram
Dotplot
Normality Plot (will learn later)
Pie Chart
Bar Graph

Describing Distributions
Shape
–symmetric, skewed (left or right), multi-modal
Outliers
–do they exist, how many, and on which ends
Center
–appropriate measure (mean, median, or mode)
Spread
–appropriate measure (standard deviation or IQR)

Measures of Center and Spread

Measure               Resistant   When to Use   Outlier Effects
Center
  Mean                No          symmetric     Pulls toward outlier
  Median              Yes         skewed        none
  Mode                Yes         categorical   none
Spread
  Standard Deviation  No          symmetric     Increases
  IQR                 Yes         skewed        none
  Range               No          avoid         Increases

Plot your data: Dotplot, Stemplot, Histogram
Interpret what you see: Shape, Outliers, Center, Spread
Choose numerical summary: x̄ and s, or Five-Number Summary

Numerical Statistical Summaries
5 Number Summary from 1-VarStats
–Min
–Q1 (25th percentile of the dataset)
–Q2 (Median, 50th percentile of the dataset)
–Q3 (75th percentile of the dataset)
–Max
IQR = Q3 – Q1
Outliers: values
–less than Q1 – 1.5 × IQR
–more than Q3 + 1.5 × IQR
Mean and Standard Deviation from 1-VarStats
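For readers working without a calculator, the five-number summary and the 1.5 × IQR outlier rule can be sketched in Python. This is only an illustration: the function names and the `scores` data are hypothetical, and the quartiles use the median-of-halves convention (the middle value is left out of both halves when n is odd), which other software may replace with an interpolation rule that gives slightly different quartiles.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # values below the median position
    upper = data[half + (n % 2):]   # values above it (middle value skipped if n is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]

def outlier_fences(q1, q3):
    """Return (lower fence, upper fence) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical data set, made up for this illustration only.
scores = [62, 70, 72, 75, 78, 80, 81, 85, 88, 120]

minimum, q1, med, q3, maximum = five_number_summary(scores)
low_fence, high_fence = outlier_fences(q1, q3)
outliers = [x for x in scores if x < low_fence or x > high_fence]

print("Five-number summary:", (minimum, q1, med, q3, maximum))
print("Fences:", (low_fence, high_fence))
print("Outliers:", outliers)
```

Here only the value 120 lies beyond a fence, so it is the only value flagged as an outlier.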

TI-83 Help
Use Lists to keep track of data for other work
1 Var Stats (mean, standard deviation, 5 number summary)
Stat Plot (box plots, histogram, dot plot)
–ZoomStat
Comparative Plots (turn plot1 and plot2 on)

Data Analysis Toolbox
To answer a statistical question of interest:
Data: Organize and Examine (W5HW)
–Who are the individuals described?
–What are the variables?
–Why were the data gathered?
–When, Where, How, and By Whom were data gathered?
Graph: Construct an appropriate graphical display
–Comparative Graphs (boxplots, stemplots, histograms)
–Describe SOCS
Numerical Summary: Appropriate center & spread
–Calculate Mean and Standard Deviation
–Calculate 5 number summary
Interpretation: Answer question in context!

Summary and Homework
Summary
–Data Analysis is the art of describing data in context using graphs and numerical summaries.
–The purpose is to describe the most important features of a dataset (SOCS)
Homework
–pg

Problem 1
The upper or third quartile for grades on the first calculus test was 85%. Your friend, who has not taken statistics, scored 90% on the test. Explain to your friend how her grade compares to others in her class.
Answer: Since the 3rd quartile (the 75th percentile) was 85%, her grade of 90% is better than at least 75% of the class.

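Problem 2
Suppose you have test scores of 72%, 91%, 86%, and 95% in your chemistry class. What score do you need to make on the next test in order to have an 85% average?
Answer: Five tests averaging 85 need a total of 5 × 85 = 425. The four scores so far total 72 + 91 + 86 + 95 = 344, so the next score must be 425 – 344 = 81.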
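The same arithmetic can be checked with a short Python sketch. The function name `score_needed` is made up for this illustration; it multiplies the target average by the number of tests (including the one still to come) and subtracts the points already earned.

```python
def score_needed(scores, target_mean):
    """Score required on one more test to reach target_mean overall."""
    total_needed = target_mean * (len(scores) + 1)  # points needed across all tests
    return total_needed - sum(scores)               # minus points already earned

print(score_needed([72, 91, 86, 95], 85))  # prints 81
```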

Problem 3
In the computational formula for standard deviation, you sometimes use n and sometimes use (n – 1). Under what circumstances should you use n?
Answer: We use n – 1 for the sample standard deviation because one degree of freedom is lost when the sample mean is used to estimate the population mean. If we have the entire population (a census), the mean we compute is the population mean, so we divide by n when calculating the standard deviation.
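Python's standard `statistics` module offers both divisors, which makes the difference easy to see: `pstdev` divides by n (population) and `stdev` divides by n – 1 (sample). The data below are hypothetical values chosen only so the population result comes out to a round number.

```python
from statistics import pstdev, stdev

# Hypothetical data: mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = pstdev(data)  # divides by n:     sqrt(32 / 8) = 2.0
sample_sd = stdev(data)       # divides by n - 1: sqrt(32 / 7), about 2.14

print("Divide by n:    ", population_sd)
print("Divide by n - 1:", round(sample_sd, 2))
```

The sample version is always a little larger for the same data, and the gap shrinks as n grows.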

Problem 4
(a) We studied two measures of central tendency, mean and median. Which of these is the more resistant measure? _________________ Explain why this measure is more resistant.
(b) We studied three measures of spread: standard deviation, interquartile range, and range. Which of these is the most resistant measure? ________________
Answer: (a) median; (b) IQR. Both are the most resistant because they are least affected by outliers.
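A small experiment shows what resistance means for the spread measures in part (b). The sketch below uses two hypothetical data sets that differ only in their largest value, and takes quartiles from `statistics.quantiles` with its default method, which is an assumption and may not match the calculator's quartile rule on other data.

```python
from statistics import quantiles, stdev

def spread_measures(values):
    """Return standard deviation, IQR, and range for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default "exclusive" method
    return {
        "stdev": stdev(values),
        "iqr": q3 - q1,
        "range": max(values) - min(values),
    }

# Hypothetical data: identical except that the largest value becomes an outlier.
typical = [10, 12, 13, 14, 15, 16, 18]
with_outlier = [10, 12, 13, 14, 15, 16, 60]

print(spread_measures(typical))
print(spread_measures(with_outlier))
```

The IQR is the same for both lists, while the range and the standard deviation both grow sharply once the outlier is present.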

Problem 5
In an experiment designed to determine the effect of a drug on reaction time, a subject is asked to press a button whenever a light flashes. The reaction times (in milliseconds) for ten trials are:
(a) Make a stem and leaf plot to display this information. Be sure to include unit information (a legend).
(b) What information about the distribution does the stem and leaf plot provide? Be thorough in your response.
[Stemplot: Reaction Time, stems 9 through 13; the last row reads 13 | 8, i.e. 138 milliseconds]
Answer: The distribution is skewed right, with median = 99.5 and IQR = 12; 138 is an outlier.

Problem 6
Data were collected on a sample of Deerfield Academy students. Several of the variables are listed below. Next to each variable, put all of the following words that correctly describe the variable: categorical, quantitative, discrete, continuous
(a) Advisor: categorical
(b) Height: quantitative, continuous
(c) Number of courses student is taking this term: quantitative, discrete

Problem 7
A teacher returned the first test to the five students in a small class. She reported that the median score was 85 and the mean score was 84. The student with the lowest score (62) realized that the teacher had incorrectly calculated her grade and that the correct grade was 72. Assuming that this is still the lowest score for the seminar students, when the teacher recomputes the summary statistics, the median will equal _____________ and the mean will equal ________________.
Answer: The median is still 85; it doesn't change because the order of the scores is unaffected by the rescoring. The mean becomes 84 + 2 = 86; the 10 additional points are spread over 5 students (10 / 5 = 2), so 2 points are added to the mean.
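The problem does not list the five scores, so the sketch below invents a set that matches the stated facts (lowest score 62, median 85, mean 84). The other four values are hypothetical; any set with the same median and total would behave the same way.

```python
from statistics import mean, median

# Hypothetical class scores consistent with the problem: lowest 62, median 85, mean 84.
reported = [62, 80, 85, 95, 98]
corrected = [72, 80, 85, 95, 98]  # lowest score regraded from 62 to 72

print("Before:", median(reported), mean(reported))
print("After: ", median(corrected), mean(corrected))
```

The median stays at 85 because the regraded score is still the lowest, while the mean rises by 10 / 5 = 2 points.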

Problem 8
The histogram below displays weight increases (in pounds) for a sample of pigs fed a certain diet. Assume that bars include right endpoints.
[Histogram of weight increases (in pounds)]
(a) How many pigs were in this sample? ___________
(b) Estimate the median weight increase for the pigs in this sample. __________
(c) What proportion of these pigs had a weight increase exceeding 20 pounds? _________________
(d) Briefly (but completely) describe the shape of this distribution.
Answer:
(a) 23 (the sum of the bar heights)
(b) the 12th ranked value of the 23, which falls in the interval … – … lb
(c) 5/23 = 21.74%
(d) unimodal, skewed right

Problem 9
As I drove through Connecticut several weeks ago, I obtained a sample of prices for a gallon of unleaded gasoline at service stations I passed. Four of these are provided here: $3.09, $3.15, $3.19, $3.29. Use the definition and show work below to find the mean and standard deviation of these prices. Round answers to the nearest cent.
(a) Mean
Mean = 1/n ∑xᵢ = ¼ × (3.09 + 3.15 + 3.19 + 3.29) = ¼ × 12.72 = 3.18
(b) Standard deviation
Var = 1/(n–1) ∑(xᵢ – mean)²
    = ⅓ × [(3.09 – 3.18)² + (3.15 – 3.18)² + (3.19 – 3.18)² + (3.29 – 3.18)²]
    = ⅓ × [(–.09)² + (–.03)² + (.01)² + (.11)²]
    = ⅓ × .0212 ≈ .00707
Std dev = √Var = √.00707 ≈ .084, which is $0.08 to the nearest cent
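The hand calculation can be cross-checked in Python using the four prices from the problem. The sketch writes out the definition step by step and then compares the result with `statistics.stdev`, which also divides by n – 1. The variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import stdev

prices = [3.09, 3.15, 3.19, 3.29]  # the four gasoline prices from the problem
n = len(prices)

mean_price = sum(prices) / n                             # (1/n) * sum of x_i
squared_devs = [(x - mean_price) ** 2 for x in prices]   # (x_i - mean)^2 for each price
variance = sum(squared_devs) / (n - 1)                   # sample variance, divide by n - 1
std_dev = sqrt(variance)

print("Mean:    ", round(mean_price, 2))
print("Variance:", round(variance, 5))
print("Std dev: ", round(std_dev, 2))
print("Library: ", round(stdev(prices), 2))
```

Rounded to the nearest cent, this gives a mean of $3.18 and a standard deviation of $0.08.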

Problem 10
The Los Angeles Times reported interest rates for savings accounts at a sample of California banks. Summary statistics are provided below:
Minimum = 3.15%, Q1 = 3.25%, Median = 3.31%, Q3 = 3.33%, Maximum = 4.35%
Determine whether the data set has any outliers (check for extremely low and high values). Show work and provide an explanation to support your answer.
Answer:
IQR = Q3 – Q1 = 3.33 – 3.25 = 0.08%
LF = Q1 – 1.5 × IQR = 3.25 – 1.5 × 0.08 = 3.13%
UF = Q3 + 1.5 × IQR = 3.33 + 1.5 × 0.08 = 3.45%
Since the minimum (3.15%) is above LF, there are no low outliers. Since the maximum (4.35%) is greater than UF, the data set has at least one high outlier.
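With only the five summary statistics available, the fence check reduces to comparing the minimum and the maximum against the two fences. A brief Python sketch of that comparison follows, with a helper name chosen for illustration. It can only show that at least one outlier exists on a side, not how many.

```python
def fences(q1, q3):
    """Lower and upper outlier fences from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Summary statistics from the problem (interest rates in percent).
minimum, q1, med, q3, maximum = 3.15, 3.25, 3.31, 3.33, 4.35

lower_fence, upper_fence = fences(q1, q3)
has_low_outlier = minimum < lower_fence    # is the smallest value below LF?
has_high_outlier = maximum > upper_fence   # is the largest value above UF?

print("LF =", round(lower_fence, 2), " UF =", round(upper_fence, 2))
print("Low outlier present: ", has_low_outlier)
print("High outlier present:", has_high_outlier)
```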