Describing Data: One Quantitative Variable

Slides:



Advertisements
Similar presentations
Lesson Describing Distributions with Numbers parts from Mr. Molesky’s Statmonkey website.
Advertisements

Exploratory Data Analysis I
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 2 Exploring Data with Graphs and Numerical Summaries Section 2.2 Graphical Summaries.
Describing Data: One Variable
Statistics: Unlocking the Power of Data Lock 5 STAT 101 Dr. Kari Lock Morgan 9/6/12 Describing Data: One Variable SECTIONS 2.1, 2.2, 2.3, 2.4 One categorical.
Chapter 1 & 3.
Stat 512 Day 4: Quantitative Data. Last Time p-values and statistical significance  What p-values tell us (and do not tell us) For now, approximating.
Source: Behavioral Risk Factor Surveillance System, CDC. Obesity Trends* Among U.S. Adults BRFSS, 1985 No Data
Introductory Statistics: Exploring the World through Data, 1e
CHAPTER 3: The Normal Distributions Lecture PowerPoint Slides The Basic Practice of Statistics 6 th Edition Moore / Notz / Fligner.
CHAPTER 1: Picturing Distributions with Graphs
Descriptive Statistics  Summarizing, Simplifying  Useful for comprehending data, and thus making meaningful interpretations, particularly in medium to.
MAT 1000 Mathematics in Today's World. Last Time 1.Three keys to summarize a collection of data: shape, center, spread. 2.The distribution of a data set:
Agresti/Franklin Statistics, 1 of 63 Chapter 2 Exploring Data with Graphs and Numerical Summaries Learn …. The Different Types of Data The Use of Graphs.
AP Statistics Chapters 0 & 1 Review. Variables fall into two main categories: A categorical, or qualitative, variable places an individual into one of.
Chapter 1 Descriptive Analysis. Statistics – Making sense out of data. Gives verifiable evidence to support the answer to a question. 4 Major Parts 1.Collecting.
Descriptive Statistics  Summarizing, Simplifying  Useful for comprehending data, and thus making meaningful interpretations, particularly in medium to.
 Multiple choice questions…grab handout!. Data Analysis: Displaying Quantitative Data.
STAT 250 Dr. Kari Lock Morgan
Confidence Intervals I 2/1/12 Correlation (continued) Population parameter versus sample statistic Uncertainty in estimates Sampling distribution Confidence.
Chapter 3 Statistics for Describing, Exploring, and Comparing Data
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Created by Tom Wegleitner, Centreville, Virginia Section 3-1 Review and.
DAY 3 14 Jan Today is A.January 14, 2014 B.January 13, 2013.
Estimation: Sampling Distribution
Review of Chapters 1- 5 We review some important themes from the first 5 chapters 1.Introduction Statistics- Set of methods for collecting/analyzing data.
Stat 1510: Statistical Thinking and Concepts 1 Density Curves and Normal Distribution.
1 Review Descriptive Statistics –Qualitative (Graphical) –Quantitative (Graphical) –Summation Notation –Qualitative (Numerical) Central Measures (mean,
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. Turning Data Into Information Chapter 2.
CHAPTER 3: The Normal Distributions ESSENTIAL STATISTICS Second Edition David S. Moore, William I. Notz, and Michael A. Fligner Lecture Presentation.
Chapter 12, Part 2 STA 291 Summer I Mean and Standard Deviation The five-number summary is not the most common way to describe a distribution numerically.
Density Curves and the Normal Distribution.
1 Chapter 3 Looking at Data: Distributions Introduction 3.1 Displaying Distributions with Graphs Chapter Three Looking At Data: Distributions.
1 Review Sections Descriptive Statistics –Qualitative (Graphical) –Quantitative (Graphical) –Summation Notation –Qualitative (Numerical) Central.
Categorical vs. Quantitative…
Dr. Serhat Eren 1 CHAPTER 6 NUMERICAL DESCRIPTORS OF DATA.
Dr. Serhat Eren Other Uses for Bar Charts Bar charts are used to display data for different categories where the data are some kind of quantitative.
To be given to you next time: Short Project, What do students drive? AP Problems.
Agresti/Franklin Statistics, 1 of 63 Chapter 2 Exploring Data with Graphs and Numerical Summaries Learn …. The Different Types of Data The Use of Graphs.
Lecture PowerPoint Slides Basic Practice of Statistics 7 th Edition.
1 Review Sections 2.1, 2.2, 1.3, 1.4, 1.5, 1.6 in text.
Organizing Data AP Stats Chapter 1. Organizing Data Categorical Categorical Dotplot (also used for quantitative) Dotplot (also used for quantitative)
Statistics: Unlocking the Power of Data Lock 5 STAT 250 Dr. Kari Lock Morgan Describing Data: One Quantitative Variable SECTIONS 2.2, 2.3 One quantitative.
Describing Data: Two Variables
Notes Unit 1 Chapters 2-5 Univariate Data. Statistics is the science of data. A set of data includes information about individuals. This information is.
Chapter 13 Sampling distributions
Bell Ringer You will need a new bell ringer sheet – write your answers in the Monday box. 3. Airport administrators take a sample of airline baggage and.
Synthesis and Review 2/20/12 Hypothesis Tests: the big picture Randomization distributions Connecting intervals and tests Review of major topics Open Q+A.
STAT 101: Day 5 Descriptive Statistics II 1/30/12 One Quantitative Variable (continued) Quantitative with a Categorical Variable Two Quantitative Variables.
©2011 Brooks/Cole, Cengage Learning Elementary Statistics: Looking at the Big Picture 1 Lecture 7: Chapter 4, Section 3 Quantitative Variables (Summaries,
1 Take a challenge with time; never let time idles away aimlessly.
1 By maintaining a good heart at every moment, every day is a good day. If we always have good thoughts, then any time, any thing or any location is auspicious.
Describing Data: Two Variables
Math 201: Chapter 2 Sections 3,4,5,6,7,9.
Chapter 1: Exploring Data
Statistics 200 Lecture #4 Thursday, September 1, 2016
CHAPTER 2 Modeling Distributions of Data
CHAPTER 1: Picturing Distributions with Graphs
Topic 5: Exploring Quantitative data
One Quantitative Variable: Measures of Spread
CHAPTER 1 Exploring Data
Means & Medians.
CHAPTER 1 Exploring Data
CHAPTER 1 Exploring Data
Welcome!.
Chapter 1: Exploring Data
CHAPTER 1 Exploring Data
Sampling Distribution Models
CHAPTER 1 Exploring Data
Lesson Plan Day 1 Lesson Plan Day 2 Lesson Plan Day 3
CHAPTER 1 Exploring Data
Presentation transcript:

Describing Data: One Quantitative Variable STAT 250 Dr. Kari Lock Morgan Describing Data: One Quantitative Variable SECTIONS 2.2, 2.3 One quantitative variable (2.2, 2.3)

Descriptive Statistics The Big Picture Population Sampling Sample Statistical Inference Descriptive Statistics

Descriptive Statistics In order to make sense of data, we need ways to summarize and visualize it Summarizing and visualizing variables and relationships between two variables is often known as descriptive statistics (also known as exploratory data analysis) Type of summary statistics and visualization methods depend on the type of variable(s) being analyzed (categorical or quantitative) Today: One quantitative variable

Obesity Trends* Among U.S. Adults BRFSS, 1990, 2000, 2010 (*BMI 30, or about 30 lbs. overweight for 5’4” person) 1990 2000 2010 No Data <10% 10%–14% 15%–19% 20%–24% 25%–29% ≥30%

Obesity in America Obesity is a HUGE problem in America We’ll explore this with two different types of data, both collected by the CDC: Proportion of adults who are obese in each state BMI for a random sample of Americans

Behavioral Risk Factor Surveillance System http://www.cdc.gov/obesity/data/table-adults.html

Obesity by State

Dotplot In a dotplot, each case is represented by a dot and dots are stacked. Easy way to see each case Minitab: Graph -> Dotplot -> One Y -> Simple

Histogram The height of the each bar corresponds to the number of cases within that range of the variable 5 states with obesity rate between 33.25 and 33.75 Minitab: Graph -> Histogram -> Simple

Shape Long right tail Symmetric Right-Skewed Left-Skewed

National Health and Nutrition Examination Survey

BMI of Americans

BMI of Americans The distribution of BMI for American adults is Symmetric Left-skewed Right-skewed

Notation The sample size, the number of cases in the sample, is denoted by n We often let x or y stand for any variable, and x1 , x2 , …, xn represent the n values of the variable x x1 = 32.4, x2 = 28.4, x3 = 26.8, …

Mean The mean or average of the data values is 𝑚𝑒𝑎𝑛= 𝑠𝑢𝑚 𝑜𝑓 𝑎𝑙𝑙 𝑑𝑎𝑡𝑎 𝑣𝑎𝑙𝑢𝑒𝑠 𝑛𝑢𝑚𝑏𝑒𝑟 𝑜𝑓 𝑑𝑎𝑡𝑎 𝑣𝑎𝑙𝑢𝑒𝑠 𝑚𝑒𝑎𝑛= 𝑥 1 + 𝑥 2 +…+ 𝑥 𝑛 𝑛 = 𝑥 𝑛 Sample mean: 𝑥 Population mean:  (“mu”) Minitab: Stat -> Basic Statistics -> Display Descriptive Statistics

Mean The average BMI for Americans in this sample is 𝑥 = 24.887. The average obesity rate across the 50 states is µ = 28.606. The average BMI for Americans in this sample is 𝑥 = 24.887.

Median The median, m, is the middle value when the data are ordered. If there are an even number of values, the median is the average of the two middle values. The median splits the data in half. Minitab: Stat -> Basic Statistics -> Display Descriptive Statistics

Measures of Center For symmetric distributions, the mean and the median will be about the same For skewed distributions, the mean will be more pulled towards the direction of skewness

Mean is “pulled” in the direction of skewness Measures of Center m = 24.163 Mean is “pulled” in the direction of skewness  =24.887

Skewness and Center A distribution is left-skewed. Which measure of center would you expect to be higher? Mean Median

Outlier An outlier is an observed value that is notably distinct from the other values in a dataset.

Outliers More info here

Resistance A statistic is resistant if it is relatively unaffected by extreme values. The median is resistant while the mean is not. Mean Median With Outlier 105.22 101.0 Without Outlier 102.56 100.5

Outliers When using statistics that are not resistant to outliers, stop and think about whether the outlier is a mistake If not, you have to decide whether the outlier is part of your population of interest or not Usually, for outliers that are not a mistake, it’s best to run the analysis twice, once with the outlier(s) and once without, to see how much the outlier(s) are affecting the results

Standard Deviation The standard deviation for a quantitative variable measures the spread of the data 𝑠= 𝑥− 𝑥 2 𝑛−1 Sample standard deviation: s Population standard deviation:  (“sigma”) Minitab: Stat -> Basic Statistics -> Display Descriptive Statistics

Standard Deviation The standard deviation gives a rough estimate of the typical distance of a data values from the mean The larger the standard deviation, the more variability there is in the data and the more spread out the data are

Standard Deviation Both of these distributions are bell-shaped

95% Rule If a distribution of data is approximately symmetric and bell-shaped, about 95% of the data should fall within two standard deviations of the mean. For a population, 95% of the data will be between µ – 2 and µ + 2 For a sample, 95% of the data will be between 𝑥 −2𝑠 and 𝑥 −2𝑠

The 95% Rule

95% Rule Give an interval that will likely contain 95% of obesity rates of states.

95% Rule Could we use the same method to get an interval that will contain 95% of BMIs of American adults? Yes No

The 95% Rule The normal distribution app on statkey is a good way to demonstrate the 95% rule – ask students to pick any mean and sd, and then have them guess what the bounds for the middle 95% (can get by clicking on two tail) will be StatKey

The 95% Rule The standard deviation for hours of sleep per night is closest to ½ 1 2 4 I have no idea

To Do Read Sections 2.2 and 2.3 Do Homework 2.2 (due Friday, 2/6)