Warm up The following graphs show foot sizes of gongshowhockey.com users. What shape are the distributions? Calculate the mean, median and mode for one.



Measures of Spread Chapter 3.3 – Tools for Analyzing Data I can: calculate and interpret measures of spread MSIP/Home Learning: p. 168 #2b, 3b, 4, 6, 7, 10

What is spread?
Spread tells you how widely the data are dispersed. The histograms have identical mean and median, but the spread is different.

Why worry about spread?
Spread indicates how closely the values cluster around the middle value. Less spread means you have greater confidence that values will fall within a particular range.

Vocabulary
Spread and dispersion refer to the same thing.
1) Range = max – min
A quartile is one of three numerical values that divide a group of numbers into 4 equal parts.
2) The Interquartile Range (IQR) is the difference between the third and first quartiles: IQR = Q3 – Q1

Quartiles Example
range = 55 – 26 = 29
Q2 = 41 (median)
Q1 = 36 (median of lower half of data)
Q3 = 46 (median of upper half of data)
IQR = Q3 – Q1 = 46 – 36 = 10 (contains 50% of data)
If a quartile occurs between 2 values, it is calculated as the average of the two values.

Quartiles Example
range = 55 – 26 = 29
Q2 = 40.5 (median)
Q1 = 36 (median of lower half of data)
Q3 = 46 (median of upper half of data)
IQR = Q3 – Q1 = 46 – 36 = 10 (contains 50% of data)
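
The median-of-halves rule used in these quartile examples can be sketched in Python. The data set below is hypothetical (the values behind the slides are not in the transcript) and was chosen only so that it reproduces range = 29, Q1 = 36, Q2 = 41 and Q3 = 46. The function name `quartiles` is an illustrative assumption, and the sketch assumes the median is left out of both halves when the number of values is odd.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule.

    Assumption: for an odd number of values, the median itself
    is excluded from both the lower and the upper half.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)

# Hypothetical data, not taken from the slides.
sample = [26, 30, 36, 38, 40, 41, 43, 44, 46, 50, 55]

q1, q2, q3 = quartiles(sample)
data_range = max(sample) - min(sample)
iqr = q3 - q1
print(data_range, q1, q2, q3, iqr)  # 29 36 41 46 10
```

Software packages use several different quartile conventions, so a spreadsheet or calculator may report slightly different values for Q1 and Q3 on the same data.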

A More Useful Measure of Spread
Range is a very basic measure of spread. Interquartile range is a somewhat useful measure of spread. Standard deviation is more useful. To calculate it, we need to find the mean and the deviation for each data point. The mean is easy, as we have done that before. Deviation is the difference between a particular point and the mean.

Deviation
The mean of these numbers is 48.
Deviation = (data) – (mean)
The deviation for 24 is 24 – 48 = -24.
The deviation for 84 is 84 – 48 = 36.

Standard Deviation
Deviation is the signed distance from the piece of data you are examining to the mean. Variance is a measure of spread found by averaging the squares of the deviations calculated for each piece of data. Taking the square root of the variance gives the standard deviation. Standard deviation is a very important and useful measure of spread.

Example of Standard Deviation
mean = (26 + 28 + 34 + 36) / 4 = 31
σ² = [(26 – 31)² + (28 – 31)² + (34 – 31)² + (36 – 31)²] / 4
σ² = (25 + 9 + 9 + 25) / 4
σ² = 17
σ = √17 ≈ 4.1
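
The same calculation can be checked step by step in Python. This is a minimal sketch using the four values from the example; the variable names are illustrative, and `statistics.pstdev` is used only as a cross-check on the hand-built result.

```python
from math import sqrt
from statistics import pstdev

data = [26, 28, 34, 36]

mean = sum(data) / len(data)                             # 31.0
deviations = [x - mean for x in data]                    # data minus mean
variance = sum(d ** 2 for d in deviations) / len(data)   # average squared deviation
std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 1))  # 31.0 17.0 4.1

# Cross-check against the standard library's population standard deviation.
assert abs(std_dev - pstdev(data)) < 1e-12
```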

Measure of Spread - Recap
Measures of spread are numbers indicating how spread out / consistent data is. Smaller measure of spread = more consistent data.
1) Range = (max) – (min)
2) Interquartile Range: IQR = Q3 – Q1, where Q1 = first half median and Q3 = second half median
3) Standard Deviation
- Find the mean (average)
- Find the deviations (data – mean)
- Square all and average them; this is the variance (#4), or σ²
- Take the square root to get the std. dev., σ

Standard Deviation
σ² (lower case sigma squared) is used to represent variance. σ is used to represent standard deviation. σ is commonly used to measure the spread of data, with larger values of σ indicating greater spread. We are using a population standard deviation.
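
To see what "population" means in practice, the sketch below compares the population standard deviation (squared deviations divided by n) with the sample version (divided by n – 1) on the values 26, 28, 34, 36. The sample version is not defined in these slides; it is shown here only as an assumption about what the second calculator mode (σn-1) computes.

```python
from statistics import pstdev, stdev

data = [26, 28, 34, 36]

population_sd = pstdev(data)  # divides the squared deviations by n
sample_sd = stdev(data)       # divides by n - 1 instead

print(round(population_sd, 2), round(sample_sd, 2))  # 4.12 4.76
```

The sample value is always the larger of the two, and the gap shrinks as the number of data points grows.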

Standard Deviation with Grouped Data
Hours of TV | 2 | 3 | 4 | 5
Frequency   | 2 | 6 | 6 | 2
grouped mean = (2×2 + 3×6 + 4×6 + 5×2) / 16 = 3.5
deviations:
- 2: 2 – 3.5 = -1.5
- 3: 3 – 3.5 = -0.5
- 4: 4 – 3.5 = 0.5
- 5: 5 – 3.5 = 1.5
σ² = [2(-1.5)² + 6(-0.5)² + 6(0.5)² + 2(1.5)²] / 16
σ² = 12 / 16 = 0.75
σ = √0.75 ≈ 0.9
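
The grouped calculation weights each squared deviation by its frequency. A minimal sketch with the Hours of TV table follows; the variable names are illustrative.

```python
from math import sqrt

hours = [2, 3, 4, 5]   # values
freq = [2, 6, 6, 2]    # how many times each value occurs

n = sum(freq)                                              # 16 data points
mean = sum(h * f for h, f in zip(hours, freq)) / n         # frequency-weighted mean
variance = sum(f * (h - mean) ** 2
               for h, f in zip(hours, freq)) / n           # weighted squared deviations
std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 1))  # 3.5 0.75 0.9
```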

MSIP / Homework
- read through the examples on pages
- complete p. 168 #2b, 3b, 4, 6, 7, 10
- you are responsible for knowing how to do simple examples by hand (~6 pieces of data)
- we will use technology (Fathom/Excel) to calculate larger examples
- have a look at your calculator and see if you have this feature (Σσn and Σσn-1)

Normal Distribution Chapter 3.4 – Tools for Analyzing Data Learning goal: Determine the % of data within intervals of a Normal Distribution MSIP / Home Learning: p. 176 #1, 3b, 6, 8-10

Histograms Histograms may be skewed... Right-skewed Left-skewed

Histograms... or symmetrical

Normal?
A normal distribution creates a histogram that is symmetrical and has a bell shape, and is used quite a bit in statistical analyses. It is also called a Gaussian distribution. It is symmetrical, with equal mean, median and mode that fall on the line of symmetry of the curve.

A Real Example
The heights of 600 randomly chosen Canadian students from the “Census at School” data set. The data approximate a normal distribution.

The 68-95-99.7% Rule
- the area under the curve is 1 (i.e. it represents 100% of the population surveyed)
- approx. 68% of the data falls within 1 standard deviation of the mean
- approx. 95% of the data falls within 2 standard deviations of the mean
- approx. 99.7% of the data falls within 3 standard deviations of the mean
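
These percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+); the helper name `share_within` is an illustrative assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sd: {100 * share_within(k):.1f}%")
```

The computed shares are roughly 68.3%, 95.4% and 99.7%, which the rule rounds to 68%, 95% and 99.7%.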

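Distribution of Data
[Figure: normal curve marked at x̄ – 3σ, x̄ – 2σ, x̄ – 1σ, x̄, x̄ + 1σ, x̄ + 2σ and x̄ + 3σ. Moving out from the mean, the successive bands on each side hold 34%, 13.5%, 2.35% and 0.15% of the data; 68%, 95% and 99.7% of the data lie within 1, 2 and 3 standard deviations of the mean.]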

Normal Distribution Notation
X ~ N(x̄, σ²)
The notation above is used to describe the Normal distribution, where x̄ is the mean and σ² is the variance (square of the standard deviation). E.g. X ~ N(70, 8²) describes a Normal distribution with mean 70 and standard deviation 8 (our class at midterm?).

An example
Suppose the time before burnout for an LED averages 120 months with a standard deviation of 10 months and is approximately Normally distributed. What is the length of time a user might expect an LED to last with 68% confidence? With 95% confidence? So X ~ N(120, 10²).

An example cont’d
68% of the data will be within 1 standard deviation of the mean. This means that 68% of the bulbs will last between 120 – 10 months and 120 + 10 months. So 68% of the bulbs will last 110 to 130 months.
95% of the data will be within 2 standard deviations of the mean. This means that 95% of the bulbs will last between 120 – 2×10 months and 120 + 2×10 months. So 95% of the bulbs will last 100 to 140 months.

Example continued…
Suppose you wanted to know how long 99.7% of the bulbs will last. This is the area covering 3 standard deviations on either side of the mean. This means that 99.7% of the bulbs will last between 120 – 3×10 months and 120 + 3×10 months. So 99.7% of the bulbs will last 90 to 150 months. This assumes that all the bulbs are produced to the same standard.
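
The three LED intervals follow one pattern, mean ± k standard deviations, which a short sketch makes explicit. The variable names are illustrative; the mean of 120 months and standard deviation of 10 months come from the example.

```python
mean_months = 120
sd_months = 10

# Number of standard deviations paired with the share of data it covers.
coverage = {1: "68%", 2: "95%", 3: "99.7%"}

intervals = {}
for k, label in coverage.items():
    low = mean_months - k * sd_months
    high = mean_months + k * sd_months
    intervals[label] = (low, high)
    print(f"{label} of bulbs last {low} to {high} months")
```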

Example continued…
[Figure: normal curve of LED lifetime in months, with bands labelled 34%, 13.5% and 2.35% and spans labelled 95% and 99.7%.]

Percentage of data between two values
The area under any normal curve is 1. The percent of data that lies between two values in a normal distribution is equivalent to the area under the normal curve between these values. See examples 2 and 3 on page 175.
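
As a rough illustration of area between two values, the sketch below uses the distribution X ~ N(70, 8²) from the notation slide. The bounds 62 and 86 are hypothetical, chosen to sit 1 standard deviation below and 2 standard deviations above the mean, and `share_between` is an illustrative helper name. It is not a reproduction of the textbook examples.

```python
from statistics import NormalDist

marks = NormalDist(mu=70, sigma=8)  # X ~ N(70, 8²)

def share_between(dist, low, high):
    """Area under the normal curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)

# Hypothetical bounds: 62 = mean - 1 sd, 86 = mean + 2 sd.
p = share_between(marks, 62, 86)
print(f"{100 * p:.1f}% of values fall between 62 and 86")
```

Adding the bands from the rule gives 34% + 34% + 13.5% = 81.5%, close to the computed value of about 81.9%.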

Why is the Normal distribution so important?
Many psychological and educational variables are distributed approximately normally: height, reading ability, memory, IQ, etc. Normal distributions are statistically easy to work with: all kinds of statistical tests are based on them (Lane, 2003).

Exercises
Complete p. 176 #1, 3b, 6, 8-10

References
Lane, D. (2003). What's so important about the normal distribution? Retrieved October 5, 2004 from bution.html
Wikipedia (2004). Online Encyclopedia. Retrieved September 1, 2004 from