Data and Variation.

Ways to Represent Data
There are quite a few! Let’s look at a few that we have already seen, along with some from previous years.

Pie Charts

Raw Data
Here are all the first quiz scores for the 200 students enrolled in Algebra I. How’d they do?

Put them in order. How’d they do?

Stem-and-Leaf Plot
How’d they do?

Frequency Histogram
How’d they do?

Same data, different histogram
How’d they do?

Measures of Central Tendency
What is the “average” versus the average? Average can mean different things!
MEAN: the sum of all the data values divided by the number of values
MEDIAN: the data point in the middle when a data set is ordered from lowest to highest (with an even number of values, the mean of the two middle points)
MODE: the most frequently occurring data value(s)
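A minimal sketch of the three measures, using Python's standard `statistics` module. The scores are invented for illustration and do not come from the slides.

```python
from statistics import mean, median, multimode

# Hypothetical quiz scores, invented for illustration (not from the slides).
scores = [70, 85, 85, 90, 100]

average = mean(scores)            # sum of the values divided by how many there are
middle = median(scores)           # middle value once the data are sorted
most_common = multimode(scores)   # list of the most frequent value(s)

print(average, middle, most_common)
```

`multimode` returns a list because a data set can have more than one mode.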

Each one can be used in any situation, but it can be misleading or fail to give an accurate picture of the entire data set. Which one would you use:
If you want to find the average price to fill your tank with gas?
If you want to find the average salary of graduates of your school?
If you want to find the average number of pets in a family?
If you want to find the average test score?

Variation
2000 Batting Averages: highest was 0.372.
1920 Batting Averages: highest was over 0.400, and 2 players were in the 0.380s.

What do you see?
2000 Batting Averages: not much variation in data.
1920 Batting Averages: more variation in data.

Measuring Variation
Five-Number Summary:
Minimum Value
Maximum Value
Median Value of all data
Median of Bottom Half of Data (1st quartile)
Median of Top Half of Data (3rd quartile)

Box and Whisker Plots
Here is a plot of the exam data from before. Dots are outliers (values that lie below Q1 or above Q3 by more than 1.5 times the distance from Q1 to Q3). How’d they do?
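A sketch of the usual outlier rule for box plots: a value counts as an outlier when it lies more than 1.5 times the interquartile range (Q3 minus Q1) below Q1 or above Q3. The function name and scores are hypothetical, and the quartiles are computed with the median-of-halves convention, which is an assumption.

```python
from statistics import median

def find_outliers(data):
    """Return the values more than 1.5 * IQR below Q1 or above Q3."""
    values = sorted(data)
    half = len(values) // 2
    q1 = median(values[:half])                  # median of the bottom half
    q3 = median(values[len(values) - half:])    # median of the top half
    iqr = q3 - q1                               # distance from Q1 to Q3
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical exam scores, invented for illustration.
scores = [12, 55, 60, 62, 65, 68, 70, 72, 75, 80]
outliers = find_outliers(scores)
print(outliers)
```

Here Q1 is 60 and Q3 is 72, so the fences sit at 42 and 90 and only the score of 12 falls outside them.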

Accuracy in Measurement
100 people are given a fancy new laser that will measure a person’s height. Here are the results when the 100 people measured the same girl.

Measuring Variation
Calculate the mean.
Find out how far each value is from the mean. This distance is called the deviation from the mean.
How far, on average, is each value from the mean?

Standard Deviation  
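The transcript does not show a formula for this slide, so the sketch below uses a common definition that may differ from the one presented: square each deviation from the mean, average the squares, and take the square root. Squaring is needed because the plain deviations always sum to zero. The function name, the `sample` flag, and the data are hypothetical. Dividing by n gives the population version; dividing by n - 1 gives the sample version.

```python
from math import sqrt

def standard_deviation(data, sample=False):
    """Root of the average squared deviation from the mean.

    sample=False divides by n (population); sample=True divides by n - 1.
    """
    n = len(data)
    center = sum(data) / n
    deviations = [x - center for x in data]       # how far each value is from the mean
    squared_total = sum(d * d for d in deviations)
    divisor = n - 1 if sample else n
    return sqrt(squared_total / divisor)

# Hypothetical data, invented for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]
population_sd = standard_deviation(values)
sample_sd = standard_deviation(values, sample=True)
print(population_sd, round(sample_sd, 3))
```

The standard library offers the same two versions as `statistics.pstdev` and `statistics.stdev`.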

Look back at our data…
The standard deviation of the girl’s height measurements was 0.2 inches.
The standard deviation of 1920 batting averages is 0.050 and of 2000 batting averages is 0.038.
A smaller standard deviation implies the data are more tightly grouped.
The standard deviation of exam scores is 14.782. (Large due to outliers, which affect the mean as well.)

Shapes of Graphs
Graphs can be skewed in one direction or the other. Graphs of batting averages and height were symmetrical around the central value. Exam scores were not symmetrical, since most students scored high and a few low scores formed a tail. That graph is skewed to the left (where the tail is). A graph skewed to the right has its tail on the right side of the graph.

Salaries at Corporations
They are skewed to the right: there are fewer people at the top of the ladder, and they make the most money. Because the data are skewed to the right, the mean is HIGHER than the median. The median is best for describing the average employee salary, while the mean is best for payroll calculations and budgets.

Housing Prices
Skewed to the right. Mean pulled in direction of skew relative to median. Mean is HIGHER than median.

Exam scores
Data is skewed to the left. Mean is LOWER than median.
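A small sketch of how skew pulls the mean away from the median. Both data sets are invented for illustration (salaries in thousands of dollars, exam scores out of 100) and are not the data behind the slides.

```python
from statistics import mean, median

# Hypothetical right-skewed salaries: one very large value forms the right tail.
salaries = [40, 42, 45, 48, 50, 55, 60, 300]

# Hypothetical left-skewed exam scores: one very low value forms the left tail.
exam_scores = [20, 70, 75, 80, 85, 90, 95]

salary_mean, salary_median = mean(salaries), median(salaries)
exam_mean, exam_median = mean(exam_scores), median(exam_scores)

print(salary_mean, salary_median)          # mean is pulled up toward the tail
print(round(exam_mean, 2), exam_median)    # mean is pulled down toward the tail
```

In the salary list a single large value lifts the mean well above the median, while the median stays near what a typical employee earns.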

Example #3
The following histogram shows the exam scores for 30 students in a freshman accounting class. Estimate the mean of these scores. Is the standard deviation of these scores likely to be closer to 12 or to 25?

Answer to Example #3
The mean score is approximately 70. The standard deviation is more likely to be closer to 12: about half of the scores are within 10 of 70, and the other half are more than 10 but less than 30 away, so it seems more likely that the deviations would average out to about 12 than to 25.

SAT Scores
What do you see? A bimodal distribution, which often shows up in test scores. There are two groups: students who know what they are doing come exam time and students who do NOT.

Uniform Distributions
Each outcome comes up around 166 times. Theoretically, each should come up 166 2/3 times (1000 / 6 = 166 2/3), but a fractional count is impossible for real data.
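The transcript does not say what experiment produced these counts. The sketch below assumes 1000 rolls of a fair six-sided die, which is a guess based on 1000 / 6 = 166 2/3, and shows how the counts land near that figure without ever matching it exactly.

```python
import random
from collections import Counter

# Assumption: 1000 rolls of a fair six-sided die (not stated in the slides).
random.seed(0)  # fixed seed so the run is repeatable
rolls = [random.randint(1, 6) for _ in range(1000)]
counts = Counter(rolls)

for face in range(1, 7):
    print(face, counts[face])   # each count should be near 1000 / 6

print(1000 / 6)                 # the theoretical count per face
```

Each count is a whole number, so none can equal 166 2/3; the counts scatter around it instead.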

The Bell Curve
The most famous of the shapes is the bell-shaped curve, aka normal curve, aka normal distribution, aka Gaussian distribution. It appears often in nature and in mathematics. There are lots of formulas to describe it and analyze it. Let’s look at some examples!

Why should we expect bells?
Around the mean, there should be an expected amount of variation above and below. The larger a deviation from the mean, the less likely it is. Thus we have a cluster in the middle and approximately the same number of values at the high and low ends.

Normal Curves and Standard Deviation
About 68% of the data differ from the mean by less than one standard deviation.
About 95% of the data differ from the mean by less than two standard deviations.
About 99.7% of the data differ from the mean by less than three standard deviations.
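The three percentages can be checked against the standard normal distribution with `statistics.NormalDist` (Python 3.8 or later). The helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```

The computed shares are slightly different from the rounded 68, 95, and 99.7 figures, which is why the rule is stated as an approximation.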

Example #1
All freshmen entering NHS have their heads measured for the beanies they are required to wear. One year the head circumference data had a normal distribution with mean 55 cm and standard deviation 1.7 cm. What percentage of the students that year had a head circumference between 53.3 cm and 56.7 cm? What percentage had a circumference above 58.4 cm?

Answer to Example #1
For data with a normal distribution, about 68% of the values differ from the mean by less than one standard deviation. The normally distributed head measurements have mean 55 cm and standard deviation 1.7 cm, so heads within one standard deviation of the mean will measure between 55 - 1.7 = 53.3 cm and 55 + 1.7 = 56.7 cm. Thus approximately 68% of the freshmen have head circumferences between 53.3 and 56.7 cm.
For the second question, a head measuring more than 58.4 cm is more than 3.4 cm, or two standard deviations, above the mean. Recall that approximately 95% of the values in a normal distribution are within two standard deviations of the mean, so only 5% lie above or below those limits. Because the distribution is symmetric, that 5% splits evenly between the two tails. Thus, in this case, roughly 5%/2 = 2.5% of the freshmen will have head circumferences measuring more than 58.4 cm.
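The same two answers can be computed from the normal distribution directly rather than from the rounded 68% and 95% figures. This is a sketch using `statistics.NormalDist`; the variable names are hypothetical.

```python
from statistics import NormalDist

# Head circumferences from Example #1: mean 55 cm, standard deviation 1.7 cm.
heads = NormalDist(mu=55, sigma=1.7)

between = heads.cdf(56.7) - heads.cdf(53.3)   # share between 53.3 cm and 56.7 cm
above = 1 - heads.cdf(58.4)                   # share above 58.4 cm

print(round(between * 100, 1), round(above * 100, 1))
```

The computed values come out close to the rule-of-thumb answers of 68% and 2.5%; the small gap comes from rounding in the rule itself.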

Example #2
The average high temperature in Anchorage, Alaska, in January is 21°F with a standard deviation of 10°. The average high temperature in Honolulu in January is 80°F with a standard deviation of 8°. In which location would it be more unusual to have a day in January with a high of 57°F?

Answer to #2
A January temperature of 57° would be more unusual in Anchorage. This temperature is 23° below the mean (80°) in Honolulu, which is within three standard deviations (3 * 8° = 24°). It is 36° above the mean (21°) in Anchorage, which is outside the range of three standard deviations (3 * 10° = 30°).
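One way to make the comparison concrete is to express 57°F as a number of standard deviations from each city's mean (a z-score). The function and variable names below are hypothetical; the temperatures are the ones given in Example #2.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

anchorage = z_score(57, mean=21, sd=10)   # 36 degrees above the Anchorage mean
honolulu = z_score(57, mean=80, sd=8)     # 23 degrees below the Honolulu mean

print(anchorage, honolulu)
print("More unusual in:", "Anchorage" if abs(anchorage) > abs(honolulu) else "Honolulu")
```

The larger the absolute z-score, the more unusual the value, so the comparison uses `abs` to ignore whether the temperature is above or below the mean.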