Presentation transcript:

N.D. Analysis
We are NOT going to analyze Notre Dame, just Numerical (Quantitative) Data. There are essentially two ways to represent and analyze quantitative data:
1. visually
2. computationally
We have already seen the four main ways to represent numerical data visually:
° dot plots
° pie charts
° histograms
° stem and leaf

For the sake of completeness we look at an example of each: Dot Plot:

Histogram

Pie Chart

Stem and Leaf

There is a fifth way, very inventive, called a box plot; we will learn it a little later. First we have to learn how to work computationally with a collection of numerical data. To begin, we want to distinguish results computed from a whole population from results computed from a sample.

1. Results from populations are called parameters.
2. Results from samples are called statistics.
The rule is easy to remember: Population → Parameter, Sample → Statistic (the PP-SS rule). With one exception, the calculations one does are the same, whether one is computing a parameter or a statistic. The exception will come later.

What kind of results will we be looking for? In other words, what calculations can you do on a given set of numbers? The list is endless; we have seen the following mentioned in class: range, maximum, minimum, average, bidding method, variance, standard deviation, … A long-standing convention stipulates that symbols used for parameters are Greek letters, while symbols used for statistics are Latin letters. A wonderful opportunity to learn some Greek letters! We will mostly use two: $\mu$ and $\sigma$.
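Several of the calculations listed above can be sketched in Python with the standard library's `statistics` module. This is only an illustration: the data values are made up, and `pvariance` and `pstdev` are used on the assumption that the numbers form a whole population rather than a sample.

```python
import statistics

# Hypothetical data set, made up for illustration only.
data = [3, 5, 7, 7, 8, 10, 16]

minimum = min(data)
maximum = max(data)
data_range = maximum - minimum          # range = maximum - minimum
average = statistics.mean(data)         # the average (mean)

# Population versions: the data are assumed to be the whole population.
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

print(minimum, maximum, data_range, average, variance, std_dev)
```

The variable names are arbitrary; any ordered or unordered list of numbers can be substituted for `data`.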

Important Calculations
Time to start computing things. From now on, until stated otherwise, we assume we have a given set of numerical data $x_1, x_2, x_3, \dots, x_n$. We also assume that we have already taken the trouble to order the data, that is, $x_1 \le x_2 \le x_3 \le \dots \le x_n$. We will be interested in “measuring” two concepts:

“Centrality” and “Spread”
We measure “centrality” by trying to compute the “center” of the data. The trouble is that we don’t know what we mean by … that’s right, the center! Intuitively the center of two data values, say 7 and 21, should be the midpoint, 14 (you don’t need your TI84 plus for this, I hope!). But what if the data are 7, 7 and 21?

Most of you already know the answer … that’s right, the average! By the way, another word for average (more impressive!) is “mean”. Notation: the symbol for the sample mean is $\bar{x}$; the symbol for the population mean is $\mu$. Using the notation we learned in the last class we can write: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$. The mean has one serious defect:
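As a small sketch of the averaging rule, the two examples from the previous slide (7 and 21, then 7, 7 and 21) can be computed as follows. The function name `mean` is just an illustrative choice.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

two_points = mean([7, 21])        # the midpoint of 7 and 21
three_points = mean([7, 7, 21])   # pulled toward the repeated 7

print(two_points, three_points)
```

With two values the mean is the midpoint, 14; with 7, 7 and 21 it is 35/3, a bit under 12, because the repeated 7 pulls the center down.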

It is extremely sensitive to extreme values. The following set of data: 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 1,000,000 ought to have a center kind of close to 8 (there are 15 of them!). Instead your TI84 plus will tell you that the mean is 62,507.5 (wow!). A more faithful (but harder to compute in general, and computationally less useful) measure of centrality is the median.
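The sensitivity of the mean can be checked with a short sketch. It assumes the data set on this slide is fifteen 8s plus a single 1,000,000, which is consistent with "there are 15 of them" and a mean beginning with 62; the exact values are a reconstruction, not taken verbatim from the slide.

```python
import statistics

# Assumed data set: fifteen 8s and one extreme value.
data = [8] * 15 + [1_000_000]

mean_value = statistics.mean(data)      # dragged far away from 8 by one value
median_value = statistics.median(data)  # stays at 8

print(mean_value, median_value)
```

One extreme value out of sixteen is enough to move the mean into the tens of thousands, while the median does not move at all.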

You know the definition: essentially it is a number that cuts the ordered data into two equal halves. More precisely, if you have an odd number of data (think of 37), you pick the one smack in the middle (the 19th, think about it!). If you have an even number of data (think of 46), you pick any number you please … between the 23rd and 24th entries (if they are equal you have no choice!). We do a couple of examples on the board. Then one on the computer.
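The odd and even cases can be sketched in Python. For the even case this sketch makes one particular choice, the midpoint of the two middle entries, which is a common convention but only one of the allowed answers under the definition above. The function name and the sample data (the integers 1 to 37 and 1 to 46) are illustrative assumptions.

```python
def median(values):
    """Return a median of the data, using the midpoint rule for an even count."""
    ordered = sorted(values)          # the definition assumes ordered data
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle entry (the 19th of 37, index 18).
        return ordered[mid]
    # Even count: midpoint of the two middle entries (the 23rd and 24th of 46).
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_case = median(range(1, 38))    # 37 values: 1, 2, ..., 37
even_case = median(range(1, 47))   # 46 values: 1, 2, ..., 46

print(odd_case, even_case)
```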

Distributional Symmetry (or lack thereof!)
What’s the relationship, if any, between the median and the mean? Just for fun let’s call the median $M_d$ and the mean $M_n$ (clever, huh?). Then either $M_n < M_d$, $M_n = M_d$, or $M_n > M_d$. Now, by its very definition $M_d$ is in the middle! On the other hand, $M_n$ is the average, which is also a kind of middle value. When the data are distributed in perfect symmetry we must therefore have $M_n = M_d$. What if $M_n < M_d$? It means that somehow the average is pulled down to the left of the middle. There are more extreme values to the left. We say that the data are skewed to the left.

In the same way, if $M_n > M_d$ it means that somehow the average is pulled up to the right of the middle. There are more extreme values to the right. We say that the data are skewed to the right. Of course, there can’t be too many of these extreme values, on either side, because in that case the median is shifted. In terms of histograms, we get these four cases, shown on the next slides.
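The comparison of mean and median described above can be sketched as a rule of thumb. The function name, the returned labels, and the three small data sets are illustrative assumptions. Note that equality of mean and median does not by itself prove symmetry, so the third label only reports that the two agree.

```python
import statistics

def skew_direction(values):
    """Compare the mean with the median, following the rule described above."""
    mean_value = statistics.mean(values)
    median_value = statistics.median(values)
    if mean_value < median_value:
        return "skewed to the left"
    if mean_value > median_value:
        return "skewed to the right"
    return "mean equals median"

left = skew_direction([1, 8, 9, 10])      # mean 7, median 8.5
right = skew_direction([1, 2, 3, 10])     # mean 4, median 2.5
balanced = skew_direction([1, 2, 3, 4, 5])  # mean 3, median 3

print(left, right, balanced, sep="\n")
```

With real measurements the mean and median are rarely exactly equal, so in practice one would compare them with some tolerance rather than with strict equality.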

Symmetric distributions

Distributions like this one (symmetric and bunched around the middle) are said to be “nicely mound-shaped.” We’ll say more about them later.