Measures of Central Tendency


Measures of Central Tendency MARE 250 Dr. Jason Turner

Centracidal Tendencies
A measure of central tendency indicates where along the measurement scale the sample or population is located. It can be determined via various measures; the three most important are the mean, the median, and the mode.

Mean Girls
Mean: the most commonly used measure of center. It is the sum of the observations divided by the number of observations.

The Median
“As we were driving, we saw a sign that said ‘Watch for Rocks.’ Martha said it should read ‘Watch for Pretty Rocks.’ I told her she should write in her suggestion to the highway department, but she started saying it was a joke - just to get out of writing a simple letter! And I thought I was lazy!” – Jack Handey
The median is typically defined as the middle measurement in an ordered set of data. It separates the bottom 50% of the data from the top 50%.

The Mode
“Oh, no way - where? Holy crap, he's with a girl! But he's the guy from Depeche Mode! That's impossible! Come on, he's in Depeche Mode!” – The Monarch
The mode is typically defined as the most frequently occurring measurement in a set of data. The mode is useful if the distribution is skewed or bimodal (having two very pronounced values around which data are concentrated).
[Figure: frequency distribution; y-axis: Number of Individuals]
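The three measures of center above can be tried out with a minimal sketch using Python's standard `statistics` module. The sample values and the variable names are made up for illustration; they are not from the lecture.

```python
import statistics

# Hypothetical sample (illustrative values only, not course data).
lengths = [12, 15, 15, 17, 18, 20, 21, 15, 29]

mean = statistics.mean(lengths)        # sum of observations / number of observations
median = statistics.median(lengths)    # middle value of the ordered data
mode = statistics.mode(lengths)        # most frequently occurring value
modes = statistics.multimode(lengths)  # all most-frequent values (useful if bimodal)

print(mean, median, mode, modes)  # 18 17 15 [15]
```

`multimode` returns a list, so a bimodal sample would show both values rather than only the first one found.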

You are so totally skewed!
The mean is sensitive to extreme (very large or very small) observations and the median is not. You can therefore judge how skewed your data are by comparing the median and the mean:
Mean greater than the median: typically skewed right.
Mean and median equal: symmetric.
Mean less than the median: typically skewed left.
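As a rough sketch of this comparison, the function below labels a data set by the sign of (mean - median). The function name, the `tol` parameter, and the sample values are hypothetical, and the result is only a rule of thumb, not a formal test of skewness.

```python
import statistics

def skew_direction(data, tol=0.0):
    """Rule-of-thumb skew check: compare the mean with the median.

    `tol` (hypothetical) is how far apart the two may be and still
    count as equal.
    """
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean - median > tol:
        return "right"      # mean pulled up by large values
    if median - mean > tol:
        return "left"       # mean pulled down by small values
    return "symmetric"

print(skew_direction([1, 2, 3, 4, 5]))       # symmetric
print(skew_direction([1, 2, 3, 4, 50]))      # right
print(skew_direction([1, 47, 48, 49, 50]))   # left
```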

Resistant Measures
A resistant measure is not sensitive to the influence of a few extreme observations. The median is a resistant measure of center; the mean is not. The resistance of the mean can be improved by using trimmed means, in which a specified percentage of the smallest and largest observations is removed before computing the mean. We will do something like this later when exploring the data and evaluating outliers (and their effects upon the mean).
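A trimmed mean can be sketched in a few lines. The function name, the choice to trim the same proportion from each end, and the sample values are assumptions made for illustration.

```python
import statistics

def trimmed_mean(data, proportion):
    """Mean after dropping `proportion` of the observations from EACH end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(data)
    k = int(len(ordered) * proportion)   # observations to drop at each end
    return statistics.mean(ordered[k:len(ordered) - k])

# Hypothetical data with one extreme value.
values = [2, 4, 5, 5, 6, 6, 7, 7, 8, 90]

print(statistics.mean(values))     # 14  (pulled up by the 90)
print(trimmed_mean(values, 0.10))  # 6   (2 and 90 removed)
print(statistics.median(values))   # 6.0 (resistant without trimming)
```

With the extreme value trimmed away, the mean lands on the median, which is the sense in which trimming improves resistance.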

How To on Computer
On Minitab: Your data must be in a single column. Go to the 'Stat' menu and select 'Basic stats', then 'Display descriptive stats'. Select your data column in the 'variables' box. The output will generally go to the session window or, if you select 'graphical summary' in the 'graphs' options, it will be given in a separate window. This will give you a number of basic descriptive stats, though not the mode.

Measures of Dispersion and Variability MARE 250 Dr. Jason Turner

Please Disperse!
“Alright everyone, disperse immediately. We are prepared to use force a-- what, what? We're not prepared, Eddie? Someone call 911!” – Chief Wiggum
Measure of dispersion of the data: an indication of the spread of measurements around the center of the distribution. Two of the most frequently used are the range and the standard deviation.

The Range
Range: the difference between the highest and lowest values in the observations.
Range = Max - Min
This is useful, but may be misleading when the data have one or more outliers (single measurements that are exceptionally large or small relative to the other data). The range is not measured relative to the central location.
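A small sketch shows how a single outlier can dominate the range. The helper name `data_range` and both data sets are hypothetical.

```python
def data_range(data):
    """Range = Max - Min."""
    return max(data) - min(data)

# Hypothetical measurements, then the same data with one outlier added.
typical = [4, 5, 5, 6, 6, 7, 7, 8]
with_outlier = typical + [90]

print(data_range(typical))       # 4
print(data_range(with_outlier))  # 86
```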

The Variance
Variance: the average of the squared deviations from the mean. It is the most widely used measure of spread, and one that will be used often in various statistical applications.

The Variance
Degrees of freedom: the quantity (n - 1). It is used instead of n as the divisor of the sample variance to provide an unbiased estimate of the population variance. As the sample size (n) increases (and n approaches the population size N), the values of the population and sample variance become more similar.
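The transcript does not carry the formulas themselves, so the block below writes out the standard definitions as a reference. The symbols \mu, \bar{x}, \sigma^2 and s^2 are conventional notation assumed here; n and N are the sample and population sizes named on the slide.

```latex
% Population variance: divide by N
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2

% Sample variance: divide by the degrees of freedom, n - 1
s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```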

Standard Deviation
Standard deviation: the positive square root of the variance. It indicates how far (on average) the observations in the sample are from the mean of the sample. The more variation in a data set, the larger its standard deviation.
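The sketch below computes the sample variance and standard deviation step by step and compares them with Python's `statistics` module. The data values are hypothetical.

```python
import math
import statistics

# Hypothetical sample (illustrative values only).
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
mean = sum(data) / n
squared_deviations = [(x - mean) ** 2 for x in data]

# Sample variance: divide by the degrees of freedom (n - 1), not n.
sample_variance = sum(squared_deviations) / (n - 1)
# Standard deviation: positive square root of the variance.
sample_sd = math.sqrt(sample_variance)

print(round(sample_variance, 3))   # 4.571
print(round(sample_sd, 3))         # 2.138
print(statistics.pvariance(data))  # 4  (dividing by n instead)
```

`statistics.variance` and `statistics.stdev` use the same (n - 1) divisor, so they should agree with the hand computation.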

Quartiles
The median divides the data into 2 equal parts: bottom 50%, top 50%. Quartiles divide the data into quarters (4 equal parts). A dataset has 3 quartiles:
Q1 is the number that divides the bottom 25% from the top 75%.
Q2 is the median; it divides the bottom 50% from the top 50%.
Q3 is the number that divides the bottom 75% from the top 25%.

Quartiles

Interquartile Range
Interquartile range (IQR): the difference between the first and third quartiles.
IQR = Q3 - Q1
The IQR gives you the range of the middle 50% of the data.
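Quartiles and the IQR can be sketched with `statistics.quantiles`. The data are hypothetical. Note that quartile conventions differ between textbooks and software packages, so other tools may report slightly different Q1 and Q3 for the same data; this example uses the function's default ("exclusive") method.

```python
import statistics

# Hypothetical sample of 11 observations (illustrative values only).
data = [3, 5, 7, 8, 9, 11, 13, 15, 16, 18, 40]

# With n=4, quantiles() returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1   # range of the middle 50% of the data

print(q1, q2, q3)  # 7.0 11.0 16.0
print(iqr)         # 9.0
```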

Outlier, Outlier
Outliers: observations that fall well outside the overall pattern of the data. They require special attention and may be the result of:
A measurement or recording error
An observation from a different population
An unusual extreme observation

Pants on Fire
You must deal with outliers (yes, really!). If an outlier is an error, it can be deleted; otherwise it is a judgment call. Quartiles and the IQR can be used to identify potential outliers.

The Outer Limits
Lower and upper limits:
Lower limit: the number that lies 1.5 IQRs below the first quartile.
Lower Limit = Q1 - 1.5 * IQR
Upper limit: the number that lies 1.5 IQRs above the third quartile.
Upper Limit = Q3 + 1.5 * IQR

The Outer Limits
If a value is outside the “Outer Limits” of a dataset, it is an outlier.
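The lower and upper limits translate directly into code. This is a minimal sketch: the function names and data are hypothetical, and the quartiles come from the default method of `statistics.quantiles`, which is only one of several conventions.

```python
import statistics

def outer_limits(data):
    """Return (lower limit, upper limit) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data):
    """Values that fall outside the outer limits."""
    lower, upper = outer_limits(data)
    return [x for x in data if x < lower or x > upper]

# Hypothetical sample: Q1 = 7, Q3 = 16, so IQR = 9.
data = [3, 5, 7, 8, 9, 11, 13, 15, 16, 18, 40]

print(outer_limits(data))   # (-6.5, 29.5)
print(find_outliers(data))  # [40]
```

Whether a flagged value is deleted is still a judgment call; the limits only identify candidates.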

Five-Number Summary
5-number summary: Min, Q1, Q2, Q3, Max, written in increasing order. It provides information on center and variation and is used to construct boxplots.

Boxplots
Boxplot (Box-and-Whisker Diagram): based on the 5-number summary; provides a graphic display of the center and variation.
[Figure: boxplot labeled Min, Q1, Q2, Q3, Max]

Boxplots
Modified boxplot: includes outliers. Note that Min and Max are determined after outliers are removed!
[Figure: modified boxplot with a potential outlier marked *]

Boxplots

Boxplots
Boxplots summarize information about the shape, dispersion, and center of your data. They can also help you spot outliers. The left edge of the box represents the first quartile (Q1), while the right edge represents the third quartile (Q3). Thus the box portion of the plot represents the interquartile range (IQR), or the middle 50% of the observations.
[Figure: boxplot labeled Min, Q1, Q2, Q3, Max]

Boxplots
The line drawn through the box represents the median of the data. The lines extending from the box are called whiskers. The whiskers extend outward to indicate the lowest and highest values in the data set (excluding outliers). Extreme values, or outliers, are represented by dots. A value is considered an outlier if it is outside of the box (greater than Q3 or less than Q1) by more than 1.5 times the IQR.
[Figure: modified boxplot with a potential outlier marked *]
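The numbers behind a modified boxplot can be sketched without any plotting library. The function name, the dictionary keys, and the data are hypothetical, and the quartile method is the default of `statistics.quantiles`. The whisker ends are the Min and Max of the data after outliers are set aside.

```python
import statistics

def modified_boxplot_stats(data):
    """Values needed to draw a modified boxplot (outliers shown separately)."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if lower <= x <= upper]
    return {
        "whisker_low": min(inside),    # Min after outliers are removed
        "q1": q1,                      # left edge of the box
        "median": median,              # line drawn through the box
        "q3": q3,                      # right edge of the box
        "whisker_high": max(inside),   # Max after outliers are removed
        "outliers": sorted(x for x in data if x < lower or x > upper),
    }

# Hypothetical sample with one extreme value.
data = [3, 5, 7, 8, 9, 11, 13, 15, 16, 18, 40]
stats = modified_boxplot_stats(data)

print(stats["whisker_low"], stats["whisker_high"])  # 3 18
print(stats["outliers"])                            # [40]
```

Here the upper whisker stops at 18 rather than 40, and the 40 would be drawn as a separate point.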

Boxplots
Use the boxplot to assess the symmetry of the data. If the data are fairly symmetric, the median line will be roughly in the middle of the IQR box and the whiskers will be similar in length. If the data are skewed, the median may not fall in the middle of the IQR box, and one whisker will likely be noticeably longer than the other.