MM150 ~ Unit 9 Statistics ~ Part II

WHAT YOU WILL LEARN
Mode, median, mean, and midrange
Percentiles and quartiles
Range and standard deviation
z-scores and the normal distribution
Correlation and regression

9.1 Measures of Central Tendency

Definitions An average is a number that is representative of a group of data. The arithmetic mean, or simply the mean, is symbolized by \(\bar{x}\) when it is computed for a sample of a population, or by the Greek letter mu, \(\mu\), when it is computed for the entire population.

Mean The mean, \(\bar{x}\), is the sum of the data divided by the number of pieces of data. The formula for calculating the mean is \(\bar{x} = \frac{\Sigma x}{n}\), where \(\Sigma x\) represents the sum of all the data and \(n\) represents the number of pieces of data.

Example: find the mean Find the mean amount of money parents spent on new school supplies and clothes if 5 parents randomly surveyed replied as follows: $327, $465, $672, $150, $230. Mean = ($327 + $465 + $672 + $150 + $230)/5 = $1,844/5 = $368.80

Median The median is the value in the middle of a set of ranked data. Example: Determine the median of $327, $465, $672, $150, $230. Rank the data from smallest to largest: $150, $230, $327, $465, $672. The middle value, $327, is the median.

Example: Median (even data) Determine the median of the following set of data: 8, 15, 9, 3, 4, 7, 11, 12, 6, 4. Rank the data: 3, 4, 4, 6, 7, 8, 9, 11, 12, 15. There are 10 pieces of data, so the median lies halfway between the two middle pieces, the 7 and the 8. The median is (7 + 8)/2 = 7.5.

Mode The mode is the piece of data that occurs most frequently. Example: Determine the mode of the data set: 3, 4, 4, 6, 7, 8, 9, 11, 12, 15. The mode is 4 since it occurs twice and the other values only occur once.

Midrange The midrange is the value halfway between the lowest (L) and highest (H) values in a set of data: Midrange = (L + H)/2. Example: Find the midrange of the data set $327, $465, $672, $150, $230. Midrange = ($150 + $672)/2 = $822/2 = $411

Example The weights of eight Labrador retrievers rounded to the nearest pound are 85, 92, 88, 75, 94, 88, 84, and 101. Determine the a) mean b) median c) mode d) midrange e) rank the measures of central tendency from lowest to highest.

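Example: dog weights 85, 92, 88, 75, 94, 88, 84, 101
a. Mean = (85 + 92 + 88 + 75 + 94 + 88 + 84 + 101)/8 = 707/8 = 88.375
b. Median: rank the data: 75, 84, 85, 88, 88, 92, 94, 101. The two middle values are both 88, so the median is 88.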

Example: dog weights 85, 92, 88, 75, 94, 88, 84, 101
c. Mode: the number that occurs most frequently. The mode is 88.
d. Midrange = (L + H)/2 = (75 + 101)/2 = 88
e. Rank the measures, lowest to highest: 88, 88, 88, 88.375 (the median, mode, and midrange are all 88; the mean is 88.375).
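The four measures of central tendency can be checked with a short Python sketch. This is an illustration added alongside the slides, not part of the original deck: the function name `central_tendency` is an assumption, and the code relies only on Python's standard `statistics` module.

```python
from statistics import mean, median, multimode


def central_tendency(data):
    """Return the four measures of central tendency for a list of numbers."""
    return {
        "mean": mean(data),
        "median": median(data),
        # multimode returns every value tied for most frequent, as a list
        "modes": multimode(data),
        # midrange = (lowest + highest) / 2
        "midrange": (min(data) + max(data)) / 2,
    }


# Labrador retriever weights from the example
dog_weights = [85, 92, 88, 75, 94, 88, 84, 101]
dog_summary = central_tendency(dog_weights)
print(dog_summary)
```

For the dog weights this reports a mean of 88.375 and a median, mode, and midrange of 88. Note that when no value repeats, as in the school-supplies data, `multimode` returns every value, so the mode is not informative there.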

Measures of Position Measures of position are often used to make comparisons. Two measures of position are percentiles and quartiles.

To Find the Quartiles of a Set of Data
1. Order the data from smallest to largest.
2. Find the median, or 2nd quartile, of the set of data. If there is an odd number of pieces of data, the median is the middle value. If there is an even number of pieces of data, the median is halfway between the two middle pieces of data.

To Find the Quartiles of a Set of Data continued
3. The first quartile, Q1, is the median of the lower half of the data; that is, Q1 is the median of the data less than Q2.
4. The third quartile, Q3, is the median of the upper half of the data; that is, Q3 is the median of the data greater than Q2.

Example: Quartiles The weekly grocery bills for 23 families are as follows. Determine Q1, Q2, and Q3.

Example: Quartiles continued Order the data. Q2 is the median of the entire data set, which is 190. Q1 is the median of the numbers from 50 to 172, which is 95. Q3 is the median of the numbers from 210 to 330, which is 270.
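The quartile steps can be sketched in Python as follows. This is a minimal illustration, not the deck's own material: the function name `quartiles` is an assumption, and because the 23 grocery bills are not reproduced in this transcript, the sample list is a hypothetical 11-value data set chosen only to exercise the method. The function needs at least two data values.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves steps."""
    ranked = sorted(data)            # step 1: order the data
    n = len(ranked)
    q2 = median(ranked)              # step 2: median of the whole set
    lower = ranked[: n // 2]         # values in positions below the median
    upper = ranked[(n + 1) // 2:]    # values in positions above the median
    return median(lower), q2, median(upper)


# Hypothetical bills (NOT the data from the slide)
sample_bills = [210, 50, 330, 95, 190, 172, 75, 270, 100, 250, 300]
q1, q2, q3 = quartiles(sample_bills)
print(q1, q2, q3)
```

With an odd number of values the middle value itself is left out of both halves, which matches the wording "the data less than Q2" and "the data greater than Q2". Other textbooks and software use slightly different quartile conventions, so results can differ for small data sets.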

9.2 Measures of Dispersion

Measures of dispersion are used to indicate the spread of the data. The range is the difference between the highest and lowest values; it indicates the total spread of the data. Range = highest value – lowest value

Example: Range Nine different employees were selected and the amount of their salary was recorded. Find the range of the salaries. $24,000, $32,000, $26,500, $56,000, $48,000, $27,000, $28,500, $34,500, $56,750 Range = $56,750 – $24,000 = $32,750

Standard Deviation The standard deviation measures how much the data differ from the mean. It is symbolized with s when it is calculated for a sample, and with σ (Greek letter sigma) when it is calculated for a population.

To Find the Standard Deviation of a Set of Data
1. Find the mean of the set of data.
2. Make a chart having three columns: Data, Data – Mean, and (Data – Mean)².
3. List the data vertically under the column marked Data.
4. Subtract the mean from each piece of data and place the difference in the Data – Mean column.

To Find the Standard Deviation of a Set of Data continued
5. Square the values obtained in the Data – Mean column and record these values in the (Data – Mean)² column.
6. Determine the sum of the values in the (Data – Mean)² column.
7. Divide the sum obtained in step 6 by n – 1, where n is the number of pieces of data.
8. Determine the square root of the number obtained in step 7. This number is the standard deviation of the set of data.

Example Find the standard deviation of the following prices of selected washing machines: $280, $217, $665, $684, $939, $299. Find the mean: Mean = (280 + 217 + 665 + 684 + 939 + 299)/6 = 3,084/6 = 514

Example continued, mean = 514

Data    Data – Mean    (Data – Mean)²
217     –297           (–297)² = 88,209
280     –234           54,756
299     –215           46,225
665     151            22,801
684     170            28,900
939     425            180,625
Sum     0              421,516

Example continued, mean = 514 \[ s = \sqrt{\frac{421{,}516}{6 - 1}} = \sqrt{84{,}303.2} \approx 290.35 \] The standard deviation is $290.35.
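The eight-step procedure for the sample standard deviation can be sketched in Python. This is an added illustration under the assumption that the data are a sample (so the divisor is n – 1); the function name `sample_std_dev` is hypothetical.

```python
from math import sqrt


def sample_std_dev(data):
    """Sample standard deviation, following the chart method step by step."""
    n = len(data)
    mean = sum(data) / n                              # step 1
    differences = [x - mean for x in data]            # steps 2-4: Data - Mean
    squares = [d ** 2 for d in differences]           # step 5: (Data - Mean)^2
    total = sum(squares)                              # step 6
    return sqrt(total / (n - 1))                      # steps 7-8


# Washing machine prices from the example
prices = [280, 217, 665, 684, 939, 299]
print(round(sample_std_dev(prices), 2))
```

The standard library function `statistics.stdev` computes the same quantity and can serve as a cross-check.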

9.3 The Normal Curve

Types of Distributions Rectangular Distribution J-shaped distribution

Types of Distributions continued Bimodal Skewed to right

Types of Distributions continued Skewed to left Normal

Properties of a Normal Distribution The graph of a normal distribution is called the normal curve. The normal curve is bell shaped and symmetric about the mean. In a normal distribution, the mean, median, and mode all have the same value and all occur at the center of the distribution.

Empirical Rule Approximately 68% of all the data lie within one standard deviation of the mean (in both directions). Approximately 95% of all the data lie within two standard deviations of the mean (in both directions). Approximately 99.7% of all the data lie within three standard deviations of the mean (in both directions).
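The percentages in the empirical rule can be reproduced numerically. The sketch below is an added illustration, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```

The computed shares are close to the 68%, 95%, and 99.7% quoted in the rule, which are rounded values.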

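z-Scores z-scores determine how far, in terms of standard deviations, a given score is from the mean of the distribution. \[ z = \frac{\text{value of the piece of data} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma} \]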

Example: z-scores A normal distribution has a mean of 50 and a standard deviation of 5. Find z-scores for the following values: a) 55 b) 60 c) 43
a) z = (55 – 50)/5 = 5/5 = 1. A score of 55 is one standard deviation above the mean.

Example: z-scores continued
b) z = (60 – 50)/5 = 10/5 = 2. A score of 60 is 2 standard deviations above the mean.
c) z = (43 – 50)/5 = –7/5 = –1.4. A score of 43 is 1.4 standard deviations below the mean.
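The z-score calculation is a one-line function. The sketch below is an added illustration; the name `z_score` and its parameter names are assumptions.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev


# Values from the example: mean 50, standard deviation 5
for value in (55, 60, 43):
    print(value, z_score(value, mean=50, std_dev=5))
```

A positive result means the score is above the mean and a negative result means it is below, matching the three answers in the example (1, 2, and –1.4).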

To Find the Percent of Data Between any Two Values
1. Draw a diagram of the normal curve, indicating the area or percent to be determined.
2. Use the z-score formula to convert the given values to z-scores. Indicate these z-scores on the diagram.
3. Look up the percent that corresponds to each z-score in Table 13.7.

To Find the Percent of Data Between any Two Values continued
4. a) When finding the percent of data between two z-scores on opposite sides of the mean (when one z-score is positive and the other is negative), find the sum of the individual percents.
b) When finding the percent of data between two z-scores on the same side of the mean (when both z-scores are positive or both are negative), subtract the smaller percent from the larger percent.

To Find the Percent of Data Between any Two Values continued
c) When finding the percent of data to the right of a positive z-score or to the left of a negative z-score, subtract the percent of data between 0 and z from 50%.
d) When finding the percent of data to the left of a positive z-score or to the right of a negative z-score, add the percent of data between 0 and z to 50%.

Example Assume that the waiting times for customers at a popular restaurant before being seated for lunch are normally distributed with a mean of 12 minutes and a standard deviation of 3 minutes.
a) Find the percent of customers who wait for at least 12 minutes before being seated.
b) Find the percent of customers who wait between 9 and 18 minutes before being seated.
c) Find the percent of customers who wait at least 17 minutes before being seated.
d) Find the percent of customers who wait less than 8 minutes before being seated.

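Solution
a. At least 12 minutes: Since 12 minutes is the mean, half, or 50%, of customers wait at least 12 minutes before being seated.
b. Between 9 and 18 minutes: Use Table 9.4. z = (9 – 12)/3 = –1 and z = (18 – 12)/3 = 2. 34.1% of the data is between the mean and z = –1, and 47.7% is between the mean and z = 2. 34.1% + 47.7% = 81.8%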

Solution continued
c. At least 17 minutes: Use Table 9.4. z = (17 – 12)/3 ≈ 1.67. 45.3% is between the mean and z = 1.67. 50% – 45.3% = 4.7%. Thus, 4.7% of customers wait at least 17 minutes.
d. Less than 8 minutes: Use Table 9.4. z = (8 – 12)/3 ≈ –1.33. 40.8% is between the mean and z = –1.33. 50% – 40.8% = 9.2%. Thus, 9.2% of customers wait less than 8 minutes.
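The table lookups in this example can be cross-checked with Python's `statistics.NormalDist` (Python 3.8 or later), which evaluates the normal curve directly instead of using a printed table. This is an added sketch; the helper names are hypothetical. Because the table method rounds z to two decimal places and percents to one, the computed values differ from the table answers by about a tenth of a percent.

```python
from statistics import NormalDist

# Waiting times: mean 12 minutes, standard deviation 3 minutes
waiting = NormalDist(mu=12, sigma=3)


def percent_between(dist, low, high):
    """Percent of the distribution between low and high."""
    return 100 * (dist.cdf(high) - dist.cdf(low))


def percent_at_least(dist, value):
    """Percent of the distribution at or above value."""
    return 100 * (1 - dist.cdf(value))


def percent_less_than(dist, value):
    """Percent of the distribution below value."""
    return 100 * dist.cdf(value)


print(round(percent_at_least(waiting, 12), 1))     # part a
print(round(percent_between(waiting, 9, 18), 1))   # part b
print(round(percent_at_least(waiting, 17), 1))     # part c
print(round(percent_less_than(waiting, 8), 1))     # part d
```

The cumulative function `cdf` gives the area to the left of a value, so the add-or-subtract-from-50% rules in step 4 are handled implicitly.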

9.4 Linear Correlation and Regression

Linear Correlation Linear correlation is used to determine whether there is a relationship between two quantities and, if so, how strong the relationship is.

Linear Correlation
– The linear correlation coefficient, r, is a unitless measure that describes the strength of the linear relationship between two variables.
– If the value is positive, as one variable increases, the other increases.
– If the value is negative, as one variable increases, the other decreases.
– The value of r will always be between –1 and 1 inclusive.

Scatter Diagrams
A visual aid used with correlation is the scatter diagram, a plot of points (bivariate data).
– The independent variable, x, generally is a quantity that can be controlled.
– The dependent variable, y, is the other variable.
The value of r is a measure of how far a set of points varies from a straight line.
– The greater the spread, the weaker the correlation and the closer the r value is to 0.
– The smaller the spread, the stronger the correlation and the closer the r value is to 1 or –1.

Correlation

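Linear Correlation Coefficient The formula to calculate the correlation coefficient (r) is as follows: \[ r = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{\sqrt{n(\Sigma x^2) - (\Sigma x)^2}\,\sqrt{n(\Sigma y^2) - (\Sigma y)^2}} \]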

Example: Words Per Minute versus Mistakes
There are five applicants applying for a job as a medical transcriptionist. The following shows the results of the applicants when asked to type a chart. Determine the correlation coefficient between the words per minute typed and the number of mistakes.

Applicant   Words per Minute   Mistakes
Ellen       24                 8
George      67                 11
Phillip     53                 12
Kendra      41                 10
Nancy       34                 9

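Solution
We will call the words typed per minute, x, and the mistakes, y. List the values of x and y and calculate the necessary sums.

x (WPM)   y (Mistakes)   x²       y²     xy
24        8              576      64     192
67        11             4,489    121    737
53        12             2,809    144    636
41        10             1,681    100    410
34        9              1,156    81     306

Σx = 219, Σy = 50, Σx² = 10,711, Σy² = 510, Σxy = 2,281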

Solution continued The n in the formula represents the number of pieces of data. Here n = 5.

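Solution continued \[ r = \frac{5(2{,}281) - (219)(50)}{\sqrt{5(10{,}711) - (219)^2}\,\sqrt{5(510) - (50)^2}} = \frac{455}{\sqrt{5{,}594}\,\sqrt{50}} \approx 0.86 \]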

Since 0.86 is fairly close to 1, there is a fairly strong positive correlation. This result implies that the more words typed per minute, the more mistakes made.
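The correlation coefficient for this example can be recomputed with a short Python sketch. It is an added illustration that applies the usual sum-based formula for r, with the sums named in the comments; the function name `correlation` is an assumption.

```python
from math import sqrt


def correlation(xs, ys):
    """Linear correlation coefficient r from the sums of x, y, xy, x^2, and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator


# Words per minute (x) and mistakes (y) for the five applicants
wpm = [24, 67, 53, 41, 34]
mistakes = [8, 11, 12, 10, 9]
print(round(correlation(wpm, mistakes), 2))
```

With only five data points the coefficient is sensitive to any single applicant, so the result describes this sample rather than typists in general.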

Linear Regression Linear regression is the process of determining the linear relationship between two variables. The line of best fit (regression line or the least squares line) is the line such that the sum of the squares of the vertical distances from the line to the data points (on a scatter diagram) is a minimum.

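The Line of Best Fit Equation: \(y = mx + b\), where \[ m = \frac{n(\Sigma xy) - (\Sigma x)(\Sigma y)}{n(\Sigma x^2) - (\Sigma x)^2} \quad\text{and}\quad b = \frac{\Sigma y - m(\Sigma x)}{n} \]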

Example Use the data in the previous example to find the equation of the line that relates the number of words per minute and the number of mistakes made while typing a chart. Graph the equation of the line of best fit on a scatter diagram that illustrates the set of bivariate points.

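Solution From the previous results, we know that \(n = 5\), \(\Sigma x = 219\), \(\Sigma y = 50\), \(\Sigma x^2 = 10{,}711\), and \(\Sigma xy = 2{,}281\). The slope is \[ m = \frac{5(2{,}281) - (219)(50)}{5(10{,}711) - (219)^2} = \frac{455}{5{,}594} \approx 0.081 \]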

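Solution Now we find the y-intercept, b. \[ b = \frac{\Sigma y - m(\Sigma x)}{n} = \frac{50 - 0.081(219)}{5} = \frac{32.261}{5} \approx 6.452 \] Therefore the line of best fit is y = 0.081x + 6.452.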
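The slope and y-intercept of the line of best fit can be recomputed in Python. This is an added sketch using the standard least-squares formulas for m and b; the function name `line_of_best_fit` is an assumption. The code keeps full precision for m, so its intercept (about 6.437) differs slightly from a hand calculation that first rounds m to 0.081 (about 6.452).

```python
def line_of_best_fit(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


# Words per minute (x) and mistakes (y) for the five applicants
wpm = [24, 67, 53, 41, 34]
mistakes = [8, 11, 12, 10, 9]
m, b = line_of_best_fit(wpm, mistakes)
print(round(m, 3), round(b, 3))

# Predicted mistakes at two x values, enough to draw the line
for x in (20, 70):
    print(x, round(m * x + b, 2))
```

Any two x values can be used to plot the line; 20 and 70 are chosen here only because they bracket the observed typing speeds.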

Solution continued To graph y = 0.081x + 6.452, plot at least two points and draw the graph.

Solution continued [Figure: scatter diagram of the five data points with the line of best fit]