
Measures of Central Tendency and Dispersion

Preferred measures of central location & dispersion

Type of Distribution         Central location   Dispersion
Normal                       Mean               SD
Skewed                       Median             Inter-quartile range?
Exponential or logarithmic   Mean, Median       ?

NORMAL DISTRIBUTION

Frequency Distribution: the Normal Distribution
- Bell-shaped: a specific shape that can be defined by an equation
- Symmetrical around the midpoint, where the greatest frequency of scores occurs
- The tails of the perfect curve are asymptotic: they never quite meet the horizontal axis
- A normal distribution is an assumption of parametric testing

Frequency Distribution: Different Distribution shapes

Measures of Central Tendency
- Mean
- Median
- Mode

Mean
- It is computed by summing all the observations of the variable and dividing the sum by the number of observations.
- $\text{Mean (Average)} = \frac{\text{Sum of the observation values}}{\text{Number of observations}}$
- The mean is the most commonly used measure since it takes into account each observation.
- Its drawback: because it is affected by every observation, it is not preferred in the presence of widely dispersed or extreme values, such as salaries.

Mean (Average)
- $\text{Mean (Average)} = \frac{\text{Sum of the observation values}}{\text{Number of observations}}$
- In this observation set: (5, 3, 9, 7, 1, 3, 6, 8, 2, 6, 6)
- Sum = 56, Number of observations = 11
- Mean = 56 / 11 = 5.1
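The arithmetic on this slide can be checked with a few lines of Python. This is only a minimal sketch; the variable names are chosen here for illustration and are not part of the slides.

```python
# Observation set from the slide
observations = [5, 3, 9, 7, 1, 3, 6, 8, 2, 6, 6]

total = sum(observations)   # sum of the observation values
n = len(observations)       # number of observations
mean = total / n            # arithmetic mean

print(f"Sum = {total}, n = {n}, mean = {mean:.1f}")
```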

Weighted Mean
Village | No. of Children | Mean age (month)
$\text{Weighted Mean} = \frac{(n_1 \times x_1) + (n_2 \times x_2)}{N}$, where $N = n_1 + n_2$
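A possible sketch of the weighted mean formula follows. The table values from the slide are not available in this transcript, so the two villages below are hypothetical numbers picked only to illustrate the calculation, and `weighted_mean` is a helper name introduced here.

```python
def weighted_mean(counts, means):
    """Combine group means, weighting each mean by its group size."""
    total_n = sum(counts)  # N: total number of children across villages
    return sum(n * x for n, x in zip(counts, means)) / total_n

# Hypothetical villages (NOT the slide's data)
children = [50, 150]             # n1, n2
mean_age_months = [30.0, 20.0]   # x1, x2

result = weighted_mean(children, mean_age_months)
print(result)
```

With these assumed values the larger village pulls the result toward its own mean age, which is the point of weighting by group size.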

Geometric Mean
- The mean of a set of data measured on a logarithmic scale.
- A logarithmic scale is used when data are not normally distributed and follow an exponential pattern (1, 2, 4, 8, 16) or a logarithmic pattern (1/2, 1/4, 1/8, …).
- The geometric mean equals the antilog of the average of the logs of the values: $\text{GM} = \operatorname{antilog}\left(\frac{1}{n} \sum \log X_i\right)$
- To calculate the geometric mean:
  1. Calculate the sum of the logarithms of the values.
  2. Calculate the average by dividing the sum of the log values by the number of values.
  3. Take the antilog of this average to get the geometric mean.

Geometric Mean

Sample   Dilution   Titre
1        1:4        4
2        1:256      256
3        1:2        2
4        1:16       16
5        1:64       64
6        1:32       32
7        1:512      512

Geometric Mean
- Calculate the geometric mean (using base-10 logarithms):
  1. Sum of log (4, 256, 2, 16, 64, 32, 512) = 10.536
  2. Average = 10.536 / 7 = 1.505
  3. Antilog of the average = 32
- Accordingly, the geometric mean = 32.
- The geometric mean is important in the statistical analysis of data that follow the distribution described previously, such as a serosurvey where a titre is calculated for different samples.
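The three steps can be reproduced as follows. This sketch assumes base-10 logarithms; any base gives the same geometric mean as long as the antilog uses the same base.

```python
import math

# Titres from the slide
titres = [4, 256, 2, 16, 64, 32, 512]

log_sum = sum(math.log10(t) for t in titres)  # step 1: sum of the logs
log_mean = log_sum / len(titres)              # step 2: average of the logs
geometric_mean = 10 ** log_mean               # step 3: antilog of the average

print(round(log_sum, 3), round(log_mean, 3), round(geometric_mean, 2))
```

Because every titre here is a power of 2 and the exponents average to 5, the result is 2 to the power 5.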

Median
- Median: the value that divides a distribution into two equal parts.
- Arrange the observations in order: 1, 2, 3, 3, 5, 6, 6, 6, 7, 8, 9.
- When the number of observations is odd: position of the median = $\frac{n + 1}{2} = \frac{11 + 1}{2} = 6$
- So the median is the 6th observation = 6.
- The median is the best measure when the data are skewed or there are some extreme values.

Median
- When the number of observations is even: 1, 2, 3, 3, 5, 6, 6, 6, 7, 8.
- Number of observations = 10
- $\text{Median} = \frac{\text{5th observation} + \text{6th observation}}{2} = \frac{5 + 6}{2} = \frac{11}{2} = 5.5$
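Both the odd and the even case can be handled by one small function. The sketch below is one way to write it; the function is written out by hand for clarity, although Python's `statistics.median` does the same job.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # the (n + 1) / 2-th observation
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_set = [5, 3, 9, 7, 1, 3, 6, 8, 2, 6, 6]   # 11 observations
even_set = [1, 2, 3, 3, 5, 6, 6, 6, 7, 8]     # 10 observations

print(median(odd_set), median(even_set))
```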

Mode
- Mode: the most frequent value. In (5, 3, 9, 7, 1, 6, 8, 2, 6, 6), "6" is the most frequent value.
- A bimodal distribution is one with two most frequent values.
- If all values are different, there is no mode.
- The mode is not useful when several values occur equally often in a set.
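A short sketch of these rules: `modes` is a helper name introduced here, and returning an empty list when every value occurs once is an implementation choice made to match the "no mode" rule.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # all values are different: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([5, 3, 9, 7, 1, 6, 8, 2, 6, 6]))  # single mode
print(modes([1, 1, 2, 2, 3]))                  # bimodal
print(modes([1, 2, 3]))                        # no mode
```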

Central Tendencies & Distribution Shape
- The mean is < median when the curve is negatively skewed (tail to the left).
- The mean is > median when the curve is positively skewed (tail to the right).
- The mean, median and mode are equal when the distribution is symmetrical.
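The usual relationship between skew and the two measures (extreme values drag the mean toward the tail, so the mean sits above the median for a right tail and below it for a left tail) can be seen on small made-up data sets. All three lists below are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

right_skewed = [20, 22, 25, 27, 30, 120]   # one very high value (tail to the right)
left_skewed = [1, 90, 93, 95, 97, 100]     # one very low value (tail to the left)
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(name, round(mean(data), 2), median(data))
```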

Measures of Dispersion (Variation) (Indicate spread of values)
They describe the variability of the observations, whether homogeneous or heterogeneous:
1. Range
2. Variance
3. Standard deviation
4. Coefficient of variation
5. Standard error
6. Percentiles & quartiles

Describing Variability: the Range
- The simplest and most obvious way of describing variability: Range = Highest − Lowest
- The range only takes into account the two extreme scores and ignores any values in between.
- To counter this, the distribution is divided into quarters (quartiles): Q1 = 25%, Q2 = 50%, Q3 = 75%.
- The interquartile range: the distance spanned by the middle two quarters (Q3 − Q1).
- The semi-interquartile range: one half of the interquartile range.

Measures of Dispersion (Variation) (Indicate spread of values)
1. Range
- The range is the difference between the largest and the smallest observations.
- Range = maximum − minimum
- Disadvantage: it depends on only two values and doesn't take into account the other observations.

Measures of Dispersion (Variation) (Indicate spread of values)
2. Variance
- It measures the spread of the observations around the mean.
- If the observations are close to their mean, the variance is small; otherwise the variance is large.
- Variance = $S^2 = \frac{\sum (X_i - \bar{X})^2}{n}$, the mean of the squared deviations from the mean

Describing Variability: Deviation
- A more sophisticated measure of variability is one that shows how scores cluster around the mean.
- Deviation is the distance of a score from the mean: $X - \mu$, e.g. $10 - 6.35 = 3.65$, $3 - 6.35 = -3.35$
- A measure representative of the variability of all the scores would be the mean of the deviation scores: add all the deviations and divide by n, $\frac{\sum (X - \mu)}{n}$
- However, the deviation scores add up to zero (as the mean serves as the balance point for the scores).

Describing Variability: Variance
- To remove the +/- signs we simply square each deviation before finding the average. This is called the variance: $\frac{\sum (X - \mu)^2}{n} = \frac{SS}{20} = 5.33$
- The numerator is referred to as the Sum of Squares (SS), as it is the sum of the squared deviations around the mean value.

Describing Variability: Population Variance
- Population variance is designated by $\sigma^2$: $\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{SS}{N}$
- Sample variance is designated by $s^2$.
- Samples are less variable than populations: they therefore give biased estimates of population variability.
- Degrees of freedom (df): the number of independent (free to vary) scores. In a sample, the sample mean must be known before the variance can be calculated, therefore the final score is dependent on the earlier scores: df = n − 1
- $s^2 = \frac{\sum (X - M)^2}{n - 1} = \frac{SS}{n - 1} = \frac{SS}{19} = 5.61$

Describing Variability: the Standard Deviation
- Variance is a measure based on squared distances.
- To get back to the original units, we take the square root of the variance, which gives us the standard deviation.
- Population ($\sigma$) & sample ($s$) standard deviation: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$, $s = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$
- So for our memory score example we simply take the square root of the variance: $s = \sqrt{5.61} = 2.37$
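The difference between dividing by N and dividing by n − 1 can be sketched as below. The memory scores from the slides are not reproduced in this transcript, so the `scores` list is a hypothetical data set, and the function names are introduced here for illustration.

```python
import math

def sum_of_squares(scores):
    """SS: sum of squared deviations around the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

def population_variance(scores):
    return sum_of_squares(scores) / len(scores)        # SS / N

def sample_variance(scores):
    return sum_of_squares(scores) / (len(scores) - 1)  # SS / (n - 1)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores (NOT the slide's data)

ss = sum_of_squares(scores)
pop_sd = math.sqrt(population_variance(scores))
sample_sd = math.sqrt(sample_variance(scores))
print(ss, population_variance(scores), pop_sd, round(sample_sd, 3))
```

Dividing by n − 1 always gives a slightly larger value than dividing by N, which compensates for samples being less variable than the population they come from.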

Measures of Dispersion (Variation)
3. Standard deviation (SD)
- It is the square root of the variance: $S = \sqrt{S^2}$
- Both variance & SD are measures of variation in a set of data. The larger they are, the more heterogeneous the distribution.
- SD is generally preferred over other measures of variation.
- For normally distributed data, about 68% of the observations lie within one SD of the mean and about 95% lie within two SD of the mean.
- If we add or subtract a constant from all observations, the mean changes by the same constant, but the SD does not change.
- If we multiply or divide all the observations by the same constant, both the mean & SD are multiplied or divided by that constant.
- Small SD: the bell is tall & narrow.
- Large SD: the bell is short & broad.
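The effect of adding a constant or multiplying by a constant can be checked numerically. This sketch uses `statistics.pstdev` (which divides by n); the same behaviour holds for the n − 1 version. The constants 10 and 3 are arbitrary choices.

```python
import math
from statistics import mean, pstdev

data = [7, 3, 4, 6]
shifted = [x + 10 for x in data]   # add a constant to every observation
scaled = [x * 3 for x in data]     # multiply every observation by a constant

print(mean(data), round(pstdev(data), 2))
print(mean(shifted), round(pstdev(shifted), 2))   # mean moves by 10, SD unchanged
print(mean(scaled), round(pstdev(scaled), 2))     # mean and SD both times 3
```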

Standard Deviation (SD)
Example: Calculate the SD for this observation set: (7, 3, 4, 6)

Value $X_i$   Deviation from mean $(X_i - \bar{X})$   (Deviation)² $(X_i - \bar{X})^2$
7             2                                        4
3             −2                                       4
4             −1                                       1
6             1                                        1
Total: 20     0                                        10

Mean $\bar{X}$ = 20 / 4 = 5
Mean of (Dev.)² = 10 / 4 = 2.5
SD = $\sqrt{2.5}$ = 1.6
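This worked example takes the mean of the squared deviations (dividing by n = 4), which corresponds to `statistics.pstdev` in Python. For comparison, the sketch also shows `statistics.stdev`, which divides by n − 1 and therefore gives a slightly larger value.

```python
from statistics import pstdev, stdev

observations = [7, 3, 4, 6]

sd_population = pstdev(observations)  # sqrt(10 / 4), as in the worked example
sd_sample = stdev(observations)       # sqrt(10 / 3), the n - 1 version

print(round(sd_population, 1), round(sd_sample, 2))
```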

Measures of Dispersion (Variation)
4. Coefficient of variation
- The C.V expresses the SD as a percentage of the sample mean.
- $\text{C.V} = \frac{\text{SD}}{\text{Mean}} \times 100$
- It is used to compare the relative variation of uncorrelated quantities (e.g. blood glucose & cholesterol level).
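A minimal sketch of the coefficient of variation, assuming the sample SD (n − 1) is used. The glucose and cholesterol readings below are hypothetical values invented for the illustration, and `coefficient_of_variation` is a helper name introduced here.

```python
import math
from statistics import mean, stdev

def coefficient_of_variation(values):
    """SD expressed as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

glucose = [90, 100, 110]        # hypothetical readings
cholesterol = [150, 200, 250]   # hypothetical readings

cv_glucose = coefficient_of_variation(glucose)
cv_cholesterol = coefficient_of_variation(cholesterol)
print(round(cv_glucose, 1), round(cv_cholesterol, 1))
```

Because the C.V is a percentage, the two quantities can be compared even though their means and SDs are on different scales.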

Measures of Dispersion (Variation)
5. Standard error
- The SE measures how precisely the population mean is estimated by the sample mean. The size of the SE depends both on how much variation there is in the population and on the size of the sample.
- $\text{SE} = \frac{\text{SD}}{\sqrt{n}}$
- If the SE is large, the sample mean is not a precise estimate of the population mean.
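The dependence of the SE on sample size can be sketched with the standard formula for the standard error of the mean, SD divided by the square root of n. The SD of 10 and the sample sizes below are hypothetical.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of the sample size."""
    return sd / math.sqrt(n)

# Same variation, different sample sizes (hypothetical values)
se_small_sample = standard_error(10, 25)
se_large_sample = standard_error(10, 100)
print(se_small_sample, se_large_sample)
```

Quadrupling the sample size halves the SE, so larger samples estimate the population mean more precisely.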

Describing Variability
- Describes, as an exact quantitative measure, how spread out or clustered together the scores are.
- Variability is usually defined in terms of distance:
  - How far apart scores are from each other
  - How far apart scores are from the mean
  - How representative a score is of the data set as a whole

Quartiles & Interquartiles
- The age range of this group of 18 students is 55 − 25 = 30 years.
- If the oldest student were not present, the range would have been 45 − 25 = 20 years.
- This means that a single value can give an unrealistically wide range for the group's ages.
- Since we cannot ignore a single value and do not want to give a wrong impression, we estimate the interquartile range.

Quartiles & Interquartiles
[Figure: ordered values divided into first, second, third and fourth quartiles, with the interquartile range marked]
- The values are arranged in ascending order.
- The group is then divided into 4 equal parts, each containing one quarter of the observations.
- In the example below, 18 / 4 = 4.5 individuals.
- The value of the fifth individual is the minimum value of the interquartile range.
- As a general rule, when the result of the division contains a fraction, take the next individual's value (for 4.5, take the value of the fifth).
- Interquartile range = 42 − 32 = 10 years
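The rounding rule can be sketched as follows. The individual ages are not included in this transcript, so the `ages` list is a hypothetical set of 18 values constructed to match the figures quoted on the slides (youngest 25, oldest 55, next oldest 45, quartile values 32 and 42). Applying the same rule to the upper bound (13.5, so take the 14th value) is an assumption, since the slide only states it for the lower bound.

```python
import math

def quartile_value(ordered, quarter):
    """Value at the boundary of the given quarter, rounding a fractional position up."""
    position = quarter * len(ordered) / 4   # e.g. 1 * 18 / 4 = 4.5
    rank = math.ceil(position)              # 4.5 -> take the 5th individual
    return ordered[rank - 1]

# Hypothetical ages of 18 students (NOT the slide's data)
ages = sorted([25, 27, 28, 30, 32, 33, 34, 35, 36, 37, 38, 39, 40, 42, 43, 44, 45, 55])

full_range = ages[-1] - ages[0]
range_without_oldest = ages[-2] - ages[0]
q1 = quartile_value(ages, 1)
q3 = quartile_value(ages, 3)
print(full_range, range_without_oldest, q1, q3, q3 - q1)
```

Other quartile conventions (for example interpolating between neighbouring values) exist and can give slightly different results on the same data.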

Percentiles
- Used when the number of observations is large.
- The values are arranged in ascending order.
- When there are one hundred individuals, the lowest value is the 1st percentile and the highest is the 100th percentile.
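One simple way to compute percentiles consistently with this description is the nearest-rank method, sketched below. The method choice and the 100 hypothetical values are assumptions made for illustration; statistical packages often use interpolating definitions instead.

```python
def percentile(ordered, p):
    """Nearest-rank percentile for an integer p from 1 to 100 on ascending data."""
    n = len(ordered)
    rank = -(-(p * n) // 100)        # ceiling of p * n / 100 using integer arithmetic
    return ordered[max(rank, 1) - 1]

# Hypothetical group of exactly one hundred individuals, already in ascending order
values = list(range(101, 201))

print(percentile(values, 1), percentile(values, 50), percentile(values, 100))
```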