1
Finding Geometric Means Using SAS
Better Than Average: Finding Geometric Means Using SAS
2
What are we trying to say with an “average”?
Expected Value
3
Common Types of “Averages”
Median: Middle element of ordered data
Mode: Value most often seen in a data set
Main advantages: Not influenced by extreme values; can be used for any type of distribution
Main disadvantage: Insensitive; gives no information about your distribution
4
Common Types of “Averages”
Arithmetic Mean: Calculated from the sum of values divided by the number of values in a data set
Main advantages: Easy to understand and uses all observations; can give information about your distribution
Main disadvantages: Easily skewed by outliers and inaccurate if working with non-Normal data
5
What is a geometric mean?
Geometric Mean: Calculated by taking the nth root of the product of n positive observations in a data set
Main advantages: Precise, and less influenced by extreme values than the arithmetic mean
Main disadvantages: More difficult to understand; all values must be positive (non-zero)
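Written out, the definition above and its equivalent log form look like this (the symbols $x_i$ and $n$ are notation introduced here, not taken from the slide):

```latex
\[
\mathrm{GM} = \left( \prod_{i=1}^{n} x_i \right)^{1/n}
            = \exp\!\left( \frac{1}{n} \sum_{i=1}^{n} \ln x_i \right),
\qquad x_i > 0 .
\]
```

The log form is why the geometric mean can be obtained by averaging log-transformed values and exponentiating the result.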
6
Geometric Series: Each number increases by the same proportion (3)
(3, 9, 27, 81, 243)
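As a quick check on this series, a minimal sketch using a DATA _NULL_ step (the variable names geo_mean and arith_mean are chosen here for illustration) compares the geometric and arithmetic means:

```sas
/* Sketch: compare the two means for the series on the slide. */
data _null_;
  geo_mean   = geomean(3, 9, 27, 81, 243); /* 5th root of 3**15, i.e. about 27 */
  arith_mean = mean(3, 9, 27, 81, 243);    /* 363 / 5 = 72.6 */
  put geo_mean= arith_mean=;               /* writes both values to the log */
run;
```

The geometric mean lands on the middle term of the series, while the arithmetic mean is pulled toward the largest value.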
7
When should I use the geometric mean?
Non-Normal/skewed data
Ratios or proportions/scaled data
Small sample sizes
8
Bioassays/dose-response curves
Population growth
Compounding interest
Decay rates
Scaled bioequivalence
Survival analysis
9
...but data is messy! Use your judgement in selecting your “average.”
10
What if my data contains zeroes?
0 × n = 0: a single zero makes the product, and therefore the geometric mean, zero. Options:
Adjust your scale so that you add 1 to every number in the data set, and then subtract 1 from the resulting geometric mean.
Ignore zeros or missing data in your calculations.
Convert zeros to a very small number (often called “below the detection limit”) that is less than the next smallest number in the data set.
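The first two options can be sketched in a single DATA step. The data set name zero_demo, the helper array s1-s5, and the single data row are hypothetical and exist only for illustration:

```sas
/* Sketch with hypothetical data: one row containing a zero. */
data zero_demo;
  input var1-var5;
  array v{5} var1-var5;
  array s{5} s1-s5;

  /* Option 1: add 1 to every value, then subtract 1 from the result. */
  do i = 1 to 5;
    s{i} = v{i} + 1;
  end;
  gm_plus1 = geomean(of s1-s5) - 1;

  /* Option 2: ignore zeros by setting them to missing, which GEOMEAN skips. */
  do i = 1 to 5;
    if v{i} = 0 then s{i} = .;
    else s{i} = v{i};
  end;
  gm_nozero = geomean(of s1-s5);

  drop i s1-s5;
  datalines;
0 1 3 7 15
;
run;

proc print data=zero_demo noobs;
run;
```

For this row the shifted values are 1, 2, 4, 8, 16, so option 1 gives about 3, while option 2 gives the fourth root of 315. The two choices do not agree, which is one reason to report which adjustment was used.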
11
What if my data contains negative numbers? $\sqrt{-1} = i$
If all values are negative, simply convert all values to positive numbers before calculating the geometric mean. Then assign the resulting geometric mean a negative value.
If your data set contains both positive and negative values, you will have to separate them and find the geometric mean for each group. You can then take the weighted average of the individual geometric means to find the total geometric mean for the full data set.
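A minimal sketch of the all-negative case, using three hypothetical values (x1-x3 and geo_mean_neg are names introduced here):

```sas
/* Sketch: geometric mean of all-negative values via absolute values. */
data _null_;
  x1 = -2; x2 = -8; x3 = -4;                          /* hypothetical data */
  geo_mean_neg = -geomean(abs(x1), abs(x2), abs(x3)); /* -(2*8*4)**(1/3), about -4 */
  put geo_mean_neg=;                                  /* writes the value to the log */
run;
```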
12
How can I use SAS to compute geometric means?
Geomean() or Geomeanz() functions
PROC SURVEYMEANS
Manual calculations
13
Finding the geometric mean for an observation/row: Geomean() or Geomeanz() Functions
Returns the geometric mean of a numeric constant, variable, or expression
If any arguments are negative, the result is a missing value
If any arguments are zero, the result is zero
Fuzzes the values of arguments that are extremely small and approximately zero; if you do not want this, use the geomeanz() function
Skips missing values
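These rules can be seen with a few literal arguments. This is a sketch only; the variable names are made up for illustration:

```sas
/* Sketch: how GEOMEAN treats missing, zero, and negative arguments. */
data _null_;
  skip_missing = geomean(2, 8, .);   /* missing value skipped: sqrt(2*8), about 4 */
  with_zero    = geomean(0, 5, 10);  /* any zero argument gives 0 */
  with_neg     = geomean(-1, 4);     /* negative argument: result is missing, with a note in the log */
  put skip_missing= with_zero= with_neg=;
run;
```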
14
DATA my_data;
  input studyid var1 var2 var3 var4 var5;
  geometric_mean = geomean(of var1-var5); *Calculates geometric mean;
  datalines;
;
run;
PROC PRINT data=my_data;
  id studyid;
run;
Note that you can use “OF” for a list of variables
15
Finding the geometric mean for a population/column: PROC SURVEYMEANS
The geomean option within PROC SURVEYMEANS returns the geometric mean of the specified variables. Values must be positive (non-zero).
You can also request confidence limits for the geometric mean:
GMCLM requests the 2-sided confidence limits
LGMCLM requests the 1-sided lower confidence limit
UGMCLM requests the 1-sided upper confidence limit
16
PROC SURVEYMEANS data=my_data geomean;
  var var1 var2 var3 var4 var5;
run;
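To also get the standard error and two-sided confidence limits, the statistic keywords can be listed together. The sketch below assumes my_data from the earlier DATA step contains strictly positive values; alpha=0.05 is written out only to make the 95% level explicit:

```sas
/* Sketch: geometric mean with its standard error and 2-sided confidence limits. */
PROC SURVEYMEANS data=my_data alpha=0.05 geomean gmstderr gmclm;
  var var1 var2 var3 var4 var5;
run;
```

With no STRATA, CLUSTER, or WEIGHT statements, the procedure treats the data as a simple random sample.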
17
Which measure of variation should I use?
Standard error: how precise is the calculation of the geometric mean?
Standard deviation: how spread out is the data around the geometric mean?
Coefficient of variation: how does the variation for this geometric mean compare with another data set?
18
Geometric Standard Error
Output by default in PROC SURVEYMEANS
19
Geometric Standard Deviation
1. Find the natural log of your variable using the log() function in the DATA step:
DATA my_data2;
  set my_data;
  ln_var1 = log(var1); *Calculates the natural log of variable 1;
run;
20
Geometric Standard Deviation
2. Use PROC MEANS to find the arithmetic mean and standard deviation of your newly log-transformed variable:
PROC MEANS data=my_data2 mean stddev; *Specifies output;
  var ln_var1;
  output out=meansout mean=a_mean stddev=a_stddev; *Creates new data set;
run;
21
Geometric Standard Deviation
3. Exponentiate the arithmetic mean and standard deviation to find the geometric mean and geometric standard deviation, using the EXP() function in the DATA step:
DATA my_data3;
  set meansout;
  geo_mean = exp(a_mean); *Converts to geometric mean;
  geo_stddev = exp(a_stddev); *Converts to geometric standard deviation;
run;
PROC PRINT data=my_data3 noobs;
  var geo_mean geo_stddev;
run;
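As a cross-check on the three steps, the same two quantities can be sketched in a single PROC SQL query. This is an alternative illustration rather than the presenter's method, and it assumes my_data exists and var1 is strictly positive:

```sas
/* Sketch: geometric mean and geometric SD of var1 in one query. */
PROC SQL;
  select exp(mean(log(var1))) as geo_mean,
         exp(std(log(var1)))  as geo_stddev
  from my_data;
QUIT;
```

The STD summary function here uses the sample (n - 1) standard deviation, which matches the PROC MEANS default, so the results should agree with the step-by-step approach.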
22
Applying the Geometric Standard Deviation
Geometric Mean ± Geometric Standard Deviation = INCORRECT!
23
Applying the Geometric Standard Deviation
The geometric standard deviation is multiplicative, NOT additive:
Lower bound = geometric mean ÷ geometric standard deviation
Upper bound = geometric mean × geometric standard deviation
In the slide's example, the lower bound for one geometric standard deviation is 91.24.
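A minimal sketch of the multiplicative bounds, assuming the data set my_data3 from step 3 (with geo_mean and geo_stddev) already exists. The data set name gsd_bounds and the two bound variables are hypothetical:

```sas
/* Sketch: one-geometric-SD range around the geometric mean. */
DATA gsd_bounds;
  set my_data3;
  lower_bound = geo_mean / geo_stddev; /* divide, do not subtract */
  upper_bound = geo_mean * geo_stddev; /* multiply, do not add */
run;
PROC PRINT data=gsd_bounds noobs;
  var geo_mean geo_stddev lower_bound upper_bound;
run;
```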
24
Geometric Coefficient of Variation
$\text{Geometric CV} = \text{Geometric SD}^{\,1/\text{Geometric Mean}}$
Raise the geometric standard deviation to the power of the reciprocal of the geometric mean in the DATA step:
DATA my_data4;
  set my_data3;
  geo_cv = geo_stddev**(1/geo_mean); *Calculates geometric CV;
run;
PROC PRINT data=my_data4 noobs;
  var geo_cv;
run;
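For comparison, a definition of the geometric coefficient of variation that is widely used for lognormal data works from the standard deviation of the log-transformed values (a_stddev in step 2) rather than from the formula on this slide. It is shown here as something readers may encounter elsewhere, not as the presenter's method; $s_{\ln}$ is notation introduced here:

```latex
\[
\mathrm{GCV} = \sqrt{\exp\!\left(s_{\ln}^{2}\right) - 1},
\qquad s_{\ln} = \ln(\mathrm{GSD}) .
\]
```

The two formulas generally give different numbers, so it is worth stating which definition a reported geometric CV follows.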
25
To sum up:
Use geometric means for data that is lognormal or uses ratios or proportions
Make sure your values are non-zero and positive
Use your judgement in choosing the mean to express your expected values when working with messy data
26
Questions?
27
Contact Information
Name: Kimberly Roenfeldt
Company: Henry M Jackson Foundation for the Advancement of Military Medicine
City/State: San Diego, CA
Phone: (619)