Descriptive Statistics Used in Biology. It is rarely practical for scientists to measure every event or individual in a population. Instead, they typically.

Slides:



Advertisements
Similar presentations
SUMMARIZING DATA: Measures of variation Measure of Dispersion (variation) is the measure of extent of deviation of individual value from the central value.
Advertisements

Calculating & Reporting Healthcare Statistics
Statistics Intro Univariate Analysis Central Tendency Dispersion.
Statistics Intro Univariate Analysis Central Tendency Dispersion.
Biostatistics Unit 2 Descriptive Biostatistics 1.
Measures of Variability
Business Math, Eighth Edition Cleaves/Hobbs © 2009 Pearson Education, Inc. Upper Saddle River, NJ All Rights Reserved 7.1 Measures of Central Tendency.
Standard Error for AP Biology
12.3 – Measures of Dispersion
Measures of Variability: Range, Variance, and Standard Deviation
Answering questions about life with statistics ! The results of many investigations in biology are collected as numbers known as _____________________.
6 - 1 Basic Univariate Statistics Chapter Basic Statistics A statistic is a number, computed from sample data, such as a mean or variance. The.
Describing distributions with numbers
Statistical Analysis Statistical Analysis
6.1 What is Statistics? Definition: Statistics – science of collecting, analyzing, and interpreting data in such a way that the conclusions can be objectively.
Chapter 11: Estimation Estimation Defined Confidence Levels
Estimation of Statistical Parameters
BUS250 Seminar 4. Mean: the arithmetic average of a set of data or sum of the values divided by the number of values. Median: the middle value of a data.
Understanding Numerical Data
Beak of the Finch Natural Selection Statistical Analysis.
Statistics Measures Chapter 15 Sections
Describing distributions with numbers
Measures of central tendency are statistics that express the most typical or average scores in a distribution These measures are: The Mode The Median.
PCB 3043L - General Ecology Data Analysis. OUTLINE Organizing an ecological study Basic sampling terminology Statistical analysis of data –Why use statistics?
 Statistics The Baaaasics. “For most biologists, statistics is just a useful tool, like a microscope, and knowing the detailed mathematical basis of.
Descriptive Statistics: Presenting and Describing Data.
Describing Quantitative Data Numerically Symmetric Distributions Mean, Variance, and Standard Deviation.
Understanding Your Data Set Statistics are used to describe data sets Gives us a metric in place of a graph What are some types of statistics used to describe.
1 Descriptive Statistics 2-1 Overview 2-2 Summarizing Data with Frequency Tables 2-3 Pictures of Data 2-4 Measures of Center 2-5 Measures of Variation.
Chapter Eight: Using Statistics to Answer Questions.
PCB 3043L - General Ecology Data Analysis. PCB 3043L - General Ecology Data Analysis.
PCB 3043L - General Ecology Data Analysis.
Describing Samples Based on Chapter 3 of Gotelli & Ellison (2004) and Chapter 4 of D. Heath (1995). An Introduction to Experimental Design and Statistics.
Descriptive Statistics for one Variable. Variables and measurements A variable is a characteristic of an individual or object in which the researcher.
Descriptive Statistics for one variable. Statistics has two major chapters: Descriptive Statistics Inferential statistics.
CHAPTER 2: Basic Summary Statistics
SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS. SAMPLING AND SAMPLING VARIATION Sample Knowledge of students No. of red blood cells in a person Length of.
SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS. SAMPLING AND SAMPLING VARIATION Sample Knowledge of students No. of red blood cells in a person Length of.
Understanding Your Data Set Statistics are used to describe data sets Gives us a metric in place of a graph What are some types of statistics used to describe.
Chapter 6: Descriptive Statistics. Learning Objectives Describe statistical measures used in descriptive statistics Compute measures of central tendency.
Describing Data: Summary Measures. Identifying the Scale of Measurement Before you analyze the data, identify the measurement scale for each variable.
PCB 3043L - General Ecology Data Analysis Organizing an ecological study What is the aim of the study? What is the main question being asked? What are.
AP Biology Intro to Statistics
Chapter 1: Exploring Data
Standard Error for AP Biology
PCB 3043L - General Ecology Data Analysis.
Standard Error for AP Biology
AP Biology Intro to Statistics
How do we categorize and make sense of data?
Summary descriptive statistics: means and standard deviations:
Standard Error for AP Biology
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Standard Deviation & Standard Error
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Chapter 1: Exploring Data
CHAPTER 2: Basic Summary Statistics
Chapter 1: Exploring Data
Chapter Nine: Using Statistics to Answer Questions
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Advanced Algebra Unit 1 Vocabulary
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Presentation transcript:

Descriptive Statistics Used in Biology

It is rarely practical for scientists to measure every event or individual in a population. Instead, they typically collect data on a sample of a population and use these data to draw conclusions (or make inferences) about the entire population. Statistics is a mathematical discipline that relates to this type of analysis.

One of your first steps in analyzing a small data set is to graph the data and examine the distribution. Here are two graphs of beak measurements taken from two samples of medium ground finches that lived on the island of Daphne Major, one of the Galápagos Islands, during a major drought in 1977.

The measurements tend to be more or less symmetrically distributed across a range, with most measurements around the center of the distribution. This is a characteristic of a normal distribution.

Measures of Average: Mean, Median, and Mode A description of a group of observations can include a value for the mean, median, or mode. These are all measures of central tendency—in other words, they represent a number close to the center of the distribution.

The Mean You calculate the mean (also referred to as the average or arithmetic mean) by summing all the data points in a data set (ΣX) and then dividing this number by the total number of data points (N): What scientists want to understand is the mean of the entire population, which is represented by µ. They use the sample mean, represented by ̅, as an estimate of µ.

Calculate the mean for two subsets of the Grants’ Data Nonsurvivors Survivors 5-bird sample15-bird sample5-bird sample15-bird sample Bird ID # Beak Depth (mm) Bird ID # Beak Depth (mm) Bird ID # Beak Depth (mm) Bird ID # Beak Depth (mm) Mean

Note that the mean values are different for the five and fifteen-samples. Which is a better estimate of the true mean, µ?

Median When the data are ordered from the largest to the smallest, the median is the midpoint of the data. It is not distorted by extreme values, or even when the distribution is not normal. For this reason, it may be more useful for you to use the median as the main descriptive statistic for a sample of data in which some of the measurements are extremely large or extremely small. Find the median for each of your four sets of finch data.

Range The simplest measure of variability in a sample of normally distributed data is the range, which is the difference between the largest and smallest values in a set of data. You can use the range for data that are not normally distributed. For any data, a larger range value indicates a greater spread of the data—in other words, the larger the range, the greater the variability. An extremely large or small value in the data set will make the variability appear high. Calculate the range for your four samples of the Grants’ data. The standard deviation provides a more reliable measure of the “true” spread of the data.

Definitions for Median and Range on BRT

Standard Deviation and Variance The standard deviation is the most widely used measure of variability. The sample standard deviation (s) is essentially the average of the deviation between each measurement in the sample and the sample mean (). The sample standard deviation is an estimate of the standard deviation in the larger population.

The formula for calculating the sample standard deviation follows:

What does standard deviation indicate? If a population has a normal distribution, 68% of the sample should be within one standard deviation of the mean. Approximately 2 standard deviations should account for 95% of all samples.

Variance and Standard Deviation Note that the number calculated at this step provides a statistic called variance (s 2 ). Variance is an important measure of variability that is used in certain statistical methods. It is the square of the standard variation. 6. Take the square root to calculate the standard deviation (s) for the sample. Calculate the standard deviation for the two five-bird samples (survivors and non-survivors)

Results Note that the standard deviation is smaller for the larger samples. This is often, but not always the case.. Now let’s look at a larger data set:

Data for 50 Finches Four variables, including beak depth Calculate the mean for each variable Variance is given; you can calculate the standard deviation easily Enter in table

Fill in everything except 95% confidence interval

Measures of Confidence: Standard Error of the Mean and 95% Confidence Interval The standard deviation provides a measure of the spread of the data from the mean. A different type of statistic reveals the uncertainty in the calculation of the mean. The sample mean is not necessarily identical to the mean of the entire population. Every time you take a sample and calculate a sample mean, you would expect a slightly different value. In other words, the sample means themselves have variability.

Standard Error of the Mean This variability can be expressed by calculating the standard error of the mean (abbreviated as̅ SEM) or the 95% confidence interval (95% CI).

Why are SEM and 95% CL useful? The standard error of the mean represents the standard deviation of a distribution around the mean and estimates how close the sample mean is to the population mean. The greater the sample size (i.e., 50 rather than 15 or 5 finches), the more closely the sample mean will estimate the population mean, and therefore the standard error of the mean becomes smaller. The 95% confidence interval (95% CI) is equivalent to 1.96 (typically rounded to 2) standard errors of the mean. Because the sample means are assumed to be normally distributed, 95% of all sample means should fall between 2 standard deviations above and below the population mean, estimated by 95% CI.

Calculate the 95% Confidence Interval for each variable mean. Both SEM̅ and 95% CI can be illustrated as error bars in a bar graph of the means of two or more samples that are being compared. Depicting SEM or the 95% CI as error bars in a bar graph provides a clear visual clue to the uncertainty of the calculations of the sample means.

Make a Bar Graph with Error Bars On a sheet of graph paper, construct four bar graphs that compare the means of non-survivors and survivors for each physical characteristic (wing length, body mass, tarsus length, and beak size). Label both axes of each graph and show the 95% CI as error bars. Once you complete your four bar graphs, describe any differences between non-survivors and survivors you observe in each graph.

What have we learned? Measurements often cluster around the mean is a form known as a normal distribution. The average variability about the mean is known as the standard deviation of the mean. Since it usually not possible to sample every individual in a population, there is also variability in the measurement of the mean. If the variation of the mean follows a normal distribution, we can estimate how close our sample mean is to the actual mean. 95% confidence intervals allow us to compare sample means of different groups and infer statistically significant differences.

Observation: seeds of weeds seem to need light as well as water to germinate You will investigate this observation using garden seeds of various species You have the following: lots of seeds, petri dishes, filter paper, water Design an experiment to address the observation. Each petri dish should have exactly 30 seeds; each group should use at least 10 petri dishes We will collect germination data and analyze it next week. A written report, in scientific format, will be completed for homework in your lab notebook…

Observation: seeds of weeds seem to need light as well as water to germinate You will investigate this observation using dill weed (Anethum graveolens) You have the following: lots of dill seeds, petri dishes, filter paper, water Design an experiment to address the observation. Each petri dish should have exactly 30 seeds; each group should use at least 6 petri dishes We will collect germination data and analyze it next week. A written report, in scientific format, will be completed for homework…