Answering questions about life with statistics!


The results of many investigations in biology are collected as numbers known as _____________________ data. These numbers can often be better interpreted using statistics such as ______________________ and _______________________. You must use statistical analyses in the Data Collection and Processing sections of your lab reports; simply graphing data is not considered “processing” under IB standards.

Example: In an investigation of the heights of the blades of grass in a field, you measure blades of grass with a ruler. The results are a _____________________________________________. These values must be handled appropriately to give us useful information.

The values have ______________ (e.g. the height of a particular blade of grass is 25 __?_______).

The values have _______________________ (e.g. measuring with a ruler might only be precise to ±1 mm, so the height of a particular blade might be 25 ±1 mm. In this case, it could have been ____________ mm, or _________ mm, or any value in between these, but not more or less than these).

We also express the precision of our values in the number of decimal places that we choose. For example, stating that a blade of grass is 25 mm long is not the same as stating that it is 25.0 mm long. In the first case, it is implicit that the blade could be between ___________ and ____________ mm long. In the second case, the blade could be between _____________ and _______________ mm long.
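The idea that the number of decimal places carries precision can be sketched in Python. This is only an illustration under the usual rounding convention (a reading is taken to lie within half a unit of its last written digit); the function name `implied_interval` and the readings "12" and "12.0" are made up for this sketch and do not come from the worksheet.

```python
from decimal import Decimal

def implied_interval(reading: str):
    """Return the (low, high) range implied by how a reading is written.

    Assumes the reading was rounded to its last written digit, so the
    true value lies within half a unit of that digit.
    """
    value = Decimal(reading)
    # exponent is 0 for "12" and -1 for "12.0" (one decimal place)
    exponent = value.as_tuple().exponent
    half_step = Decimal(1).scaleb(exponent) / 2
    return value - half_step, value + half_step

# Hypothetical readings, written with different numbers of decimal places
whole_mm = implied_interval("12")
tenth_mm = implied_interval("12.0")
print(whole_mm)
print(tenth_mm)
```

The reading is passed as text because a number such as 12.0 loses its trailing zero once it is stored as a float.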

In collecting data, we should:
· ______________________________________________________________
· _______________________________________________________________
· ______________________________________________________________

Biologists usually take a sample
Sometimes, it is possible to measure all of the things that are being considered, for example the heights of all the oak trees in a very small forest. In statistics, the set of all the values that could be considered is called the __________________.

In most investigations, it is not possible, not practical, or not advisable to measure all the values in a population. In these situations, we measure just some of the values, known as a _____________________________.

Example. The heights of the blades of grass in the two fields differ from one another, even inside the same field. We could measure the height of one blade of grass from each of the two fields (a very small sample) and find that the blade of grass from the field with high potassium is longer than the blade of grass from the field with low potassium. However, we would still be unsure whether the difference between the heights of these two blades is due to the field each came from, or is just a difference that occurs anyway within each field.

We could measure the heights of 500 blades of grass in each of the two fields (a larger sample). It is difficult for the human mind to obtain useful information from such a large amount of unprocessed data. However, by using statistics, we can describe the values in various ways that make the information more meaningful for us. Most often, we process the data to estimate an average and to describe the variation in some way.

Populations and samples show variation
In a population, we usually find that not all the values are identical. Instead, there are differences between the values even inside a population. We call this ___________________.

The mean: _____________________________________________________.
We estimate the mean as follows:

$$\bar{x} = \frac{\sum x}{n}$$

where $\sum x$ is the sum of the individual measurements and $n$ is the number of measurements in the sample.

Example. We have measured the heights of 500 blades of grass from each field. The mean height of the grass in the sample from the field with high potassium is 56.2 mm and the mean height of the grass in the sample from the field with low potassium is 48.5 mm.

Do the data show that the difference between the means of our samples really represents a difference between the two populations, or is it possibly due to random variation in the population?
______________________________________ ___________________________________________________________________

To help decide this we need to examine:

The range: __________________________________________________
Range = ____________________________________________________________

The standard deviation: ________________________________________ _____________________________________________________________
Sx = ________________________________________________
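As a rough illustration of how the mean, range and standard deviation describe a sample, here is a short Python sketch. The two lists of blade heights are invented for this example (the worksheet does not give the raw measurements), and `describe` is a hypothetical helper name. `statistics.stdev` returns the sample standard deviation, which corresponds to Sx.

```python
import statistics

# Hypothetical blade heights in mm; NOT the worksheet's real data
high_potassium = [52, 55, 56, 58, 60]
low_potassium = [40, 45, 49, 52, 59]

def describe(sample):
    """Summarise a sample by its mean, range and sample standard deviation."""
    return {
        "mean": statistics.mean(sample),
        "range": max(sample) - min(sample),  # largest value minus smallest
        "sd": statistics.stdev(sample),      # sample standard deviation (Sx)
    }

high_summary = describe(high_potassium)
low_summary = describe(low_potassium)
print(high_summary)
print(low_summary)
```

In this made-up data the low-potassium sample has both a larger range and a larger standard deviation, so its values are more spread out around the mean.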

Error bars
In many charts and graphs, we show the mean values of our samples. Such a chart or graph can clearly show the differences between our conditions, and trends may become apparent. Consider the examples shown on page 3 in the Heinemann textbook. What trends and relationships do these show?

An error bar is _________________________________________________________ _______________________________________________________________________

Comparison of graphs: Note that the standard deviation graph has removed the extremes of variation, called ____________________________. The range graph, with its extreme values, may mislead us into thinking that the data are similar.

Example. We have measured the lengths of 969 blades of grass in a field with a medium level of soil phosphorus. We have then grouped the data into a table, which shows how many blades of grass had a particular length. We call this kind of table a frequency distribution table.

[Table: length of blade of grass (mm) vs. number of blades of grass]

The normal distribution
Very often in biology, the variation that we find in our samples follows a so-called normal distribution (sometimes also called a bell curve). Most, ______________%, of the values are quite close to the mean, rather fewer are somewhat greater or somewhat less than the mean, and just a few, ________________%, are much greater or much less than the mean.

Calculate the mean, mode (the value which appears most frequently), median (the value at which ½ the data are greater and ½ are lesser), n (the sample number) and the range of the data below. Graph the data on a separate piece of graph paper.
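The five quantities asked for above (mean, mode, median, n and range) can be checked with Python's standard `statistics` module. The data set here is a small invented one, used only to show the calls, since the worksheet's own data are not reproduced here.

```python
import statistics

# Hypothetical measurements; NOT the worksheet's data
data = [3, 5, 5, 6, 7, 8, 8, 8, 10]

n = len(data)                        # the sample number
mean = statistics.mean(data)         # sum of the values divided by n
mode = statistics.mode(data)         # the value that appears most often
median = statistics.median(data)     # the middle value of the sorted data
data_range = max(data) - min(data)   # largest value minus smallest value

print(n, mean, mode, median, data_range)
```

For a perfectly symmetrical bell-shaped sample the mean, mode and median would coincide; in this small skewed example they do not.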

Some special characteristics of the normal distribution
· _______________________________________________________________
· ____________________________________________________________________
· _____________________________________________________________________
· __________________________________________________________________
· ____________________________________________________________________

Calculating standard deviation with the formula and with the TI-84 graphing calculator!
Calculate the Sx for the following data points: X = 2, 3, 4. No calculator needed.

Let’s calculate the Sx for the heights in our class in centimeters using the TI-83/84. Hint: 1 inch = 2.54 cm.

Press STAT, choose EDIT and press ENTER. Enter the data in L1, pressing ENTER between each value. Then press STAT, move to CALC, choose 1-Var Stats and press ENTER …. Ta da!

click4biology.info/c4b/1/stat1.htm
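For comparison with the calculator, the sketch below mimics a few of the values that 1-Var Stats reports. The function name `one_var_stats` and the class heights are hypothetical. Note the distinction between Sx (the sample standard deviation, dividing by n - 1) and σx (the population standard deviation, dividing by n); the worksheet uses Sx.

```python
import statistics

INCH_TO_CM = 2.54  # conversion given in the hint above

def one_var_stats(values):
    """Return a few of the summary values shown by 1-Var Stats."""
    return {
        "mean": statistics.mean(values),
        "Sx": statistics.stdev(values),        # sample SD, divides by n - 1
        "sigma_x": statistics.pstdev(values),  # population SD, divides by n
        "n": len(values),
    }

# The no-calculator warm-up: X = 2, 3, 4
warm_up = one_var_stats([2, 3, 4])
print(warm_up)

# Hypothetical class heights in inches, converted to centimeters
heights_in = [60, 64, 66, 70]
heights_cm = [h * INCH_TO_CM for h in heights_in]
class_stats = one_var_stats(heights_cm)
print(class_stats)
```

For X = 2, 3, 4 the deviations from the mean are -1, 0 and 1, so the sum of squares is 2, and dividing by n - 1 = 2 gives a variance of 1.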

Example. In an investigation of the two fields, we found that the mean height of the blades of grass in the sample from the field with high potassium was 56.2 mm and the mean height of the blades of grass from the field with low potassium was 48.5 mm. We have thus found a difference between the means of our two samples. However, we need to determine whether this difference is significant. That is, does it indicate a true difference in the means of the populations of the two fields?

Using the standard deviation to indicate possible significance
A ___________ difference between the means of samples, and __________ standard deviations for these samples, indicates that it is likely that the difference between the means is statistically significant.

A ___________ difference between the means of samples, and _________ standard deviations for these samples, indicates that it is likely that the difference between these means is not statistically significant.

Confidence levels
It is seldom possible to say with complete certainty (100 % confidence) that the difference between sample means is significant. Most often in biology, we decide that we want to be ________% confident that the difference between the samples is significant. This means that there is only a _______% chance that the samples could be as different as they are because of chance, and not because of a real difference between the populations. We could also say that the probability (p) that chance alone produced the difference between our sample means is 5 % (p = 0.05).

Standard deviation is: ______________________________________________ __________________________________________________________________

To calculate Sx with your TI-83/84, go to the following website for step-by-step directions. Eventually, you must be able to perform this function.

The t-test
The t-test determines whether the difference observed between the means of two samples is significant, at a chosen confidence level (0.05).

Briefly, the test works as follows:
· A value of t is calculated from the data with your TI-83/84.
· If the calculated value for t is _________________ the required value for t, the difference between the means is _______________________ at this confidence level (0.05).
· If the calculated value for t is smaller than the required value for t, the difference between the means is _________________________ at this confidence level.

It is VERY unlikely that the mean height of our two samples will be exactly the same.

C1 sample: average height = 162 cm
D8 sample: average height = 168 cm

Is the difference in average height of the samples large enough to be significant?

We can analyse the spread of the heights of the students in the samples by drawing histograms.

[Histograms: frequency vs. height (cm) for the C1 sample and the D8 sample]

Here, the ranges of the two samples have a small overlap, so the difference between the means of the two samples IS probably significant.

[Histograms: frequency vs. height (cm) for the C1 sample and the D8 sample]

Here, the ranges of the two samples have a large overlap, so the difference between the two samples may NOT be significant. The difference in means is possibly due to random sampling error.

To decide if there is a significant difference between two samples we must compare the mean height for each sample and the spread of heights in each sample. Statisticians calculate the standard deviation of a sample as a measure of the spread of a sample.

You can calculate standard deviation using the formula:

$$S_x = \sqrt{\frac{\sum x^2 - \dfrac{(\sum x)^2}{n}}{n - 1}}$$

Where:
$S_x$ is the standard deviation of the sample
$\sum$ stands for ‘sum of’
$x$ stands for the individual measurements in the sample
$n$ is the number of individuals in the sample
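The standard deviation formula for Sx can be written out term by term in Python. This is a minimal sketch: `sample_sd` is a hypothetical name, the heights are invented, and the result is compared against the standard library's `statistics.stdev` as a sanity check.

```python
import math
import statistics

def sample_sd(xs):
    """Sample standard deviation from the sums of x and x squared."""
    n = len(xs)
    sum_x = sum(xs)                     # sum of the measurements
    sum_x2 = sum(x * x for x in xs)     # sum of the squared measurements
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Hypothetical heights in mm; NOT data from the worksheet
heights = [48, 52, 55, 57, 63]
sd_formula = sample_sd(heights)
sd_library = statistics.stdev(heights)
print(sd_formula, sd_library)
```

Both values should agree, because this form of the formula is algebraically the same as summing the squared deviations from the mean and dividing by n - 1.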

Student’s t-test
The Student’s t-test compares the averages and standard deviations of two samples to see if there is a significant difference between them.

We start by calculating a number, t. It can be calculated using the equation:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Where:
$\bar{x}_1$ is the mean of sample 1
$s_1$ is the standard deviation of sample 1
$n_1$ is the number of individuals in sample 1
$\bar{x}_2$ is the mean of sample 2
$s_2$ is the standard deviation of sample 2
$n_2$ is the number of individuals in sample 2
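The equation for t translates directly into a few lines of Python. This is a sketch, with `t_statistic` as a hypothetical name and two tiny invented samples chosen so the arithmetic can be followed by hand.

```python
import math
import statistics

def t_statistic(sample1, sample2):
    """t from the two means, sample standard deviations and sample sizes."""
    mean1, mean2 = statistics.mean(sample1), statistics.mean(sample2)
    s1, s2 = statistics.stdev(sample1), statistics.stdev(sample2)
    n1, n2 = len(sample1), len(sample2)
    return (mean1 - mean2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# Hypothetical samples; NOT the C1 and D8 heights
sample_a = [1, 2, 3, 4, 5]
sample_b = [3, 4, 5, 6, 7]
t = t_statistic(sample_a, sample_b)
print(t)
```

Here the means are 3 and 5, each sample has a variance of 2.5, and each variance divided by n = 5 gives 0.5, so the denominator is 1 and t comes out at about -2. The sign only shows which sample has the larger mean; it is the size of t that is compared with the critical value.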

Worked Example: Random samples were taken of pupils in C1 and D8. Their recorded heights are shown below.

[Table: student height (cm) for students in C1 and students in D8]

Step 1: Add the data to List 1 and List 2 in your TI-83+. Press STAT, choose EDIT and press ENTER, then put the data into L1 and L2.

Step 2: Use the two-sample t-test to calculate t for the difference between the means of the 2 sets of data. Press STAT, move to TESTS, choose 2-SampTTest (#4), scroll down to CALCULATE and press ENTER.

Step 3: Calculate the degrees of freedom.
Degrees of freedom = $n_1 + n_2 - 2$ = 28 d.f.

Step 4: Find the critical value of t for the relevant number of degrees of freedom using the Biology T-Table (page 7). Use the 95% (p = 0.05) confidence limit (or critical value).

Critical value =

Our calculated value of t is below the critical value for 28 d.f.; therefore, there is no significant difference between the heights of students in the samples from C1 and D8.
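The decision in Steps 3 and 4 can be sketched as a small Python function. Several things here are assumptions rather than content from the worksheet: the function names, the sample sizes of 15 and 15 (chosen only because they give 28 degrees of freedom), the calculated t values, and the short lookup table, whose entries are two-tailed p = 0.05 critical values as printed in standard t-tables rather than values read from the Biology T-Table itself.

```python
# Two-tailed critical values of t at p = 0.05, as found in standard t-tables.
# Assumption: the Biology T-Table lists the same values for these d.f.
T_CRITICAL_P05 = {10: 2.228, 20: 2.086, 28: 2.048, 30: 2.042}

def degrees_of_freedom(n1, n2):
    """Degrees of freedom for comparing two samples: n1 + n2 - 2."""
    return n1 + n2 - 2

def is_significant(t, df, table=T_CRITICAL_P05):
    """True if the size of t is larger than the critical value for df."""
    return abs(t) > table[df]

# Hypothetical sample sizes that happen to give 28 d.f.
df = degrees_of_freedom(15, 15)

# Hypothetical calculated t values; NOT the worksheet's result
print(df, is_significant(1.50, df))  # below the critical value
print(df, is_significant(2.50, df))  # above the critical value
```

Using the absolute value of t means the order in which the two samples are entered does not change the conclusion.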

Example. The number of hours a week that a group of students spent training was compared with the fastest speed that they could run. A trend was found. With increasing time spent training, the faster the student could run.

Example. The number of cigarettes that a group of students smoked each week was compared with the speed that they could run. A trend was found. With increasing number of cigarettes smoked each week, the slower the student could run.

Example. The speed at which a group of students could solve mathematical tasks was compared with the speed that they could run. No trend was found. With increasing speed in solving mathematical tasks, no consistent change was found in the speed at which the students could run.

If an increase in one factor is associated with an increase in another factor, we say that there is a _____________________ correlation.
If an increase in one factor is associated with a decrease in another factor, we say that there is a ________________________ correlation.
If an increase in a factor is not associated with a consistent change in another factor, we say that there is _____________ correlation between them.

Correlation & Causation
* Correlation does NOT necessarily indicate causation.
* Why not? __________________________________________________
* ______________________________________________________________________

For example: _____________________________________________ __________________________________________________________
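One common way to put a number on the three kinds of trend described in the examples above is Pearson's correlation coefficient, r, which runs from -1 to +1. The worksheet does not name this statistic, so treat the sketch below as an optional illustration; `pearson_r` is a hypothetical helper and every data value is invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for five students
training_hours = [1, 2, 3, 4, 5]
cigarettes = [1, 2, 3, 4, 5]
maths_speed = [1, 2, 3, 4, 5]

speed_vs_training = [2, 4, 6, 8, 10]   # rises as training rises
speed_vs_cigarettes = [10, 8, 6, 4, 2] # falls as smoking rises
speed_vs_maths = [2, 1, 3, 1, 2]       # no consistent change

r_training = pearson_r(training_hours, speed_vs_training)
r_cigarettes = pearson_r(cigarettes, speed_vs_cigarettes)
r_maths = pearson_r(maths_speed, speed_vs_maths)
print(r_training, r_cigarettes, r_maths)
```

A value of r near +1 or -1 only describes how closely the two factors move together; it says nothing about whether one causes the other.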