Statistical Analysis Topic 1. Statistics 1.1.1 State that error bars are a graphical representation of the variability of data. 1.1.2 Calculate the mean.

Presentation transcript:

Statistical Analysis Topic 1

Statistics
1.1.1 State that error bars are a graphical representation of the variability of data.
1.1.2 Calculate the mean and standard deviation of a set of values.
1.1.3 State that the term standard deviation is used to summarize the spread of values around the mean, and that 68% of values fall within one standard deviation of the mean.

1.1.4 Explain how the standard deviation is useful for comparing the means and spread of data between two or more samples.
1.1.5 Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables.
1.1.6 Explain that the existence of a correlation does not establish that there is a causal relationship between two variables.

What is data?
Information, in the form of facts or figures obtained from experiments or surveys, used as a basis for making calculations or drawing conclusions (Encarta dictionary)

2 types of Data
Qualitative
Quantitative

Statistics in Science
Data can be collected about a population (surveys)
Data can be collected about a process (experimentation)

Qualitative Data
Information that relates to characteristics or description (observable qualities)
Information is often grouped by descriptive category
Examples: species of plant, type of insect, shades of color, rank of flavor in taste testing
Remember: qualitative data can be “scored” and evaluated numerically

Qualitative data, manipulated numerically Survey results, teens and need for environmental action

Quantitative data
Quantitative: measured using a naturally occurring numerical scale
Examples: chemical concentration, temperature, length, weight, etc.

Quantitation Measurements are often displayed graphically

Quantitation = Measurement
In data collection for Biology, data must be measured carefully, using laboratory equipment (e.g. timers, metre sticks, pH meters, balances, pipettes, etc.)
The limits of the equipment used add some uncertainty to the data collected. All equipment has a certain magnitude of uncertainty. For example, is a ruler that is mass-produced a good measure of 1 cm? 1 mm? 0.1 mm?
For quantitative testing, you must indicate the level of uncertainty of the tool that you are using for measurement!

Finding the level of uncertainty
As a “rule of thumb”, if not specified, use +/- 1/2 of the smallest measurement unit (e.g. a metric ruler is lined to 1 mm, so the limit of uncertainty of the ruler is +/- 0.5 mm).
If the room temperature is read as 25 degrees C with a thermometer that is scored at 1 degree intervals, what is the range of possible temperatures for the room? (Answer: +/- 0.5 degrees Celsius. If you read 25 °C, it may in fact be anywhere between 24.5 and 25.5 degrees.)
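The half-division rule of thumb can be sketched in a few lines of Python. The function name and arguments are illustrative assumptions, not part of the original slides.

```python
def reading_range(reading, smallest_division):
    """Return (low, high) for a reading, using +/- half the smallest scale division."""
    half = smallest_division / 2
    return reading - half, reading + half


# Thermometer scored at 1 degree intervals, reading 25 degrees C.
low, high = reading_range(25.0, 1.0)
print(f"A reading of 25 degrees C may lie between {low} and {high} degrees C")
```

The same call works for the ruler example: `reading_range(length_mm, 1.0)` gives the length +/- 0.5 mm.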

Definition of statistics
Branch of mathematics which allows us to sample small portions from habitats, communities, or biological populations, and draw conclusions about the larger population.
Statistics measure the differences and relationships between sets of data
Nothing is 100% certain in science

Mean
An average of data points
Central tendency of the data
Find the mean of the given data³:
Country              # of reported HIV cases
Argentina            27517
Bahamas              4548
Canada               19468
Dominican Republic   7167
Ecuador              6297
Answer: (27517 + 4548 + 19468 + 7167 + 6297) / 5 = 64997 / 5 = 12999.4

Range
A measure of the spread of data
Difference between the largest and the smallest observed values
Find the range of the given data:
Country              # of reported HIV cases
Argentina            27517
Bahamas              4548
Canada               19468
Dominican Republic   7167
Ecuador              6297
Answer: 27517 - 4548 = 22969
If one data point were unusually large or unusually small, it would have a great effect on the range. Such points are called outliers.
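The mean and range of the reported HIV cases in the table can be checked with a short Python sketch. The variable names are illustrative; the figures are the ones given on the slide.

```python
# Reported HIV cases per country, as listed on the slide.
hiv_cases = {
    "Argentina": 27517,
    "Bahamas": 4548,
    "Canada": 19468,
    "Dominican Republic": 7167,
    "Ecuador": 6297,
}

values = list(hiv_cases.values())
mean_cases = sum(values) / len(values)    # sum of the values divided by how many there are
range_cases = max(values) - min(values)   # largest value minus smallest value

print("Mean:", mean_cases)
print("Range:", range_cases)
```

Dropping Argentina from the dictionary and rerunning shows how strongly a single large value pulls on the range.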

Looking at Data
How accurate is the data? (How close are the data to the “real” results?) This is also considered as BIAS
How precise is the data? (All test systems have some uncertainty, due to limits of measurement)
Estimation of the limits of the experimental uncertainty is essential.

Comparing Averages Once the 2 averages are calculated for each set of data, the average values can be plotted together on a graph, to visualize the relationship between the 2

Drawing error bars The simplest way to draw an error bar is to use the mean as the central point, and to use the distance of the measurement that is furthest from the average as the endpoints of the data bar

[Figure labels: average value; value farthest from average; calculated distance]
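The simple error bar described above (mean at the centre, bar length set by the measurement farthest from the mean) could be computed as in the sketch below. The function name and the sample heights are hypothetical.

```python
def range_error_bar(values):
    """Return (mean, half_length), where half_length is the distance
    from the mean to the measurement farthest from it."""
    mean = sum(values) / len(values)
    half_length = max(abs(v - mean) for v in values)
    return mean, half_length


heights = [4.0, 5.0, 6.0, 9.0]  # hypothetical measurements
mean, half_length = range_error_bar(heights)
print(f"Plot the point at {mean} with an error bar of +/- {half_length}")
```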

What do error bars suggest? If the bars show extensive overlap, it is likely that there is not a significant difference between those values

Error bars
Graphical representation of the variability of data
Can be used to show either the range of data or the standard deviation on a graph

Standard deviation
A measure of how the individual observations of a data set are dispersed or spread out around the mean.
Determined by a mathematical formula which is programmed into your calculator
In a normal distribution, about 68% of all values lie within ±1 standard deviation of the mean. This rises to about 95% for ±2 standard deviations from the mean.

How is Standard Deviation calculated? With this formula!
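The formula itself does not appear in this transcript. Assuming the slide used the sample ("n-1") standard deviation, which is the version the Excel =STDEV instructions refer to, it can be written as follows, where x_i are the individual values, x-bar is their mean and n is the number of values:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}{n - 1}}
```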

How to calculate SD
TI-86: stat/StatTI-86.html
TI-83 and 84: stat/StatTI-83.html
In Microsoft Excel, type the following code into the cell where you want the Standard Deviation result, using the "unbiased," or "n-1" method: =STDEV(A1:A30) (substitute the cell name of the first value in your dataset for A1, and the cell name of the last value for A30.)
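For readers without a TI calculator or Excel, the same "n-1" standard deviation can be obtained in Python. The sketch below computes it by hand and compares the result with `statistics.stdev` from the standard library; the data values are hypothetical.

```python
import math
import statistics


def sample_sd(values):
    """Sample ("n-1") standard deviation computed step by step."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
print("By hand:         ", sample_sd(data))
print("statistics.stdev:", statistics.stdev(data))
```

Both lines should print the same number, since `statistics.stdev` also divides by n-1.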

Comparing the means and standard deviation between two or more samples
[Table: height of bean plants in the sunlight (cm, ±0.1 cm) and height of bean plants in the shade (cm, ±0.1 cm)]
Total: 1300
Mean: 1300/10 = 130 cm

Answers
SD for sunlight data: cm
SD for shade data: cm
Wide variation makes us question experimental design
Means alone are not sufficient

A typical normal distribution curve

According to this curve:
One standard deviation away from the mean in either direction on the horizontal axis (the red area on the preceding graph) accounts for somewhere around 68 percent of the data in this group.
Two standard deviations away from the mean (the red and green areas) account for roughly 95 percent of the data.

Three Standard Deviations?
Three standard deviations (the red, green and blue areas) account for about 99.7 percent of the data
[Figure axis labels: -3 SD, -2 SD, ±1 SD, +2 SD, +3 SD]
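The 68, 95 and 99.7 percent figures follow from the normal distribution: the fraction of values within k standard deviations of the mean is erf(k / √2). A minimal check using only Python's standard library (the function name is illustrative):

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within +/-{k} SD: {fraction_within(k):.1%}")
```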

NRT Example
100 tests taken
Grades plotted on a graph
Graph likely to be a bell curve
When data points are clustered together, the standard deviation is small; when they are spread apart, the standard deviation is large

How is SD useful?
Many extremes = large SD
Few extremes = small SD

Coefficient of Variation (V)
Ratio of the standard deviation to the mean expressed as a percentage
V = (100 × SD)/Mean
Gives similar information about the data as the SD, but some people might find percentages easier to understand
From Stats for IB Sports Medicine
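A small Python sketch of V = (100 × SD)/Mean, using the sample standard deviation. Both data sets are hypothetical and were chosen so that one is simply ten times the other.

```python
import statistics


def coefficient_of_variation(values):
    """V = 100 * SD / mean, expressed as a percentage (sample SD assumed)."""
    return 100 * statistics.stdev(values) / statistics.mean(values)


small_scale = [10.0, 12.0, 14.0]     # hypothetical: mean 12, SD 2
large_scale = [100.0, 120.0, 140.0]  # hypothetical: mean 120, SD 20

print(f"V (small scale): {coefficient_of_variation(small_scale):.2f}%")
print(f"V (large scale): {coefficient_of_variation(large_scale):.2f}%")
```

The two SDs differ by a factor of ten, yet V is the same for both sets, which is why V is convenient when comparing spread between data measured on different scales.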

Coefficient of Variation
Example: Comparing oxygen uptake data between individuals at rest and after 20 minutes of exercise for 12 participants and 24 measurements taken
After rest: Mean = ± 35.66, V = 9.31%
Exercise: Mean = ± 23.42, V = 5.82%
T = 1.194, p = 0.21

Significant difference between two data sets using the t-test
T-test compares two sets of data to see if chance alone could make a difference
Scientists like to be at least 95% certain of their findings before drawing conclusions
Mean, SD, and sample size are used to calculate the value of t
Degrees of freedom = sum of sample sizes of each of the two groups minus 2
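The slides do not give the formula for t. One common version that matches the stated degrees of freedom (n1 + n2 - 2) is Student's pooled-variance t-test, sketched below as an assumption rather than as the method the slides used. The function name and input numbers are hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# Hypothetical summary statistics for two groups of 10 measurements each.
t, df = pooled_t(12.0, 2.0, 10, 10.0, 2.0, 10)
print(f"t = {t:.3f} with {df} degrees of freedom")
```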

T-test calculation
For all data values: est1.cfm
For means: ources/calculators/ttest.html

Worked example Compare two groups of barnacles living on a rocky shore. Measure the width of their shells to see if a significant size difference is found depending on how close they live to the water. One group lives between 0 and 10 metres from the water level. The second group lives between 10 and 20 metres above the water level.

Measurement was taken of the width of the shells in millimetres. 15 shells were measured from each group. The mean of the group closer to the water indicates that living closer to the water causes the barnacles to have a larger shell. If the value of t is 2.25, is that a significant difference?

Steps to determining significant difference when given value of t
Determine the degrees of freedom (sum of the two sample sizes minus 2). Ex. 15 + 15 - 2 = 28
Use the given value of t. Ex. t = 2.25
Use a table of t values to determine the probability (p) that chance alone produced the difference. Ex. p = 0.05 or 5%
The confidence level is 95%. Ex. We are 95% confident that the difference between barnacles is significant. Barnacles living nearer the water have a significantly larger shell than those living 10 metres or more away from the water.
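The table lookup in the steps above can be mimicked in Python. The critical values below are the two-tailed entries for 28 degrees of freedom from a standard t table; the dictionary and function names are illustrative assumptions.

```python
# Two-tailed critical values of t for 28 degrees of freedom (standard t table).
CRITICAL_T_DF28 = {0.10: 1.701, 0.05: 2.048, 0.02: 2.467, 0.01: 2.763}


def is_significant(t, critical_values, p=0.05):
    """True if |t| exceeds the critical value at probability level p."""
    return abs(t) > critical_values[p]


degrees_of_freedom = 15 + 15 - 2   # two groups of 15 barnacles
t_value = 2.25                     # value given in the worked example

print("Degrees of freedom:", degrees_of_freedom)
print("Significant at p = 0.05:", is_significant(t_value, CRITICAL_T_DF28))
print("Significant at p = 0.02:", is_significant(t_value, CRITICAL_T_DF28, p=0.02))
```

Since 2.25 lies between the 0.05 and 0.02 critical values, the difference passes the 95% threshold but would not pass a stricter 98% one.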

T table
One-tailed t-test: if your hypothesis is that one mean is either larger or smaller than the other
Two-tailed t-test: if your hypothesis is that the two means are not equal (not specifying larger or smaller)

Correlation does not mean causation
Experiments provide a test which shows cause
Observations without an experiment can only show a correlation

Correlation test
Correlation signified by value of r:
+1 (completely positive correlation)
0 (no correlation)
-1 (completely negative correlation)
/strand4/scatterplot.htm
Note that r describes linear relationships
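The slides do not say how r is calculated. Assuming r is the usual Pearson correlation coefficient, a minimal Python sketch is given below with hypothetical data for each of the three cases listed above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [-2, -1, 0, 1, 2]                                          # hypothetical data
print("rising line: ", pearson_r(x, [2 * v + 1 for v in x]))   # close to +1
print("falling line:", pearson_r(x, [-3 * v for v in x]))      # close to -1
print("curve y=x^2: ", pearson_r(x, [v ** 2 for v in x]))      # close to 0
```

The last case shows why r only describes linear relationships: y depends entirely on x, yet r is zero because the relationship is not a straight line.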

Correlation or causation?
1. Cars with low gas mileage per gallon of fuel cause global warming.
2. Drinking red wine protects against heart disease.
3. Tanning beds can cause skin cancer.
4. UV rays increase the risk of cataracts.
5. Vitamin C cures the common cold.

Resources
¹ ated/Facts.asp#src1
² ated/Consumption.asp
³ eFiles/generalIncludeFiles/listInstances.asp
Stephen Taylor, Bandung International School