Quantitative Skills: Data Analysis and Graphing.


Data analysis is one of the first steps toward determining whether an observed pattern has validity. Data analysis also helps distinguish among multiple working hypotheses. AP Biology Quantitative Skills Manual

Most of the data you will collect will fit into two categories: measurements or counts.

Most measurements are continuous, meaning there is an infinite number of potential measurements over a given range.

Count data are recordings of discrete data: tallies of how many items fall into a category (for example, the number of leaf stomata or the number of white-eyed individuals).

When an investigation involves measurement data, one of the first steps is to construct a histogram, or frequency diagram, to represent the data’s distribution.

If the data show an approximate normal distribution on a histogram, then they are parametric data (normal).

If the data do not show an approximate normal distribution on a histogram, then they are nonparametric data. Different descriptive statistics and tests need to be applied to those data.

Sometimes, due to sampling bias, data might not fit a normal distribution even when the actual population could be normally distributed. In this case, a larger sample size might be needed.

For parametric data (a normal distribution), the appropriate descriptive statistics include the mean (average), sample size, variance, standard deviation, and standard error.

The mean (x̄) of the sample is the average. The mean summarizes the entire sample and might provide an estimate of the entire population’s true mean.

The sample size (n) refers to how many members of the population are included in the study. Sample size is important when estimating how well the sample set represents the entire population.

Variance (s²) and standard deviation (s) measure how far a data set is spread out. A variance of zero indicates that all the values in a data set are identical.

Because the differences from the mean are squared to calculate variance, the units of variance are not the same units as in the original data set. The standard deviation is the square root of the variance. The standard deviation is expressed in the same units as the original data set, which makes it generally more useful than the variance.

A small standard deviation indicates that the data tend to be very close to the mean. A large standard deviation indicates that the data are very spread out away from the mean.
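
The relationship between the mean, the variance, and the standard deviation can be sketched in a few lines of Python. This is only an illustration: the function name `describe` and the leaf widths are made up, and the variance divides by n − 1 (the usual sample-variance convention), which the slides do not state explicitly.

```python
import math

def describe(sample):
    """Return (mean, sample variance, sample standard deviation)."""
    n = len(sample)
    mean = sum(sample) / n
    # Squared distances from the mean, divided by n - 1 (assumed convention).
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
    # The square root brings the result back to the original units.
    return mean, variance, math.sqrt(variance)

# Hypothetical leaf widths in cm (illustrative values only).
widths_cm = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, variance, sd = describe(widths_cm)
print(f"mean = {mean:.2f} cm, variance = {variance:.2f} cm^2, sd = {sd:.2f} cm")
```

Note that the variance comes out in squared units (cm²) while the standard deviation is back in cm, which is why the standard deviation is usually the more convenient of the two.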

A little more than two-thirds of the data points will fall between +1 standard deviation and −1 standard deviation from the sample mean. More than 95% of the data fall within ±2 standard deviations of the sample mean.

68–95–99.7 Rule (http://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule): In a normal distribution, 68.27% of all values lie within one standard deviation of the mean, 95.45% lie within two standard deviations of the mean, and 99.73% lie within three standard deviations of the mean.
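
These percentages can be checked without any data, because for a normal distribution the fraction of values within k standard deviations of the mean equals erf(k / √2). The short sketch below uses Python's standard `math.erf`; the helper name `fraction_within` is hypothetical.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k) * 100:.2f}%")
```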

Sample standard error (SE) is a statistic used to make an inference about how well the sample mean matches up to the true population mean.

Standard error should be represented by including error bars on graphs when appropriate. Error bars are used on graphs to indicate the uncertainty of a reported measurement.
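
The slides do not spell out how the standard error is calculated. The standard formula is the sample standard deviation divided by the square root of the sample size, and the sketch below applies it to made-up measurements. The function name `standard_error` and the data are assumptions for illustration.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical measurements (illustrative only).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean = statistics.mean(sample)
se = standard_error(sample)
# An error bar on a graph would span mean - se to mean + se.
print(f"mean = {mean:.2f}, SE = {se:.2f}, error bar = [{mean - se:.2f}, {mean + se:.2f}]")
```

Because n sits under a square root in the denominator, quadrupling the sample size only halves the standard error.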

Different statistical tools are used for data that do not resemble a normal distribution (nonparametric data, or data that are skewed or include large outliers): median, mode, quartiles, and box-and-whisker plots.

The median is the value separating the higher half of a data sample from the lower half. To find the median of a data set, first arrange the data in order from lowest to highest value and then select the value in the middle. Example: 5, 1, 3, 7, 2 is ordered as 1, 2, 3, 5, 7, so the median is 3.

If there are two values in the middle of an ordered data set, the median is found by averaging those two values. Example: 5, 1, 3, 7, 4, 2 is ordered as 1, 2, 3, 4, 5, 7, so the median is (3 + 4) / 2 = 3.5.

The mode is the value that appears most frequently in a data set. Example: in 3, 5, 1, 3, 7, 2, the mode is 3 because it appears more frequently than any other number.
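
The three small data sets used above for the median and the mode can be reproduced with Python's standard `statistics` module. The variable names are illustrative only.

```python
import statistics

odd_median = statistics.median([5, 1, 3, 7, 2])       # odd count: the middle value
even_median = statistics.median([5, 1, 3, 7, 4, 2])   # even count: mean of the two middle values
mode = statistics.mode([3, 5, 1, 3, 7, 2])            # most frequent value

print(odd_median, even_median, mode)  # 3 3.5 3
```

`statistics.median` sorts the data internally, so the values do not need to be ordered first.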

Figure: a bimodal distribution.

Data Analysis Flowchart: Type of Data
Measurement data (continuous): make a histogram.
  Normal distribution: parametric. Use mean, standard deviation, standard error.
  Not a normal distribution: nonparametric. Use median, mode, quartiles.
Count data (discrete).
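
The flowchart can also be read as a small decision function. The sketch below is one possible encoding, not part of the manual: the function name, the string labels, and the choice to return an empty list for count data (the flowchart lists no descriptive statistics for that branch) are all assumptions.

```python
def descriptive_statistics_for(data_kind, looks_normal=None):
    """Suggest descriptive statistics following the flowchart.

    data_kind: "measurement" or "count" (hypothetical labels).
    looks_normal: for measurement data, whether the histogram looks
    approximately normal.
    """
    if data_kind == "count":
        # The flowchart names no descriptive statistics for count data.
        return []
    if data_kind != "measurement":
        raise ValueError("data_kind must be 'measurement' or 'count'")
    if looks_normal is None:
        raise ValueError("make a histogram first, then pass looks_normal")
    if looks_normal:
        return ["mean", "standard deviation", "standard error"]
    return ["median", "mode", "quartiles"]

print(descriptive_statistics_for("measurement", looks_normal=True))
print(descriptive_statistics_for("measurement", looks_normal=False))
```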

Example of Data Analysis: Do shady English ivy leaves have a larger surface area than sunny English ivy leaves?

Since the data collected are in centimeters, they are measurement data, not count data. So the first step is to make a histogram.

Does the data resemble a normal curve? (Close enough, with possible differences due to sampling error.)

Next, the appropriate statistical tools are applied.

A bar graph can then be produced to compare the means.

Do the error bars for the shady leaf mean overlap with the error bars for the sunny leaf mean?

A more rigorous statistical test will need to be performed, but because the error bars do not overlap there is a high probability that the two populations are indeed different from each other.
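
The overlap check described here amounts to comparing two intervals, each running from mean − SE to mean + SE. A minimal sketch follows; the function name and the numbers are hypothetical, since the manual's actual leaf measurements are not reproduced in this transcript.

```python
def error_bars_overlap(mean_a, se_a, mean_b, se_b):
    """True if the intervals mean ± SE of the two groups share any value."""
    low_a, high_a = mean_a - se_a, mean_a + se_a
    low_b, high_b = mean_b - se_b, mean_b + se_b
    return low_a <= high_b and low_b <= high_a

# Hypothetical (mean, SE) pairs, not the manual's values.
shady = (7.4, 0.3)
sunny = (6.0, 0.4)
print(error_bars_overlap(*shady, *sunny))  # False
```

As the text notes, non-overlapping error bars are only a first indication; a formal statistical test is still needed to support the conclusion.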

Example of Data Analysis: Is 98.6°F actually the average body temperature for humans? The data are actually from a sample data set prepared by Allen Shoemaker (Shoemaker, 1996). This particular data set has been modified from the results of a study published in the Journal of the American Medical Association (Mackowiak, Wasserman, and Levine, 1992).

Since the data collected are in degrees Fahrenheit, they are measurement data, not count data. So the first step is to make a histogram.

Does the data resemble a normal curve? (Close enough.)

Next, the appropriate statistical tools are applied. *Note that by convention, descriptive statistics are rounded to one more decimal place than the original data points.

According to the 68–95–99.7 Rule, about 68% of all values lie within one standard deviation of the mean. This means that around 68% of the temperatures should be between 97.51 and 98.99°F.
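
The quoted range is simply the sample mean plus or minus one standard deviation. In the sketch below the mean of 98.25°F is the value reported for this sample, while the standard deviation of 0.74°F is an assumption inferred from the quoted range, because the transcript does not include the statistics table.

```python
mean_f = 98.25   # sample mean reported for this data set, in °F
sd_f = 0.74      # assumption: inferred from the quoted 97.51 to 98.99 range

low, high = mean_f - sd_f, mean_f + sd_f
print(f"about 68% of temperatures between {low:.2f} and {high:.2f} °F")
```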

Including the standard error, we can say with about 68% confidence that the true mean human body temperature lies within 98.25 ± 0.06°F, that is, within one standard error of our sample mean.

Categories of data: Qualitative data is not numerical and is usually subjective. Quantitative data is numerical (for example, 1.75 mL) and lends itself to statistical analysis. http://ibscrewed4maths.blogspot.com/2011/03/types-of-data.html

Quantitative data can be either discrete or continuous. Discrete data take separate, distinct values, such as integers, or bucket categories such as “red” or “tall”. Continuous data can take an infinite number of values and form a continuum.

Which graph shows continuous data and which graph shows discrete data? Graph A Graph B

One of the first steps in data analysis is to create graphical displays of the data. Visual displays can make it easy to see patterns and can clarify how two variables affect each other.

Line Graphs: Used when data on both scales of the graph (the x and y axes) are continuous. The dots indicate measurements that were actually made.

Basic Traits of a Good Graph: 1. A Good Title. A good title is one that tells exactly what information the author is trying to present with the graph. Examples: “Relation Between Study Time and Score on a Biology Exam in 2011” or “Study Time vs. Score on a Biology Exam in 2011”. AP® Biology Investigative Labs: An Inquiry-Based Approach

Basic Traits of a Good Graph: Axes should be consistently numbered. Axes should contain labels, including units.

Basic Traits of a Good Graph: The independent variable is always shown on the x axis. The dependent variable is always shown on the y axis.

Extrapolation is a prediction of what the chart might look like beyond the measured set of data. A broken line is used, indicating that this is a prediction and not data actually collected.

The slope of a line indicates the rate at which the variables being graphed are changing: $m = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$, that is, $\text{slope} = \frac{\text{rise}}{\text{run}}$.
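
The rise-over-run calculation can be written as a small function. The name `slope`, its argument order, and the example points are illustrative choices.

```python
def slope(x1, y1, x2, y2):
    """Rise over run between two points; undefined for a vertical line."""
    if x2 == x1:
        raise ValueError("slope is undefined when x1 == x2")
    return (y2 - y1) / (x2 - x1)

print(slope(0, 1, 2, 5))   # positive slope: 2.0
print(slope(0, 4, 2, 0))   # negative slope: -2.0
print(slope(0, 3, 5, 3))   # zero slope: 0.0
```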

Figure labels: positive slope (rate increasing), negative slope (rate decreasing), zero slope (constant rate). A marked break indicates that some values were skipped.

Line charts can be plotted with multiple data sets, allowing for better comparison. Such charts make use of a legend.

Effective graphs use statistics as an essential part of the display. Statistics is the study of the collection, organization, analysis, interpretation and presentation of data.

Population vs. Sample: Often, researchers want to know things about a population (N), but it may not be feasible to obtain data for every member of an entire population. A sample (n) is a smaller group of members of a population selected to represent the population. The sample must be random.
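
Drawing a random sample can be illustrated with Python's standard `random` module. The population of 500 numbered individuals, the sample size of 30, and the fixed seed are all assumptions chosen only to make the sketch concrete and repeatable.

```python
import random

# Hypothetical population: N = 500 individuals identified by number.
population = list(range(1, 501))

rng = random.Random(42)                  # fixed seed only so the sketch is repeatable
sample = rng.sample(population, k=30)    # n = 30, drawn without replacement

print(f"N = {len(population)}, n = {len(sample)}")
```

Every member of the population has the same chance of being chosen, which is what "random" means here.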

Descriptive statistics and graphical displays allow us to estimate how well sample data represent the true population.

If a sample is not collected randomly, it may not closely reflect the original population. This is called sampling bias.

A normal distribution, also known as a “bell curve” or “normal curve”, can be formed with continuous data.

The type of data being collected during an investigation should be determined before performing the actual experiment. The type of data will determine the statistical analyses that can be used.

Three types of data: parametric data (data that fit a normal curve); nonparametric data (data that do not fit a normal curve); and frequency or count data (generated by counting).

Normal or parametric data: measurement data that fit a normal curve or distribution. Data are continuous, generally in decimal form.

Nonparametric data: do not fit a normal distribution, may include large outliers, or may be count data that can be ordered. Can be qualitative data. http://www.e-cigarette-forum.com/forum/medical-research/254529-effect-electronic-nicotine-delivery-device-e-cigarette-smoking-reduction-cessation.html

Frequency or count data: generated by counting how many of an item fit into a category. Can be data that are collected as percentages.

Two types of descriptive statistics: comparative statistics, which compare variables, and association statistics, which look for correlations between variables.

Comparative statistics compare phenomena, events, or populations (Is A different from B?). Typical graphs by data type: parametric data (normal data), bar graph; nonparametric data, box-and-whisker plot; frequency data (counts), bar graph or pie chart.

Association statistics look for associations between variables (How are A and B correlated?). Typical graph for both parametric and nonparametric data: scatterplot.

Types of graphs commonly used with the three data types and suggested statistical tests. Source: Redrawn from “Statistics for AS Biology,” available as part of a download at: http://www.heckmondwikegrammar.net/index.php?highlight=introduction&p=10310

Bar Graphs: Used to visually compare two samples of categorical or count data. Are also used to visually compare the calculated means, with error bars, of normal data.

Sample standard error bars (also known as the standard error of the sample mean) are the notations at the top of each shaded bar that show the sample standard error (SE).

Scatterplots: Used when comparing one measured variable against another. Used when looking for trends.

If the relationship is thought to be linear, a linear regression line or best fit line can be plotted to help define the pattern.
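
One common way to obtain a best fit line is ordinary least squares, sketched below. The slides do not say which fitting method is intended, so this choice is an assumption. The function name and the study-time data are also made up for illustration.

```python
def best_fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical study time (hours) and exam score data.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 78, 89]
slope, intercept = best_fit_line(hours, scores)
print(f"score = {slope:.1f} * hours + {intercept:.1f}")
```

The fitted line always passes through the point (mean of x, mean of y), which is a quick sanity check when plotting it over a scatterplot.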

Box-and-Whisker Plots: Allow graphical comparison of two samples of nonparametric data (data that do not fit a normal distribution).

In a box-and-whisker graph, the ticks at the tops and bottoms of the vertical lines show the highest and lowest values in the dataset, respectively. The top of each box shows the upper quartile, the bottom of each box shows the lower quartile, and the horizontal line represents the median.
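
The five values that define such a plot (lowest value, lower quartile, median, upper quartile, highest value) can be computed with Python's standard `statistics.quantiles`. Several quartile conventions exist. The sketch uses the module's "inclusive" method as an assumption, and other tools may give slightly different quartiles for the same data. The function name and the data are illustrative.

```python
import statistics

def five_number_summary(sample):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    # method="inclusive" treats the sample as including its own min and max.
    q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return min(sample), q1, median, q3, max(sample)

# Hypothetical data set.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(five_number_summary(data))
```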

Histograms (Frequency Diagrams): Used to display the distribution of data, providing a representation of the central tendencies and the spread of data.

Creating a histogram requires setting up bins, which are uniform range intervals that cover the entire range of the data. Then the number of measurements that fit in each bin is counted and graphed.
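
The binning step can be sketched as follows. The function name, the bin width, the starting value, and the leaf widths are all hypothetical, and the sketch assumes every value is at least `start`.

```python
def histogram_counts(values, bin_width, start):
    """Count values in consecutive bins [start + i*w, start + (i+1)*w).

    Assumes every value is >= start.
    """
    n_bins = int((max(values) - start) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // bin_width)] += 1
    return counts

# Hypothetical leaf widths in cm.
widths = [2.1, 2.4, 3.0, 3.3, 3.7, 3.9, 4.2, 4.8, 5.5]
counts = histogram_counts(widths, bin_width=1.0, start=2.0)

# Print a simple text histogram, one row per bin.
for i, count in enumerate(counts):
    low = 2.0 + i * 1.0
    print(f"{low:.1f}-{low + 1.0:.1f} cm: {'#' * count}")
```

Each bin includes its lower edge and excludes its upper edge, so a value of exactly 3.0 is counted in the 3.0-4.0 bin.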

If the data on a histogram show an approximate normal distribution, then these are parametric data. If the data do not approximate a normal distribution then they are nonparametric data.

References: AP® Biology Investigative Labs: An Inquiry-Based Approach and AP® Biology Quantitative Skills: A Guide for Teachers