Data Analysis, Presentation, and Statistics

Presentation transcript:

Data Analysis, Presentation, and Statistics Fr Clinic I

Overview
- Tables and Graphs
- Populations and Samples
- Mean, Median, and Standard Deviation
- Standard Error & 95% Confidence Interval (CI)
- Error Bars
- Comparing Means of Two Data Sets
- Linear Regression (LR)

Warning
Statistics is a huge field, and I’ve simplified considerably here. For example:
- Mean, Median, and Standard Deviation
  - There are alternative formulas
- Standard Error and the 95% Confidence Interval
  - There are other ways to calculate CIs (e.g., z statistic instead of t; difference between two means, rather than single mean…)
- Error Bars
  - Don’t go beyond the interpretations I give here!
- Linear Regression
  - We only look at simple LR and only calculate the intercept, slope, and R².
  - There is much more to LR!

Should I Use a Table or Graph?
- Tables
  - Presenting a large amount of different data
  - Comparing multiple characteristics
- Graphs
  - Visual presentation quickly gives information
  - Comparing one or two characteristics
  - Showing trends

Tables
- Consistent format, title, units, big fonts
- Differentiate headings, number columns
Table 1: Average Turbidity and Color of Water Treated by Portable Water Filters

Figures
- Consistent format, title, units
- Good axis titles, big fonts
Figure 1: Turbidity of Pond Water, Treated and Untreated

Graphing Suggestions
- 1, 2, 5 rule: set gradations so the smallest division of the axis is a positive integer power of 10 times 1, 2, or 5.
- Huh? Set your scale up so that the smallest division is an integer increment.
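The 1, 2, 5 rule can be sketched in a few lines of Python. This is only an illustration: the function name `nice_step` and its argument are hypothetical, and the sketch allows any integer power of 10 (so it can also return steps such as 0.5), which is an assumption that goes slightly beyond the wording of the slide.

```python
import math

def nice_step(raw_step):
    """Round raw_step up to the nearest 1, 2, or 5 times a power of 10.

    raw_step is a rough desired axis division (hypothetical input, must be > 0).
    """
    exponent = math.floor(math.log10(raw_step))
    base = 10.0 ** exponent
    # Try 1, 2, 5 at this power of 10, then fall through to the next power.
    for multiplier in (1, 2, 5, 10):
        candidate = multiplier * base
        # Small tolerance so floating-point noise does not skip an exact match.
        if candidate >= raw_step * (1 - 1e-9):
            return candidate

for rough in (1.5, 3, 7, 30):
    print(rough, "->", nice_step(rough))
```

A plotting tool would typically pick the rough step as the axis range divided by the desired number of divisions, then clean it up this way.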

Graphing Suggestions
- Labels
  - All axes should be labeled
  - Include units on the label
- Points, lines, curves
  - Play around with options
  - Color can be your friend
  - Color can be your enemy

Populations and Samples
- Population
  - All of the possible outcomes of an experiment or observation
  - Examples: the US population; a particular type of steel beam
- Sample
  - A finite number of outcomes measured or observations made
  - Examples: 1000 US citizens; 5 beams
- We use samples to estimate population properties
  - Mean, variability (e.g., standard deviation), distribution
  - Example: the heights of 1000 US citizens are used to estimate the mean height of the US population

Mean and Median
Turbidity of Treated Water (NTU): 1, 3, 3, 6, 8, 10
- Mean = sum of values divided by number of samples = (1+3+3+6+8+10)/6 = 5.2 NTU
- Median = the middle number
  - Rank:   1 2 3 4 5 6
  - Number: 1 3 3 6 8 10
  - For an even number of sample points, average the middle two = (3+6)/2 = 4.5 NTU
- Excel: mean – AVERAGE; median – MEDIAN
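The same calculation can be checked outside Excel. The sketch below uses Python's standard `statistics` module on the six turbidity values from the worked example; the variable names are illustrative only.

```python
from statistics import mean, median

# Turbidity of treated water (NTU), the six values used in the worked example.
turbidity_ntu = [1, 3, 3, 6, 8, 10]

sample_mean = mean(turbidity_ntu)      # sum of values / number of values = 31/6
sample_median = median(turbidity_ntu)  # even count: average of the middle two

print(f"mean = {sample_mean:.1f} NTU, median = {sample_median:.1f} NTU")
```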

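Variance
- Measure of variability
- Sum of the squares of the deviations about the mean, divided by the degrees of freedom: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $x_i$ are the data values and $\bar{x}$ is the sample mean
- n = number of data points
- Excel: variance – VAR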
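As a rough check, the sample variance (dividing by n - 1 degrees of freedom, as Excel's VAR does) can be computed by hand and compared with `statistics.variance`. The data are the turbidity values from the mean and median example; the variable names are illustrative.

```python
from statistics import mean, variance

turbidity_ntu = [1, 3, 3, 6, 8, 10]
n = len(turbidity_ntu)
x_bar = mean(turbidity_ntu)

# Sum of squared deviations about the mean, divided by n - 1.
sum_sq_dev = sum((x - x_bar) ** 2 for x in turbidity_ntu)
variance_by_hand = sum_sq_dev / (n - 1)

# statistics.variance is the sample variance (n - 1 in the denominator).
variance_stdlib = variance(turbidity_ntu)

print(f"variance = {variance_by_hand:.2f} NTU^2")
```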

Standard Deviation, s
- Square root of the variance
- For phenomena following a Normal Distribution (bell curve), 95% of population values lie within 1.96 standard deviations of the mean
- Area under the curve is the probability of getting a value within the specified range
- Excel: standard deviation – STDEV
Figure: bell curve with 95% of the area between -1.96 and 1.96 (x-axis: Standard Deviations from Mean)
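Both statements can be checked numerically. The sketch below takes the square root of the sample variance for the turbidity values and uses `statistics.NormalDist` (Python 3.8+) to confirm that about 95% of a normal distribution lies within 1.96 standard deviations of the mean. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, stdev, variance

turbidity_ntu = [1, 3, 3, 6, 8, 10]

# Standard deviation is the square root of the sample variance.
s = stdev(turbidity_ntu)
s_from_variance = sqrt(variance(turbidity_ntu))

# Area under the standard normal curve between -1.96 and +1.96.
standard_normal = NormalDist(mu=0, sigma=1)
fraction_within = standard_normal.cdf(1.96) - standard_normal.cdf(-1.96)

print(f"s = {s:.2f} NTU, fraction within 1.96 sd = {fraction_within:.3f}")
```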

Standard Error of Mean
- Standard deviation of the mean of a sample of size n taken from a population with standard deviation s: $SE = \frac{s}{\sqrt{n}}$
- Estimate of the mean depends on the sample selected
- As n increases, the variance of the mean estimate goes down, i.e., the estimate of the population mean improves
- As n increases, the distribution of the mean estimate approaches normal, regardless of the population distribution
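A minimal sketch of the standard error for the turbidity values, assuming the usual definition (sample standard deviation divided by the square root of n). The second part illustrates, with a hypothetical fixed standard deviation, how the standard error shrinks as n grows.

```python
from math import sqrt
from statistics import stdev

turbidity_ntu = [1, 3, 3, 6, 8, 10]
n = len(turbidity_ntu)

# Standard error of the mean: s / sqrt(n).
standard_error = stdev(turbidity_ntu) / sqrt(n)
print(f"SE = {standard_error:.2f} NTU")

# Hypothetical illustration: same s, larger samples give a smaller SE.
s_fixed = 3.0
for size in (5, 50, 500):
    print(size, round(s_fixed / sqrt(size), 3))
```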

95% Confidence Interval (CI) for Mean
- Interval within which we are 95% confident the true mean lies: $\bar{x} \pm t_{95\%,n-1} \frac{s}{\sqrt{n}}$
- t95%,n-1 is the t-statistic for a 95% CI if sample size = n
  - If n ≥ 30, let t95%,n-1 = 1.96 (Normal Distribution)
  - Otherwise, use the Excel formula TINV(0.05,n-1)
- n = number of data points
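As a worked example, here is the 95% CI for the six turbidity values. Python's standard library has no t distribution, so the sketch hard-codes the two-tailed t value for 5 degrees of freedom from a standard t table (2.571, which is what TINV(0.05, 5) gives to three decimals); that constant and the variable names are assumptions of this sketch.

```python
from math import sqrt
from statistics import mean, stdev

turbidity_ntu = [1, 3, 3, 6, 8, 10]
n = len(turbidity_ntu)

# Two-tailed 95% t value for n - 1 = 5 degrees of freedom, from a t table.
# This must be looked up again for any other sample size.
T_CRIT_DF5 = 2.571

x_bar = mean(turbidity_ntu)
half_width = T_CRIT_DF5 * stdev(turbidity_ntu) / sqrt(n)
ci_low, ci_high = x_bar - half_width, x_bar + half_width

print(f"95% CI: {x_bar:.1f} ± {half_width:.1f} NTU")
```

With only six points the interval is wide, which is why the t value (2.571) is noticeably larger than the 1.96 used for large samples.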

Error Bars
- Show data variability on a plot of mean values
- Types of error bars include:
  - ± Standard Deviation
  - ± Standard Error
  - ± 95% CI
  - Maximum and minimum value

Using Error Bars to Compare Data
- Standard Deviation
  - Demonstrates data variability, but no comparison possible
- Standard Error
  - If bars overlap, any difference in means is not statistically significant
  - If bars do not overlap, indicates nothing!
- 95% Confidence Interval
  - If bars overlap, indicates nothing!
  - If bars do not overlap, difference is statistically significant
- We’ll use the 95% CI
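The 95% CI rule above can be written as a small helper. This is a sketch only: the function names and the example numbers are hypothetical, and the interpretation strings simply restate the rule from the slide.

```python
def bars_overlap(mean_a, half_width_a, mean_b, half_width_b):
    """Return True if the two error bars (mean ± half width) overlap."""
    low_a, high_a = mean_a - half_width_a, mean_a + half_width_a
    low_b, high_b = mean_b - half_width_b, mean_b + half_width_b
    return low_a <= high_b and low_b <= high_a

def interpret_95ci(mean_a, half_width_a, mean_b, half_width_b):
    """Apply the slide's rule for 95% CI error bars."""
    if bars_overlap(mean_a, half_width_a, mean_b, half_width_b):
        return "bars overlap: indicates nothing"
    return "bars do not overlap: difference is statistically significant"

# Hypothetical means and 95% CI half widths for two data sets.
print(interpret_95ci(5.2, 3.6, 12.0, 2.0))
print(interpret_95ci(5.2, 3.6, 9.0, 2.0))
```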

Example 1 Create Bar Chart of Name vs Mean. Right click on data. Select “Format Data Series”.

Example 2

Linear Regression
- Fit the best straight line to a data set
- Right-click on a data point and use the “trendline” option.
- Use the “options” tab to get the equation and R².
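For readers who want to see what a linear trendline computes, here is a minimal sketch using the standard least-squares formulas for slope and intercept. The function name `fit_line` and the sample data are hypothetical, not taken from the slides.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared x deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The fitted line always passes through the point (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data set.
slope, intercept = fit_line([1, 2, 3], [1, 3, 2])
print(f"y = {slope:.2f} x + {intercept:.2f}")
```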

R² - Coefficient of Multiple Determination
- ŷi = predicted y values, from the regression equation
- yi = observed y values
- R² = fraction of variance explained by regression (variance = standard deviation squared)
- R² = 1 if the data lie along a straight line
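The slide's formula did not survive in this transcript. One common form, assumed here, is $R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$, where $\bar{y}$ is the mean of the observed values. The sketch below implements that form; the function name and the data are hypothetical.

```python
def r_squared(observed, predicted):
    """R² = 1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    return 1 - ss_res / ss_tot

# Hypothetical observed values and predictions from the least-squares line
# y = 0.5 x + 1 evaluated at x = 1, 2, 3.
observed = [1, 3, 2]
predicted = [1.5, 2.0, 2.5]
print(f"R² = {r_squared(observed, predicted):.2f}")

# Data lying exactly on a straight line give R² = 1.
print(r_squared([3, 5, 7], [3, 5, 7]))
```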