Statistics: Data Presentation & Analysis Fr Clinic I.


Overview
– Tables & Graphs
– Populations & Samples
– Mean, Median, & Variance
– Error Bars: Standard Deviation, Standard Error & 95% Confidence Interval (CI)
– Comparing Means of Two Populations
– Linear Regression (LR)

Warning
Statistics is a huge field, and I've simplified considerably here. For example:
– Mean, Median, and Standard Deviation: there are alternative formulas.
– 95% Confidence Interval: there are other ways to calculate CIs (e.g., the z statistic instead of t; the difference between two means rather than a single mean).
– Error Bars: don't go beyond the interpretations I give here!
– Comparing Means of Two Data Sets: we cover only the t test for two means when the variances are unknown but equal; there are other tests.
– Linear Regression: we look only at simple LR and calculate only the intercept, slope, and R². There is much more to LR!

Tables
Table 1: Average Turbidity and Color of Water Treated by Portable Water Filters
– Consistent format, title, units, big fonts
– Differentiate headings, number columns

Figures
Figure 1: Turbidity of Pond Water, Treated and Untreated
– Consistent format, title, units
– Good axis titles, big fonts

Populations and Samples
Population: all possible outcomes of an experiment or observation
– US population
– Particular type of steel beam
Sample: finite number of outcomes measured or observations made
– 1000 US citizens
– 5 beams
Use samples to estimate population properties
– Mean, variance
– E.g., heights of 1000 US citizens used to estimate the mean height of the US population

Central Tendency: Mean and Median
Mean = xbar = sum of values divided by sample size
– Example: xbar = (1 + 3 + 3 + 6 + 8 + 10)/6 = 5.2 NTU
Median = m = middle number of the ranked values
– Ranked values: 1, 3, 3, 6, 8, 10
– For an even number of sample points, average the middle two: m = (3 + 6)/2 = 4.5 NTU
Excel: Mean – AVERAGE; Median – MEDIAN
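
The mean and median above can be checked with a short script. This is a minimal sketch using Python's standard library; the six turbidity readings (1, 3, 3, 6, 8, 10 NTU) are assumed from the deck's variance and median examples, and the variable names are illustrative only.

```python
import statistics

# Turbidity readings in NTU, assumed from the deck's worked example.
turbidity_ntu = [1, 3, 3, 6, 8, 10]

# Mean: sum of values divided by sample size (Excel: AVERAGE).
mean_ntu = statistics.mean(turbidity_ntu)

# Median: middle value; for an even count, the average of the middle two (Excel: MEDIAN).
median_ntu = statistics.median(turbidity_ntu)

print(f"mean = {mean_ntu:.1f} NTU")
print(f"median = {median_ntu:.1f} NTU")
```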

Variability
Variance, s²: sum of the squares of the deviations about the mean, divided by the degrees of freedom
– $s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1)$
– where $x_i$ = a data point, $\bar{x}$ = xbar (the sample mean), and n = number of data points
Example (cont.)
– s² = [(1-5.2)² + (3-5.2)² + (3-5.2)² + (6-5.2)² + (8-5.2)² + (10-5.2)²]/(6-1) = 11.8 NTU²
Excel: Variance – VAR
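
As a rough check on the variance example, the sketch below computes s² directly from the definition and compares it with the standard library's sample variance. The readings are assumed from the worked example; the variable names are illustrative.

```python
import statistics

# Turbidity readings in NTU, assumed from the deck's worked example.
turbidity_ntu = [1, 3, 3, 6, 8, 10]
n = len(turbidity_ntu)
xbar = sum(turbidity_ntu) / n

# Sum of squared deviations about the mean, divided by the degrees of freedom (n - 1).
variance_by_hand = sum((x - xbar) ** 2 for x in turbidity_ntu) / (n - 1)

# statistics.variance also uses the n - 1 denominator (sample variance, like Excel's VAR).
variance_stdlib = statistics.variance(turbidity_ntu)

print(f"s^2 = {variance_by_hand:.1f} NTU^2")
```

Both routes use the unrounded mean, so they agree with each other and round to the 11.8 NTU² shown in the example.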

Error Bars
Show data variability on a plot of mean values
Types of error bars include:
– Max/min
– ± Standard Deviation
– ± Standard Error
– ± 95% CI

Standard Deviation, s
Square root of the variance
If a phenomenon follows a normal distribution (bell curve), 95% of the population lies within 1.96 standard deviations of the mean
[Figure: bell curve; axes: % vs. standard deviations from mean]
Error bar is s above & below the mean
Excel: Standard deviation – STDEV

Standard Error of Mean
Also called St-Err or $s_{\bar{x}}$
For a sample of size n taken from a population with standard deviation estimated as s: $s_{\bar{x}} = s/\sqrt{n}$
As n ↑, $s_{\bar{x}}$ ↓, i.e., the estimate of the population mean improves
Error bar is St-Err above & below the mean
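
A minimal sketch of the standard deviation and standard error for the same turbidity readings follows. It assumes the usual definition of the standard error of the mean, s divided by the square root of n; the readings are taken from the worked example and the names are illustrative.

```python
import math
import statistics

# Turbidity readings in NTU, assumed from the deck's worked example.
turbidity_ntu = [1, 3, 3, 6, 8, 10]
n = len(turbidity_ntu)

# Sample standard deviation: square root of the sample variance (Excel: STDEV).
s = statistics.stdev(turbidity_ntu)

# Standard error of the mean: s / sqrt(n). It shrinks as n grows.
std_err = s / math.sqrt(n)

print(f"s = {s:.2f} NTU, St-Err = {std_err:.2f} NTU")
```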

95% Confidence Interval (CI) for Mean
A 95% confidence interval is expected to contain the population mean 95% of the time (i.e., of the 95% CIs from 100 samples, about 95 are expected to contain the population mean)
$t_{95\%,n-1}$ is a statistic for the 95% CI from a sample of size n
– $t_{95\%,n-1}$ = TINV(0.05, n-1)
– If n ≥ 30, $t_{95\%,n-1}$ ≈ 1.96 (normal distribution)
Error bar is $t_{95\%,n-1} \cdot s_{\bar{x}}$ above & below the mean
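
The sketch below works through a 95% CI for the turbidity readings used earlier in the deck. Python's standard library has no equivalent of TINV, so the t values are hard-coded from a standard t table (an assumption of this sketch; they should match TINV(0.05, df) to three decimals). The half-width is taken as the t value times the standard error.

```python
import math
import statistics

# Turbidity readings in NTU, assumed from the deck's worked example.
turbidity_ntu = [1, 3, 3, 6, 8, 10]
n = len(turbidity_ntu)

# Two-sided 95% t values by degrees of freedom, copied from a t table
# (assumed to match Excel's TINV(0.05, df) to three decimals).
T_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}

xbar = statistics.mean(turbidity_ntu)
std_err = statistics.stdev(turbidity_ntu) / math.sqrt(n)

# Error bar half-width: t value for n - 1 degrees of freedom times the standard error.
half_width = T_95[n - 1] * std_err
ci_low, ci_high = xbar - half_width, xbar + half_width

print(f"mean = {xbar:.1f} NTU, 95% CI = ({ci_low:.1f}, {ci_high:.1f}) NTU")
```

With only six readings the t value (2.571) is well above 1.96, so the interval is noticeably wider than a normal-distribution approximation would suggest.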

Using Error Bars to Compare Data
Standard Deviation
– Demonstrates data variability, but no comparison possible
Standard Error
– If bars overlap, any difference in means is not statistically significant
– If bars do not overlap, indicates nothing!
95% Confidence Interval
– If bars overlap, indicates nothing!
– If bars do not overlap, difference is statistically significant
We'll use the 95% CI in this class
– Any time you have 3 or more data points, determine the mean, standard deviation, standard error, and $t_{95\%,n-1}$, then plot the mean with error bars showing the 95% confidence interval
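
The 95% CI rule above can be written as a small check. This is only a sketch: the function name and the example means and half-widths are hypothetical, not data from the deck.

```python
def intervals_overlap(mean_a, half_a, mean_b, half_b):
    """Return True if the error bars mean_a ± half_a and mean_b ± half_b overlap."""
    low_a, high_a = mean_a - half_a, mean_a + half_a
    low_b, high_b = mean_b - half_b, mean_b + half_b
    return low_a <= high_b and low_b <= high_a


def interpret_95_ci(mean_a, half_a, mean_b, half_b):
    """Apply the deck's rule for 95% CI error bars."""
    if intervals_overlap(mean_a, half_a, mean_b, half_b):
        return "indicates nothing"
    return "difference is statistically significant"


# Hypothetical means and 95% CI half-widths for two test conditions.
print(interpret_95_ci(5.0, 1.0, 6.5, 1.0))  # bars overlap
print(interpret_95_ci(5.0, 0.5, 8.0, 0.5))  # bars do not overlap
```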

Adding Error Bars to an Excel Graph
– Create graph (column, scatter, …)
– Select data series
– In Layout tab, Analysis group, select Error Bars
– Select More Error Bar Options
– Select Custom, then Specify Values, and select the cells containing the values

Example 1: 95% CI

What can we do?
Lift the weight multiple times using different solar panel combinations (or hydroturbines, or gear boxes) and plot the mean with 95% confidence interval error bars.
– If error bars overlap between two different test conditions, indicates nothing!
– If error bars do not overlap, difference is statistically significant

T Test
A more sophisticated way to compare means
Use the t test to determine if the means of two populations are different
– E.g., lift times with different solar panel combinations or turbines or…

Comparing Two Data Sets Using the t Test
Example: you lift the weight with two panels in series and with two in parallel.
– Series: mean = 2 min, s = 0.5 min, n = 20
– Parallel: mean = 3 min, s = 0.6 min, n = 20
You ask the question: do the different panel combinations result in different lift times?
– Different in a statistically significant way

Are the Lift Times Different?
Use TTEST (Excel)
TTEST returns the fractional probability of being wrong if you claim the two populations are different
– We'll say they are significantly different if the probability is ≤ 0.05
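
For the series/parallel example, the equal-variance t statistic can be computed from the summary statistics alone. Note that this sketch does not reproduce Excel's TTEST, which returns a probability; instead it compares |t| with a two-sided critical value at the 0.05 level, which leads to the same decision. The critical value 2.024 for 38 degrees of freedom is copied from a t table and is an assumption of the sketch.

```python
import math

# Summary statistics from the deck's example (lift times in minutes).
mean_series, s_series, n_series = 2.0, 0.5, 20
mean_parallel, s_parallel, n_parallel = 3.0, 0.6, 20

# Pooled variance: appropriate when the variances are unknown but assumed equal.
df = n_series + n_parallel - 2
pooled_var = ((n_series - 1) * s_series**2 + (n_parallel - 1) * s_parallel**2) / df

# Standard error of the difference between the two means.
se_diff = math.sqrt(pooled_var * (1 / n_series + 1 / n_parallel))
t_stat = (mean_series - mean_parallel) / se_diff

# Two-sided critical value for 38 degrees of freedom at the 0.05 level,
# copied from a t table (assumed equal to Excel's TINV(0.05, 38)).
T_CRITICAL_38 = 2.024

significantly_different = abs(t_stat) > T_CRITICAL_38
print(f"t = {t_stat:.2f}, df = {df}, different at 0.05: {significantly_different}")
```

Here |t| is about 5.7, far beyond the critical value, so the lift times would be judged significantly different under these assumptions.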

Marbles

Linear Regression
Fit the best straight line to a data set
– Right-click on a data point and select "trendline"
– Select the options to show the equation and R²

R²: Coefficient of Multiple Determination
$R^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 / \sum_{i=1}^{n} (y_i - \bar{y})^2$
– $\hat{y}_i$ = predicted y values, from the regression equation
– $y_i$ = observed y values
– $\bar{y}$ = ybar = mean of y
R² = fraction of variance explained by regression
– R² = 1 if the data lie along a straight line
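
The sketch below fits a least-squares line and computes R² as the ratio of explained to total variation, matching the definition above. The deck relies on Excel's trendline; this hand-rolled `fit_line` function and the sample data are hypothetical and shown only to make the calculation concrete.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n

    # Slope and intercept from the standard least-squares formulas.
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = ybar - slope * xbar

    # R^2 = sum of (predicted - ybar)^2 over sum of (observed - ybar)^2.
    y_hat = [intercept + slope * xi for xi in x]
    explained = sum((yh - ybar) ** 2 for yh in y_hat)
    total = sum((yi - ybar) ** 2 for yi in y)
    return slope, intercept, explained / total


# Hypothetical data that lie close to, but not exactly on, a straight line.
x_data = [1, 2, 3, 4, 5]
y_data = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r2 = fit_line(x_data, y_data)
print(f"y = {intercept:.2f} + {slope:.2f} x, R^2 = {r2:.3f}")
```

For data that fall exactly on a line, such as `fit_line([1, 2, 3], [3, 5, 7])`, the same function returns an R² of 1, as the last bullet states.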