Institute for Tribal Environmental Professionals Tribal Air Monitoring Support Center Melinda Ronca-Battista Brenda Sakizzie Jarrell Southern Ute Indian.

Slides:



Advertisements
Similar presentations
13- 1 Chapter Thirteen McGraw-Hill/Irwin © 2005 The McGraw-Hill Companies, Inc., All Rights Reserved.
Advertisements

Nursing Home Falls Control Chart Problem Statement: Assume that following data were obtained about number of falls in a Nursing Home facility. Produce.
Regression Analysis Once a linear relationship is defined, the independent variable can be used to forecast the dependent variable. Y ^ = bo + bX bo is.
Chapter 8 Linear Regression © 2010 Pearson Education 1.
Analysis of the Rainfall of Tropical Storm Allison Creating an Excel Model Creating an Excel Model.
Chapter 15 (Ch. 13 in 2nd Can.) Association Between Variables Measured at the Interval-Ratio Level: Bivariate Correlation and Regression.
LECTURE 3 Introduction to Linear Regression and Correlation Analysis
Spreadsheets and Non- Spatial Databases Unit 4: Module 15, Lecture 2- Advanced Microsoft Excel.
Chapter 13 Multiple Regression
Chapter 10 Simple Regression.
BA 555 Practical Business Analysis
Chapter 12a Simple Linear Regression
Examining Relationship of Variables  Response (dependent) variable - measures the outcome of a study.  Explanatory (Independent) variable - explains.
REGRESSION MODEL ASSUMPTIONS. The Regression Model We have hypothesized that: y =  0 +  1 x +  | | + | | So far we focused on the regression part –
RESEARCH STATISTICS Jobayer Hossain Larry Holmes, Jr November 6, 2008 Examining Relationship of Variables.
Multiple Regression – Basic Relationships
Introduction to Excel 2007 Part 1: Basics and Descriptive Statistics Psych 209.
Regression in EXCEL r2 SSE b0 b1 SST.
Assumption of Homoscedasticity
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Slides by JOHN LOUCKS & Updated by SPIROS VELIANITIS.
1 Guest Speaker: Bill Frietsche US EPA.  April 7: QA Systems, EPA definitions, PQAOs and common sense – Mike Papp  April 14: Routine Quality Control.
1 Doing Statistics for Business Doing Statistics for Business Data, Inference, and Decision Making Marilyn K. Pelosi Theresa M. Sandifer Chapter 11 Regression.
Inference for regression - Simple linear regression
1 1 Slide © 2012 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
Air Quality System Precision and Accuracy Data Transaction Generator (AQSP&A) Training Session.
Inferences for Regression
Quality Assurance/ Quality Control
1 1 Slide © 2005 Thomson/South-Western Slides Prepared by JOHN S. LOUCKS St. Edward’s University Slides Prepared by JOHN S. LOUCKS St. Edward’s University.
1 1 Slide Simple Linear Regression Part A n Simple Linear Regression Model n Least Squares Method n Coefficient of Determination n Model Assumptions n.
1 1 Slide © 2004 Thomson/South-Western Slides Prepared by JOHN S. LOUCKS St. Edward’s University Slides Prepared by JOHN S. LOUCKS St. Edward’s University.
Data Analysis Lab 04 Regression and Multiple Regression.
AP Statistics Chapter 8 & 9 Day 3
Quality Control/ Quality Assurance Annabelle Allison ITEP/TAMS Center.
Quality Control – Part II Tim Hanley EPA Office of Air Quality Planning and Standards.
Appendix A: Calculations for Data Quality Assessment QC check statistics Precision calcs Bias calcs PM stats Reporting: quarterly and annual.
Examining Relationships in Quantitative Research
Data Submittals to AQS Nate Herbst Southern Ute Indian Tribe.
Copyright © 2011 Pearson Education, Inc. The Simple Regression Model Chapter 21.
Inference for Regression Chapter 14. Linear Regression We can use least squares regression to estimate the linear relationship between two quantitative.
Module 8 Regression and Correlation Module 8. module 8 Relationship between two variables Changing wind speed, humidity, or other met parameters, and.
1 Tribal Data Toolbox Overview Melinda Ronca-Battista Angelique Luedeker ITEP.
Excel: Pivot Tables Exploring Computer Science Lesson Supplemental.
Scatterplots & Regression Week 3 Lecture MG461 Dr. Meredith Rolfe.
Unit 42 : Spreadsheet Modelling
Statistical Analysis Topic – Math skills requirements.
Chapter 22: Building Multiple Regression Models Generalization of univariate linear regression models. One unit of data with a value of dependent variable.
1 1 Slide © 2003 South-Western/Thomson Learning™ Slides Prepared by JOHN S. LOUCKS St. Edward’s University.
ANOVA, Regression and Multiple Regression March
© 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license.
LSP 120: Quantitative Reasoning and Technological Literacy Topic 1: Introduction to Quantitative Reasoning and Linear Models Lecture Notes 1.2 Prepared.
D/RS 1013 Data Screening/Cleaning/ Preparation for Analyses.
Excel part 5 Working with Excel Tables, PivotTables, and PivotCharts.
Model Trendline Linear 2Y1X Excel 2013 V0D 1 by Milo Schield Member: International Statistical Institute US Rep: International Statistical Literacy Project.
26134 Business Statistics Week 4 Tutorial Simple Linear Regression Key concepts in this tutorial are listed below 1. Detecting.
EXCEL DECISION MAKING TOOLS AND CHARTS BASIC FORMULAE - REGRESSION - GOAL SEEK - SOLVER.
Regression Chapter 5 January 24 – Part II.
Correlation and Regression Ch 4. Why Regression and Correlation We need to be able to analyze the relationship between two variables (up to now we have.
(Slides not created solely by me – the internet is a wonderful tool) SW388R7 Data Analysis & Compute rs II Slide 1.
26134 Business Statistics Week 4 Tutorial Simple Linear Regression Key concepts in this tutorial are listed below 1. Detecting.
Chapter 15 Inference for Regression. How is this similar to what we have done in the past few chapters?  We have been using statistics to estimate parameters.
Exploring Computer Science Lesson Supplemental
Final Data Review and Invalidation
(Residuals and
CHAPTER 29: Multiple Regression*
Access Tutorial 5 Creating Advanced Queries and Enhancing Table Design
Access Tutorial 5 Creating Advanced Queries and Enhancing Table Design
Regression III.
Precision, Bias, and Total Error (Accuracy)
Inference for Regression
Introduction to Excel 2007 Part 1: Basics and Descriptive Statistics Psych 209.
Presentation transcript:

Institute for Tribal Environmental Professionals Tribal Air Monitoring Support Center Melinda Ronca-Battista Brenda Sakizzie Jarrell Southern Ute Indian Tribe

 Assisting tribes all over country  Similar questions are asked  All evaluations begin the same  Need for free (MS Office) tools  Applicable to any data  Supplements and uses existing EPA tools, helping tribes use AMTIC and have confidence of the interpretation of their data

AAA Data Analysis and Interpretation folders

1. Clean up your data. 2. Verify QC limits met 3. Aggregate data into sets 4. Find Patterns 5. Ask your question 6. Evaluate the shapes of the distributions 7. Apply the test

 1. Initial Cleaning (checking links, hidden values, finding repeated rows, tracking data so none is lost)  2. Normalization (separating information into separate fields, using data validation to limit entries to drop- down lists)  3. Documenting Your Clean Data  4. General cleaning (macros, values- only, documentation, eliminating hidden characters) Data Analysis, Step 1-Data Clean Up folder

 Don’t exert any effort on bad data— start with a quick review of QC  Review logbooks, audits  Generate AQS report of QC data, and  Use EPA’s DASC tool; enter data into the spreadsheet and review PLOTS Data Analysis, Step 2-QC folder

Plots generated using EPA’s DASC tool:

 Hourly values slow down computer  Use MS Access Quick Start guide to aggregate data into chunks of daily or weekly averages  MS Access easier than Excel for handling missing data, excluding codes Data Analysis, Step 3-Aggregate Data folder, Tribal Data Access Quick Start subfolder

 5-day averages of daily max O3 values  Reduces number of data points from >10,000s of hourly values to ~ 1000s of daily values to ~100s of 5-day averages  Enables the next step of graphing against relevant parameters (temp, solar radiation, NO2, etc.)

 Apply common sense to the data  How does it vary with met parameters?  How does it vary with other pollutants?

 Dynamic Named Ranges  as your data increase, plots and summary statistics are automatically recalculated  See pg. 7 of doc “using ranges in excel and graphing.doc”  Autofilter  Filter out or in data  1-click recalculation of plots of different subsets

 Use x-y plot (ALWAYS SCATTERPLOT NEVER DATE because that produces category plots)  Use secondary axis for 2 nd parameter so it can have its own units  Start with all data, then use Excel Autofilter to find subsets of data where both parameters have values, that show a pattern, clicking on different values that are immediately graphed

Plot 2 nd parameter on its own axis

ALWAYS plot on x-y scatterplots-never dates

 Use linear regression between parameters  Perfect 1-to-1 relationship with one rise on y-axis to every one run on x- axis shows:  Slope ~ 1 and RSQ (r 2 ) ~ 1  Can calculate in plot or using functions

 =slope(Ys,Xs) and =RSQ(Ys, Xs)  OR  Scatterplot, show trendline, show equation and R2 on chart  For the case of our correlation between O3 and temp, how well do they compare?

Insert regression line on x-y scatterplots, or use slope= and RSQ= functions

=slope(Ys,Xs) and =RSQ(Ys, Xs)

 Ex: When analyzing this data, we saw a shift in how the 2 sites’ O3 tracked  During one time period, one site had markedly higher O3 levels than the other site, but the rest of the time the 2 sites agreed well

 -- ▲ --Ute 1, -- ▀ -- Ute 3

 In this case, the (Ute 3- Ute 1)/Ute 3 ratio:

 Is this helpful?  Sort of... VALIDINVALID Mean Variance

 See BoxCharter.zip for Excel Add-In to generate box and whisker plots INVALID VALID

 Decide on your assumption that the data must be used to prove wrong--the null hypothesis  If you think the data from 2 sites are different, then assume that the difference between them is zero and then the data must prove that wrong, at some level of confidence  the null hypo is that there is zero difference between the means of the datasets

 Are they normal? “approximately”?  If so, the tests are easier  Hmmmmm. VALID INVALID

Straight line is “perfectly normal” …

 The “invalid” dataset was removed from AQS  New audits were conducted, a 2 nd analyzer was collocated with Ute 1, and now data from that site are deemed “good”  Southern Ute Indian Tribe is very careful with all data, and this story shows how good QC and careful data analysis yields confidence in decisions

AAA Data Analysis and Interpretation folders