Outliers. Because we square errors, a few points with large errors can have a large effect on the fitted response surface. Both in simulations and in experiments there is potential for preposterous results due to failures of algorithms or tests. Points with large deviations from the fit are called outliers. The key question is how to distinguish between outliers that should be removed and ones that should be kept.
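The squared-error effect above can be seen in a small sketch, here in Python with NumPy rather than the slides' MATLAB; the data (a noisy line with the last point forced to zero) mirror the style of the example at the end of these slides:

```python
import numpy as np

# Sketch (not from the slides): one corrupted point drags the whole
# least-squares line, because its error enters the fit squared.
rng = np.random.default_rng(0)
x = np.arange(1.0, 11.0)
y = 10.0 - 2.0 * x + rng.normal(size=10)      # straight line plus noise

X = np.column_stack([np.ones_like(x), x])     # design matrix [1, x]
b_clean, *_ = np.linalg.lstsq(X, y, rcond=None)

y_out = y.copy()
y_out[-1] = 0.0                               # one gross outlier
b_out, *_ = np.linalg.lstsq(X, y_out, rcond=None)

print("slope without outlier:", b_clean[1])   # close to the true -2
print("slope with outlier:   ", b_out[1])     # pulled away from -2
```

A single corrupted point at the end of the range shifts the fitted slope noticeably, because least squares gives a large residual quadratic influence.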

Weighted least squares. Weighted least squares was developed to allow us to assign weights to data points based on confidence or relevance. The most popular use of weighted least squares is moving least squares, where we refit the data for each prediction point with high weights on nearby data; linear interpolation from a table is an extreme form. Error measure: e = Σᵢ wᵢ (yᵢ − ŷ(xᵢ))². Normal equations: (XᵀWX) b = XᵀWy, where W = diag(w₁, …, wₙ).
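The weighted normal equations can be sketched directly in Python with NumPy; the function name and the straight-line model are illustrative choices, not taken from the slides:

```python
import numpy as np

# Sketch of the weighted normal equations (X' W X) b = X' W y for a
# straight-line model, minimizing e = sum_i w_i * (y_i - yhat(x_i))^2.
def wls_fit(x, y, w):
    X = np.column_stack([np.ones_like(x), x])   # design matrix [1, x]
    W = np.diag(w)
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 10.0])             # last point off the trend

b_equal = wls_fit(x, y, np.ones(4))             # unit weights = ordinary LS
b_skew  = wls_fit(x, y, np.array([1.0, 1.0, 1.0, 100.0]))
print(b_equal)   # ordinary least-squares intercept and slope
print(b_skew)    # fit pulled toward the heavily weighted point
```

With unit weights this reduces to ordinary least squares; giving the last point a large weight pulls the fitted line toward it, which is exactly the mechanism moving least squares exploits for nearby data.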

Determination of weights
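One common way to determine weights, in the moving-least-squares spirit of the previous slide, is to let them decay with distance from the prediction point. The Gaussian form, the bandwidth h, and the function name in this Python sketch are illustrative assumptions, not necessarily what the slides used:

```python
import numpy as np

# Illustrative moving-least-squares weights: data near the prediction
# point x0 get weight near 1, distant data are nearly ignored.
# Gaussian decay and bandwidth h are assumed choices.
def mls_predict(x, y, x0, h=1.0):
    w = np.exp(-((x - x0) / h) ** 2)                # distance-based weights
    X = np.column_stack([np.ones_like(x), x])
    W = np.diag(w)
    b = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # weighted normal eqs
    return b[0] + b[1] * x0                         # local linear prediction

x = np.linspace(0.0, 5.0, 11)
y = x ** 2                                          # curved data
print(mls_predict(x, y, 2.0))                       # close to 4; a single
                                                    # global line is far off
```

Refitting with these weights at every prediction point lets a simple linear model track curved data locally; as h shrinks, the method approaches interpolation through the nearest points.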

Example (MATLAB)

x = (1:10)';
y = 10 - 2*x + randn(10,1);         % straight line plus noise
y(10) = 0;                          % replace the last point with an outlier
bls = regress(y,[ones(10,1) x])     % ordinary least-squares coefficients
brob = robustfit(x,y)               % robust regression coefficients
scatter(x,y,'filled'); grid on; hold on
plot(x,bls(1)+bls(2)*x,'r','LineWidth',2);
plot(x,brob(1)+brob(2)*x,'g','LineWidth',2)
legend('Data','Ordinary Least Squares','Robust Regression')
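What robustfit does internally is iteratively reweighted least squares: fit, measure residuals, downweight points with large residuals, and refit. The Python sketch below uses Huber weights and a MAD-based scale for simplicity; MATLAB's default is the bisquare weight function, so this is an approximation of the idea rather than a reimplementation:

```python
import numpy as np

# Rough sketch of iteratively reweighted least squares (IRLS) for a
# straight line. Huber weights and the tuning constant k = 1.345 are
# assumed choices; MATLAB's robustfit defaults to bisquare weights.
def irls_line(x, y, k=1.345, iters=20):
    X = np.column_stack([np.ones_like(x), x])
    w = np.ones_like(y)
    b = np.zeros(2)
    for _ in range(iters):
        W = np.diag(w)
        b = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)   # weighted fit
        r = y - X @ b                                   # residuals
        s = np.median(np.abs(r)) / 0.6745 + 1e-12       # robust scale (MAD)
        u = np.abs(r) / (k * s)
        w = np.where(u <= 1.0, 1.0, 1.0 / u)            # Huber weights
    return b

rng = np.random.default_rng(1)
x = np.arange(1.0, 11.0)
y = 10.0 - 2.0 * x + rng.normal(size=10)
y[-1] = 0.0                                             # same outlier as above
print(irls_line(x, y))    # slope stays near -2 despite the outlier
```

The outlier's weight shrinks on each pass, so the robust slope stays close to the true one, while the ordinary least-squares slope from regress is visibly pulled toward the bad point.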