© 1998, Geoff Kuenning


Linear Regression Models
– What is a (good) model?
– Estimating model parameters
– Allocating variation
– Confidence intervals for regressions
– Verifying assumptions visually

Data Analysis Overview
[Diagram: the experimental environment (prototype, real system, execution-driven, trace-driven, or stochastic simulation) produces raw data samples; workload parameters, system configuration parameters, and factor levels supply the predictor values x; samples of the response y1, y2, …, yn feed regression models of the form response variable = f(predictors).]

What Is a (Good) Model?
For correlated data, a model predicts the response given an input
The model should be an equation that fits the data
The standard definition of "fits" is least squares:
– Minimize the squared error
– While keeping the mean error zero
– This also minimizes the variance of the errors

Least-Squared Error
If the regression predicts ŷᵢ = b₀ + b₁xᵢ, then the error in the estimate for xᵢ is eᵢ = yᵢ − ŷᵢ
Minimize the Sum of Squared Errors: SSE = Σ eᵢ²
Subject to the constraint: Σ eᵢ = 0

Estimating Model Parameters
The best regression parameters are
  b₁ = (Σ xy − n·x̄·ȳ) / (Σ x² − n·x̄²)
  b₀ = ȳ − b₁·x̄
where x̄ = (1/n) Σ x and ȳ = (1/n) Σ y
(Note: the corresponding equation in the book contains an error!)

Parameter Estimation Example
Execution time of a script for various loop counts (n = 5):
  x̄ = 6.8, ȳ = 2.32, Σ xy = 88.54, Σ x² = 264
  b₁ = (88.54 − 5(6.8)(2.32)) / (264 − 5(6.8)²) = 9.66 / 32.8 = 0.29
  b₀ = 2.32 − (0.29)(6.8) = 0.35
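The arithmetic above can be reproduced with a short Python sketch using only the aggregate sums quoted on the slide (variable names are my own):

```python
# Least-squares parameter estimates from the slide's aggregate statistics.
n = 5                      # number of (loop-count, time) observations
x_bar, y_bar = 6.8, 2.32   # sample means of x and y
sum_xy, sum_x2 = 88.54, 264.0

# b1 = (sum(xy) - n*x_bar*y_bar) / (sum(x^2) - n*x_bar^2)
b1 = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
b0 = y_bar - b1 * x_bar
# b1 comes out near 0.29; the slide rounds b1 first, giving b0 = 0.35
```

Note that the slide rounds b₁ to 0.29 before computing b₀, which is how it arrives at 0.35; carrying full precision gives b₀ ≈ 0.32.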

Graph of Parameter Estimation Example
[Figure: scatter plot of loop count vs. execution time with the fitted line ŷ = 0.35 + 0.29x]

Variants of Linear Regression
Some nonlinear relationships can be handled by transformations:
– For y = a·e^(bx), take the logarithm of y, do the regression on log(y) = b₀ + b₁x, then let b = b₁, a = e^(b₀)
– For y = a + b·log(x), take the log of x before fitting the parameters, then let b = b₁, a = b₀
– For y = a·x^b, take the log of both x and y, then let b = b₁, a = e^(b₀)
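As a sketch of the first transformation, here is a fit of y = a·e^(bx) on noise-free synthetic data; the values of a and b are made up for illustration, not taken from the slides:

```python
import math

# Synthetic data that follows y = a * exp(b*x) exactly (no noise).
a_true, b_true = 2.0, 0.5            # illustrative values, not from the slides
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [a_true * math.exp(b_true * x) for x in xs]

# Ordinary least squares on (x, log y).
n = len(xs)
lys = [math.log(y) for y in ys]
x_bar, ly_bar = sum(xs) / n, sum(lys) / n
b1 = (sum(x * ly for x, ly in zip(xs, lys)) - n * x_bar * ly_bar) / \
     (sum(x * x for x in xs) - n * x_bar ** 2)
b0 = ly_bar - b1 * x_bar

# Back-transform: b = b1, a = e^(b0).
a_hat, b_hat = math.exp(b0), b1
```

Because the synthetic data are noise-free, the back-transformed estimates recover a and b essentially exactly.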

Allocating Variation
With no regression, the best guess of y is the mean ȳ
Observed values of y differ from ȳ, giving rise to errors (variance)
Regression gives a better guess, but there are still errors
We can evaluate the quality of the regression by allocating the sources of those errors

The Total Sum of Squares
Without regression, the squared error is
  SST = Σ (yᵢ − ȳ)² = Σ yᵢ² − n·ȳ² = SSY − SS0
where SSY = Σ yᵢ² and SS0 = n·ȳ²

The Sum of Squares from Regression
Recall that the regression error is SSE = Σ (yᵢ − ŷᵢ)²
The error without regression is SST
So the regression explains SSR = SST − SSE
Regression quality is measured by the coefficient of determination:
  R² = SSR / SST = (SST − SSE) / SST

Evaluating Coefficient of Determination
Compute
  SSE = Σ y² − b₀·Σ y − b₁·Σ xy
  SST = Σ y² − n·ȳ²
  R² = (SST − SSE) / SST

Example of Coefficient of Determination
For the previous regression example:
– Σ y = 11.60, Σ y² = 29.79, Σ xy = 88.54
– SSE = 29.79 − (0.35)(11.60) − (0.29)(88.54) = 0.05
– SST = 29.79 − 5(2.32)² = 2.89
– SSR = 2.89 − 0.05 = 2.84
– R² = (2.89 − 0.05)/2.89 = 0.98
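The allocation of variation can be checked with the same sums (a sketch; the rounded b₀ and b₁ come from the earlier slide):

```python
# Allocation of variation, using the sums quoted on the slides.
n = 5
b0, b1 = 0.35, 0.29                 # rounded estimates from the earlier slide
sum_y, sum_y2, sum_xy = 11.60, 29.79, 88.54
y_bar = sum_y / n

SSE = sum_y2 - b0 * sum_y - b1 * sum_xy   # unexplained variation
SST = sum_y2 - n * y_bar ** 2             # total variation
SSR = SST - SSE                           # variation explained by regression
R2 = SSR / SST                            # coefficient of determination
```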

Standard Deviation of Errors
The variance of the errors is SSE divided by the degrees of freedom:
– DOF is n − 2 because we've calculated 2 regression parameters from the data
– So the variance (mean squared error, MSE) is SSE/(n − 2)
The standard deviation of the errors is the square root:
  s_e = √(SSE / (n − 2))

Checking Degrees of Freedom
The degrees of freedom always balance:
– SS0 has 1 (computed from ȳ)
– SST has n − 1 (computed from the data and ȳ, which uses up 1)
– SSE has n − 2 (needs the 2 regression parameters)
– So SST = SSE + SSR balances as (n − 1) = (n − 2) + 1

Example of Standard Deviation of Errors
For our regression example, SSE was 0.05, so MSE is 0.05/3 = 0.017 and s_e = 0.13
Note the high quality of our regression:
– R² = 0.98
– s_e = 0.13
– Why such a nice straight-line fit?
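A two-line check of this slide's arithmetic:

```python
import math

# Mean squared error and standard deviation of errors, from the slides.
SSE, n = 0.05, 5
MSE = SSE / (n - 2)        # divide by n - 2 degrees of freedom
s_e = math.sqrt(MSE)       # roughly 0.13
```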

Confidence Intervals for Regressions
The regression is done from a single population sample (size n):
– A different sample might give different results
– The true model is y = β₀ + β₁x
– The parameters b₀ and b₁ are only estimates of β₀ and β₁, computed from one sample

Calculating Intervals for Regression Parameters
The standard deviations of the parameters are
  s_b0 = s_e √(1/n + x̄² / (Σ x² − n·x̄²))
  s_b1 = s_e / √(Σ x² − n·x̄²)
The confidence intervals are b₀ ∓ t·s_b0 and b₁ ∓ t·s_b1, where the t quantile has n − 2 degrees of freedom

Example of Regression Confidence Intervals
Recall s_e = 0.13, n = 5, Σ x² = 264, x̄ = 6.8
So
  s_b0 = 0.13 √(1/5 + 6.8²/(264 − 5(6.8)²)) = 0.16
  s_b1 = 0.13 / √(264 − 5(6.8)²) = 0.023
Using a 90% confidence level, t₀.₉₅;₃ = 2.353

Regression Confidence Example, cont'd
Thus, the b₀ interval is 0.35 ∓ (2.353)(0.16) = (−0.03, 0.73)
– Not significant at 90%
And b₁'s is 0.29 ∓ (2.353)(0.023) = (0.24, 0.34)
– Significant at 90% (and would survive even a 99.9% test)
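The interval computations above can be sketched in Python; the t quantile is hard-coded from the slide rather than looked up:

```python
import math

# 90% confidence intervals for b0 and b1, using the slide's numbers.
s_e, n = 0.13, 5
sum_x2, x_bar = 264.0, 6.8
b0, b1 = 0.35, 0.29
t = 2.353                              # t quantile with n - 2 = 3 DOF (slide)

Sxx = sum_x2 - n * x_bar ** 2          # = 32.8
s_b0 = s_e * math.sqrt(1.0 / n + x_bar ** 2 / Sxx)
s_b1 = s_e / math.sqrt(Sxx)
ci_b0 = (b0 - t * s_b0, b0 + t * s_b0)
ci_b1 = (b1 - t * s_b1, b1 + t * s_b1)
# ci_b0 straddles zero (b0 not significant); ci_b1 does not (b1 significant)
```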

Confidence Intervals for Nonlinear Regressions
For nonlinear fits using exponential transformations:
– Confidence intervals apply to the transformed parameters
– It is not valid to perform the inverse transformation on the intervals

Confidence Intervals for Predictions
The previous confidence intervals are for the parameters
– How certain can we be that the parameters are correct?
The purpose of regression is prediction
– How accurate are the predictions?
– Regression gives the mean of the predicted response, based on the sample we took

Predicting m Samples
The standard deviation for the mean of a future sample of m observations at x_p is
  s_ŷ = s_e √(1/m + 1/n + (x_p − x̄)² / (Σ x² − n·x̄²))
Note that the deviation drops as m → ∞
Variance is minimal at x_p = x̄
Use t-quantiles with n − 2 DOF for the interval

Example of Confidence of Predictions
Using the previous equation, what is the predicted time for a single run of 8 loops?
  Time = 0.35 + 0.29(8) = 2.67
The standard deviation of the prediction is
  s_ŷ = 0.13 √(1 + 1/5 + (8 − 6.8)²/32.8) = 0.14
The 90% interval is then 2.67 ∓ (2.353)(0.14) = (2.34, 3.00)
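The prediction-interval formula above, applied to this example as a sketch:

```python
import math

# 90% prediction interval for a single (m = 1) future run at x_p = 8 loops.
s_e, n = 0.13, 5
b0, b1 = 0.35, 0.29
sum_x2, x_bar = 264.0, 6.8
t = 2.353                              # t quantile with n - 2 = 3 DOF (slide)
Sxx = sum_x2 - n * x_bar ** 2

m, x_p = 1, 8.0
y_hat = b0 + b1 * x_p                  # predicted mean response at x_p
s_pred = s_e * math.sqrt(1.0 / m + 1.0 / n + (x_p - x_bar) ** 2 / Sxx)
interval = (y_hat - t * s_pred, y_hat + t * s_pred)
```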

Verifying Assumptions Visually
Regressions are based on assumptions:
– Linear relationship between response y and predictor x
  – Or a nonlinear relationship used in fitting
– Predictor x nonstochastic and error-free
– Model errors statistically independent
  – With distribution N(0, c) for a constant c
If the assumptions are violated, the model is misleading or invalid

Testing Linearity
Scatter-plot x vs. y to see the basic curve type
[Figure: four example scatter plots — linear, piecewise linear, outlier, nonlinear (power)]

Testing Independence of Errors
Scatter-plot the residuals εᵢ versus the predicted values ŷᵢ
There should be no visible trend
[Figure: example residual plot from our curve fit]

More on Testing Independence
It may be useful to plot the error residuals versus experiment number
– In the previous example, this gives the same plot except for the x scaling
There are no foolproof tests

Testing for Normal Errors
Prepare a quantile-quantile plot
[Figure: normal quantile-quantile plot of the residuals from our regression]
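A minimal sketch of building the quantile-quantile pairs without a plotting library; the residual values here are made up for illustration, not taken from the slides:

```python
from statistics import NormalDist

# Hypothetical residuals for illustration only.
resid = [-0.13, 0.05, -0.02, 0.11, -0.01]
n = len(resid)
sorted_resid = sorted(resid)

# Theoretical standard-normal quantile for the i-th order statistic,
# using probability (i + 0.5) / n.
normal_q = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
qq_pairs = list(zip(normal_q, sorted_resid))
# Plot qq_pairs; points near a straight line suggest normal errors.
```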

Testing for Constant Standard Deviation
Tongue-twister: homoscedasticity
Return to the independence plot
Look for a trend in the spread
[Figure: example residual-spread plot]

Linear Regression Can Be Misleading
Regression throws away some information about the data
– To allow more compact summarization
Sometimes vital characteristics are thrown away
– Often, looking at data plots can tell you whether you will have a problem

Example of Misleading Regression
      I             II            III           IV
   x     y       x     y       x     y       x     y
  10   8.04     10   9.14     10   7.46      8   6.58
   8   6.95      8   8.14      8   6.77      8   5.76
  13   7.58     13   8.74     13  12.74      8   7.71
   9   8.81      9   8.77      9   7.11      8   8.84
  11   8.33     11   9.26     11   7.81      8   8.47
  14   9.96     14   8.10     14   8.84      8   7.04
   6   7.24      6   6.13      6   6.08      8   5.25
   4   4.26      4   3.10      4   5.39     19  12.50
  12  10.84     12   9.13     12   8.15      8   5.56
   7   4.82      7   7.26      7   6.42      8   7.91
   5   5.68      5   4.74      5   5.73      8   6.89

What Does Regression Tell Us About These Data Sets?
Exactly the same thing for each!
N = 11
Mean of y = 7.5
ŷ = 3 + 0.5x
The standard error of the regression is the same
All the sums of squares are the same
Correlation coefficient = 0.82
R² = 0.67
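These four data sets are Anscombe's classic quartet; the sketch below refits each with the least-squares formulas from earlier slides, using the standard published values, and shows that all four give (nearly) the same line:

```python
# Anscombe's quartet: four data sets with (almost) identical regression stats.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

def least_squares(xs, ys):
    """Return (b0, b1) for y = b0 + b1*x by ordinary least squares."""
    n = len(xs)
    xb, yb = sum(xs) / n, sum(ys) / n
    sxx = sum(x * x for x in xs) - n * xb * xb
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * xb * yb
    b1 = sxy / sxx
    return yb - b1 * xb, b1

fits = [least_squares(*d) for d in
        [(x123, y1), (x123, y2), (x123, y3), (x4, y4)]]
# every fit is close to b0 = 3.0, b1 = 0.5, despite wildly different shapes
```

Only the plots on the next slide reveal how different the four data sets really are.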

Now Look at the Data Plots
[Figure: scatter plots of data sets I, II, III, and IV — each has the same fitted line but a very different shape]