Published by Dwain Fields. Modified over 9 years ago.
1
SMA 6304 / MIT 2.853 / MIT 2.854 Manufacturing Systems
Lecture 10: Data and Regression Analysis
Lecturer: Prof. Duane S. Boning
Copyright 2003 © Duane S. Boning.
2
Agenda
1. Comparison of Treatments (One Variable)
   – Analysis of Variance (ANOVA)
2. Multivariate Analysis of Variance
   – Model forms
3. Regression Modeling
   – Regression fundamentals
   – Significance of model terms
   – Confidence intervals
3
Is Process B Better Than Process A?
4
Two Means with Internal Estimate of Variance
5
Comparison of Treatments
Consider multiple conditions (treatments, settings for some variable)
– There is an overall mean μ and real “effects” or deltas τ between conditions.
– We observe samples at each condition of interest.
Key question: are the observed differences in mean “significant”?
– Typical assumption (should be checked): the underlying variances are all the same, usually an unknown value ($\sigma_0^2$).
6
Steps/Issues in Analysis of Variance
1. Within group variation
   – Estimates the underlying population variance
2. Between group variation
   – Estimates the group to group variance
3. Compare the two estimates of variance
   – If there is a difference between the different treatments, then the between group variation estimate will be inflated compared to the within group estimate
   – We will be able to establish confidence in whether or not observed differences between treatments are significant
Hint: we’ll be using F tests to look at ratios of variances
7
(1) Within Group Variation
Assume that each group is normally distributed and shares a common variance $\sigma_0^2$.
– $SS_t = \sum_{j=1}^{n_t} (y_{tj} - \bar{y}_t)^2$ = sum of squared deviations within the $t$-th group (there are $k$ groups)
– Estimate of the within group variance in the $t$-th group (just the variance formula): $s_t^2 = SS_t / (n_t - 1)$
– Pool these (across different conditions) to get an estimate of the common within group variance: $s_R^2 = \frac{SS_R}{N - k} = \frac{\sum_{t=1}^{k} SS_t}{N - k}$, where $N = \sum_t n_t$
– This is the within group “mean square” (variance estimate)
8
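(2) Between Group Variation
We will be testing the hypothesis $\mu_1 = \mu_2 = \dots = \mu_k$.
– If all the means are in fact equal, then a second estimate of $\sigma_0^2$ could be formed based on the observed differences between group means: $s_T^2 = \frac{SS_T}{k - 1} = \frac{\sum_{t=1}^{k} n_t (\bar{y}_t - \bar{y})^2}{k - 1}$
– If the treatments in fact have different means, then $s_T^2$ estimates something larger: $\sigma_0^2 + \frac{\sum_{t=1}^{k} n_t \tau_t^2}{k - 1}$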
9
(3) Compare Variance Estimates
We now have two different possibilities for $s_T^2$, depending on whether the observed sample mean differences are “real” or are just occurring by chance (by sampling).
Use the F statistic to see if the ratio of these variances is likely to have occurred by chance.
Formal test for significance: if all treatment means are equal, the ratio $s_T^2 / s_R^2$ follows an $F_{k-1,\,N-k}$ distribution; a ratio much larger than 1 is evidence that the means differ.
10
(4) Compute Significance Level
Calculate the observed F ratio (with appropriate degrees of freedom in numerator and denominator).
Use the F distribution to find how likely a ratio this large is to have occurred by chance alone.
- This is our “significance level”
- If the observed ratio exceeds the critical value $F_{\alpha,\,k-1,\,N-k}$, then we say that the mean differences or treatment effects are significant to $(1-\alpha)100\%$ confidence or better
11
(5) Variance Due to Treatment Effects
We also want to estimate the sum of squared deviations from the grand mean among all samples: $SS_D = \sum_{t=1}^{k} \sum_{j=1}^{n_t} (y_{tj} - \bar{y})^2$
12
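(6) Results: The ANOVA Table
Source of variation | Sum of squares | Degrees of freedom | Mean square
Between treatments | $SS_T$ | $k - 1$ | $s_T^2 = SS_T / (k - 1)$
Within treatments | $SS_R$ | $N - k$ | $s_R^2 = SS_R / (N - k)$
Total about the grand average | $SS_D$ | $N - 1$ | $s_D^2 = SS_D / (N - 1)$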
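Steps (1) through (6) can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `one_way_anova` and the three small groups of numbers are made up for the example, and the final lookup of the F ratio in the F distribution is left to a table or a statistics package.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table entries for a list of groups."""
    k = len(groups)                            # number of treatments
    N = sum(len(g) for g in groups)            # total number of observations
    grand = sum(sum(g) for g in groups) / N    # grand average
    means = [sum(g) / len(g) for g in groups]  # treatment averages

    # (1) within-group sum of squares, pooled over all groups
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    # (2) between-group sum of squares
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    # (5) total sum of squares about the grand average
    ss_total = sum((y - grand) ** 2 for g in groups for y in g)

    ms_between = ss_between / (k - 1)          # between-treatment mean square
    ms_within = ss_within / (N - k)            # within-treatment mean square
    return {
        "ss_between": ss_between, "df_between": k - 1, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": N - k, "ms_within": ms_within,
        "ss_total": ss_total, "df_total": N - 1,
        "F": ms_between / ms_within,           # (3) ratio of the two estimates
    }


# Hypothetical data: three treatments, three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(f"{name:>12}: {value:.4f}")
```

The between and within sums of squares add up to the total sum of squares, which is a useful check on any hand calculation.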
13
Example: ANOVA
14
ANOVA – Implied Model
The ANOVA approach assumes a simple mathematical model: $y_{tj} = \mu_t + \epsilon_{tj} = \mu + \tau_t + \epsilon_{tj}$
– where $\mu_t$ is the treatment mean (for treatment type $t$)
– and $\tau_t$ is the treatment effect
– with $\epsilon_{tj}$ being zero mean normal residuals, $\epsilon_{tj} \sim N(0, \sigma_0^2)$
Checks
- Plot residuals against time order
- Examine distribution of residuals: should be IID, Normal
- Plot residuals vs. estimates
- Plot residuals vs. other variables of interest
15
MANOVA – Two Dependencies
Can extend to two (or more) variables of interest. MANOVA assumes a mathematical model, again simply capturing the means (or treatment offsets) for each discrete variable level: $y_{ti} = \mu + \tau_t + \beta_i + \epsilon_{ti}$
Assumes that the effects from the two variables are additive
16
Example: Two Factor MANOVA
Two LPCVD deposition tube types, three gas suppliers. Does supplier matter in average particle counts on wafers?
Experiment: 3 lots on each tube, for each gas; report average # particles added
17
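MANOVA – Two Factors with Interactions
May be interaction: not simply additive; effects may depend synergistically on both factors: $y_{tij} = \mu_{ti} + \epsilon_{tij}$
– $\omega_{ti}$: an effect that depends on both the $t$ and $i$ factors simultaneously
– $t$ = first factor = 1, 2, …, $k$ ($k$ = # levels of first factor)
– $i$ = second factor = 1, 2, …, $n$ ($n$ = # levels of second factor)
– $j$ = replication = 1, 2, …, $m$ ($m$ = # replications at the $(t,i)$-th combination of factor levels)
Can split out the model more explicitly: $y_{tij} = \mu + \tau_t + \beta_i + \omega_{ti} + \epsilon_{tij}$
Estimate by: $\hat{\mu} = \bar{y}$, $\hat{\tau}_t = \bar{y}_t - \bar{y}$, $\hat{\beta}_i = \bar{y}_i - \bar{y}$, $\hat{\omega}_{ti} = \bar{y}_{ti} - \bar{y}_t - \bar{y}_i + \bar{y}$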
18
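(6) Results: The ANOVA Table
Source of variation | Sum of squares | Degrees of freedom | Mean square
Between levels of factor 1 (T) | $SS_T$ | $k - 1$ | $s_T^2 = SS_T / (k - 1)$
Between levels of factor 2 (B) | $SS_B$ | $n - 1$ | $s_B^2 = SS_B / (n - 1)$
Interaction | $SS_I$ | $(k - 1)(n - 1)$ | $s_I^2 = SS_I / [(k - 1)(n - 1)]$
Within groups (error) | $SS_E$ | $kn(m - 1)$ | $s_E^2 = SS_E / [kn(m - 1)]$
Total about the grand average | $SS_D$ | $knm - 1$ |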
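The two-factor decomposition with interaction can be sketched as follows. This is an illustrative sketch under stated assumptions: a balanced design (the same number of replicates m in every cell), a made-up function name `two_factor_anova`, and hypothetical numbers rather than the particle-count data from the lecture.

```python
def average(values):
    return sum(values) / len(values)


def two_factor_anova(data):
    """data[t][i] holds the m replicates at level t of factor 1, level i of factor 2."""
    k, n, m = len(data), len(data[0]), len(data[0][0])

    cell = [[average(data[t][i]) for i in range(n)] for t in range(k)]
    row = [average(cell[t]) for t in range(k)]                        # factor 1 averages
    col = [average([cell[t][i] for t in range(k)]) for i in range(n)]  # factor 2 averages
    grand = average(row)                                              # grand average

    ss_t = n * m * sum((r - grand) ** 2 for r in row)
    ss_b = k * m * sum((c - grand) ** 2 for c in col)
    ss_i = m * sum((cell[t][i] - row[t] - col[i] + grand) ** 2
                   for t in range(k) for i in range(n))
    ss_e = sum((y - cell[t][i]) ** 2
               for t in range(k) for i in range(n) for y in data[t][i])
    ss_d = sum((y - grand) ** 2
               for t in range(k) for i in range(n) for y in data[t][i])

    return {
        "SS_T": ss_t, "df_T": k - 1,
        "SS_B": ss_b, "df_B": n - 1,
        "SS_I": ss_i, "df_I": (k - 1) * (n - 1),
        "SS_E": ss_e, "df_E": k * n * (m - 1),
        "SS_D": ss_d, "df_D": k * n * m - 1,
    }


# Hypothetical balanced data: 2 levels x 2 levels x 2 replicates.
demo = [[[1, 3], [2, 4]],
        [[5, 7], [10, 12]]]
result = two_factor_anova(demo)
print(result)
```

Each mean square is the sum of squares divided by its degrees of freedom; the factor and interaction mean squares are then compared against the error mean square with F ratios, as in the one-variable case.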
19
Measures of Model Goodness – $R^2$
Goodness of fit – $R^2$
- Question considered: how much better does the model do than just using the grand average?
- Think of this as the fraction of squared deviations (from the grand average) in the data which is captured by the model: $R^2 = 1 - SS_R / SS_D$
Adjusted $R^2$
- For “fair” comparison between models with different numbers of coefficients, an alternative is often used: $R^2_{adj} = 1 - \frac{SS_R / \nu_R}{SS_D / \nu_D}$
- Think of this as (1 – variance remaining in the residual); recall that each variance estimate is a sum of squares divided by its degrees of freedom
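A minimal sketch of both measures, assuming the residual degrees of freedom are n minus the number of fitted coefficients (including any intercept). The function name `r_squared` and the four data points are hypothetical.

```python
def r_squared(y, y_hat, n_params):
    """Return (R^2, adjusted R^2) for observations y and model predictions y_hat."""
    n = len(y)
    y_bar = sum(y) / n
    ss_d = sum((yi - y_bar) ** 2 for yi in y)               # about the grand average
    ss_r = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # residual
    r2 = 1 - ss_r / ss_d
    # Adjusted form compares variances (sum of squares / degrees of freedom).
    r2_adj = 1 - (ss_r / (n - n_params)) / (ss_d / (n - 1))
    return r2, r2_adj


# Hypothetical observations and predictions from a two-coefficient model.
r2, r2_adj = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], n_params=2)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {r2_adj:.3f}")
```

The adjusted value is never larger than the plain value, because the residual variance is penalized for the degrees of freedom the model consumes.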
20
Regression Fundamentals
Use least square error as measure of goodness to estimate coefficients in a model
One parameter model:
– Model form
– Squared error
– Estimation using normal equations
– Estimate of experimental error
– Precision of estimate: variance in b
– Confidence interval for β
– Analysis of variance: significance of b
– Lack of fit vs. pure error
Polynomial regression
21
Regression Fundamentals
We use least squares to estimate coefficients in typical regression models
One-Parameter Model: $y_i = \beta x_i + \epsilon_i$
Goal is to estimate β with the “best” b
How do we define “best”?
- That b which minimizes the sum of squared error between prediction and data: $SS(\beta) = \sum_{i=1}^{n} (y_i - \beta x_i)^2$
- The residual sum of squares (for the best estimate) is $SS_R = SS(b) = \sum_{i=1}^{n} (y_i - b x_i)^2$
22
Regression Fundamentals
Least squares estimation via normal equations
- For linear problems, we need not calculate SS(β); rather, direct solution for b is possible
- Recognize that the vector of residuals will be normal to the vector of x values at the least squares estimate: $\sum_i (y_i - b x_i) x_i = 0$, so $b = \frac{\sum_i x_i y_i}{\sum_i x_i^2}$
Estimate of experimental error
- Assuming model structure is adequate, an estimate $s^2$ of $\sigma^2$ can be obtained: $s^2 = \frac{SS_R}{n - 1}$
23
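Precision of Estimate: Variance in b
We can calculate the variance in our estimate of the slope, b: $\mathrm{Var}(b) \approx \frac{s^2}{\sum_i x_i^2}$, so the standard error is $\mathrm{s.e.}(b) = \sqrt{s^2 / \sum_i x_i^2}$
Why? b is a linear combination of the observations, $b = \sum_i \frac{x_i}{\sum_l x_l^2} y_i$, and each $y_i$ has independent variance $\sigma^2$, so $\mathrm{Var}(b) = \sum_i \left( \frac{x_i}{\sum_l x_l^2} \right)^2 \sigma^2 = \frac{\sigma^2}{\sum_i x_i^2}$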
24
Confidence Interval for β
Once we have the standard error in b, we can calculate confidence intervals to some desired $(1-\alpha)100\%$ level of confidence: $b \pm t_{\alpha/2,\,\nu} \cdot \mathrm{s.e.}(b)$
Analysis of variance
- Test hypothesis: $H_0: \beta = 0$
- If the confidence interval for β includes 0, then β is not significant
- Degrees of freedom (needed in order to use the t distribution): $\nu = n - p$, where p = # parameters estimated by least squares
25
Example Regression
Note that this simple model assumes an intercept of zero: the model must go through the origin
We will relax this requirement soon
Age | Income
8 | 6.16
22 | 9.88
35 | 14.35
40 | 24.06
57 | 30.34
73 | 32.17
78 | 42.18
87 | 43.23
98 | 48.76
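The one-parameter fit can be reproduced on the Age/Income data in a few lines. This is a sketch rather than the lecture's own calculation: the function name `fit_through_origin` is made up, and the t value 2.306 is the tabulated two-sided 95% point for 8 degrees of freedom, passed in by hand so that no statistics package is needed.

```python
import math

age = [8, 22, 35, 40, 57, 73, 78, 87, 98]
income = [6.16, 9.88, 14.35, 24.06, 30.34, 32.17, 42.18, 43.23, 48.76]


def fit_through_origin(x, y, t_crit):
    """Least squares fit of y = b*x, with a confidence interval for the slope."""
    n = len(x)
    sum_xx = sum(xi * xi for xi in x)
    b = sum(xi * yi for xi, yi in zip(x, y)) / sum_xx  # normal equation solution
    residuals = [yi - b * xi for xi, yi in zip(x, y)]
    ss_r = sum(r * r for r in residuals)               # residual sum of squares
    s2 = ss_r / (n - 1)                                # one parameter estimated
    se_b = math.sqrt(s2 / sum_xx)                      # standard error of b
    interval = (b - t_crit * se_b, b + t_crit * se_b)
    return b, se_b, interval, residuals


T_CRIT_95_DF8 = 2.306  # assumption: looked up in a t table (alpha = 0.05, 8 df)
b, se_b, (lo, hi), residuals = fit_through_origin(age, income, T_CRIT_95_DF8)
print(f"b = {b:.4f}, s.e.(b) = {se_b:.4f}, 95% CI = ({lo:.4f}, {hi:.4f})")
```

If the interval excludes zero, the slope is significant at the chosen confidence level. The residuals are also orthogonal to the x values, which is the normal-equation condition itself.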
26
Lack of Fit Error vs. Pure Error
Sometimes we have replicated data
- E.g. multiple runs at the same x values in a designed experiment
We can decompose the residual error contributions: $SS_R = SS_L + SS_E$
where
- $SS_R$ = residual sum of squares error
- $SS_L$ = lack of fit squared error
- $SS_E$ = pure replicate error
This allows us to TEST for lack of fit
- By “lack of fit” we mean evidence that the linear model form is inadequate
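The decomposition can be sketched as below. The assumptions are mine, not the lecture's: a straight line with an intercept is fitted, replicates are identified by exactly equal x values, and both the function name `lack_of_fit` and the six data points are hypothetical.

```python
from collections import defaultdict


def lack_of_fit(x, y):
    """Split the residual sum of squares of a straight-line fit into
    lack-of-fit and pure-error parts, and form their F ratio."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    groups = defaultdict(list)              # replicates collected per x value
    for xi, yi in zip(x, y):
        groups[xi].append(yi)

    ss_e = 0.0                              # pure replicate error
    ss_l = 0.0                              # lack of fit
    for xg, ys in groups.items():
        mean_g = sum(ys) / len(ys)
        ss_e += sum((yi - mean_g) ** 2 for yi in ys)
        ss_l += len(ys) * (mean_g - (intercept + slope * xg)) ** 2

    ss_r = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df_l = len(groups) - 2                  # distinct x values minus 2 parameters
    df_e = n - len(groups)                  # replicates beyond one per x value
    f_ratio = (ss_l / df_l) / (ss_e / df_e)
    return {"SS_R": ss_r, "SS_L": ss_l, "SS_E": ss_e,
            "df_L": df_l, "df_E": df_e, "F": f_ratio}


# Hypothetical replicated data: two runs at each of three x settings.
lof = lack_of_fit([1, 1, 2, 2, 3, 3], [1, 2, 4, 5, 3, 4])
print(lof)
```

A large F ratio, judged against the F distribution with (df_L, df_E) degrees of freedom, indicates that the straight-line form is inadequate.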
27
Regression: Mean Centered Models
Model form: $y_i = \alpha + \beta (x_i - \bar{x}) + \epsilon_i$
28
Regression: Mean Centered Models
Confidence Intervals
Our confidence interval on y widens as we get further from the center of our data!
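The widening of the interval can be seen numerically with the Age/Income data from the earlier example. This sketch assumes the standard result for a mean centered straight-line fit, Var(ŷ) = s²(1/n + (x₀ − x̄)²/Σ(xᵢ − x̄)²); the function names are made up, and 2.365 is the tabulated two-sided 95% t value for 7 degrees of freedom.

```python
import math

age = [8, 22, 35, 40, 57, 73, 78, 87, 98]
income = [6.16, 9.88, 14.35, 24.06, 30.34, 32.17, 42.18, 43.23, 48.76]


def fit_mean_centered(x, y):
    """Least squares fit of y = a + b*(x - x_bar)."""
    n = len(x)
    x_bar = sum(x) / n
    a = sum(y) / n                                   # intercept is the mean of y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / sxx
    ss_r = sum((yi - a - b * (xi - x_bar)) ** 2 for xi, yi in zip(x, y))
    s2 = ss_r / (n - 2)                              # two parameters estimated
    return {"a": a, "b": b, "x_bar": x_bar, "sxx": sxx, "s2": s2, "n": n}


def half_width(fit, x0, t_crit):
    """Half-width of the confidence interval on the mean response at x0."""
    var_mean = fit["s2"] * (1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"])
    return t_crit * math.sqrt(var_mean)


T_CRIT_95_DF7 = 2.365  # assumption: looked up in a t table (alpha = 0.05, 7 df)
fit = fit_mean_centered(age, income)
for x0 in (fit["x_bar"], 98, 150):
    y_hat = fit["a"] + fit["b"] * (x0 - fit["x_bar"])
    print(f"x = {x0:6.1f}: y_hat = {y_hat:6.2f} +/- {half_width(fit, x0, T_CRIT_95_DF7):.2f}")
```

The interval is narrowest at the mean of x and grows as the prediction point moves away from it, including when extrapolating beyond the data.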
29
Polynomial Regression
We may believe that a higher order model structure applies. Polynomial forms are also linear in the coefficients and can be fit with least squares
Example: Growth rate data
30
Regression Example: Growth Rate Data
Replicated data provides opportunity to check for lack of fit
31
Growth Rate - First Order Model
- Mean significant, but linear term not
- Clear evidence of lack of fit
32
Growth Rate - Second Order Model
- No evidence of lack of fit
- Quadratic term significant
33
Polynomial Regression In Excel
- Create additional input columns for each input
- Use "Data Analysis" and "Regression" tool
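The same idea, adding a column for each power of the input and then running an ordinary linear least squares fit, can be sketched outside a spreadsheet. This is a minimal pure-Python illustration that solves the normal equations by Gaussian elimination; the function name `polyfit_least_squares` and the demo data are hypothetical, and a numerical library would normally be preferred for ill-conditioned problems.

```python
def polyfit_least_squares(x, y, degree):
    """Fit y = b0 + b1*x + ... + bd*x^d by solving the normal equations."""
    cols = degree + 1
    # Extra input columns: 1, x, x^2, ... (one row per observation).
    X = [[xi ** p for p in range(cols)] for xi in x]

    # Augmented normal equations: (X^T X) b = X^T y.
    A = [[sum(row[r] * row[c] for row in X) for c in range(cols)]
         + [sum(row[r] * yi for row, yi in zip(X, y))]
         for r in range(cols)]

    # Gaussian elimination with partial pivoting.
    for i in range(cols):
        pivot = max(range(i, cols), key=lambda r: abs(A[r][i]))
        A[i], A[pivot] = A[pivot], A[i]
        for r in range(i + 1, cols):
            factor = A[r][i] / A[i][i]
            for c in range(i, cols + 1):
                A[r][c] -= factor * A[i][c]

    # Back substitution.
    b = [0.0] * cols
    for i in reversed(range(cols)):
        tail = sum(A[i][c] * b[c] for c in range(i + 1, cols))
        b[i] = (A[i][cols] - tail) / A[i][i]
    return b


# Hypothetical noisy data with visible curvature.
x_demo = [10, 10, 15, 20, 20, 25, 25, 30, 35]
y_demo = [7.3, 7.8, 8.6, 8.9, 9.1, 9.0, 9.2, 8.8, 8.1]
coeffs = polyfit_least_squares(x_demo, y_demo, degree=2)
print("b0, b1, b2 =", [round(c, 4) for c in coeffs])
```

Because the model is linear in the coefficients, the significance tests and lack-of-fit checks described for the straight-line case carry over unchanged.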
34
Polynomial Regression
Generated using JMP package
35
Summary
- Comparison of Treatments – ANOVA
- Multivariate Analysis of Variance
- Regression Modeling
Next Time
- Time Series Models
- Forecasting