Published by Jefferson Lynd. Modified over 2 years ago.

2
Estimating the accuracy of the approximation (surrogate)
From the assumption that the error is due to normally distributed, uncorrelated random variables, we get an estimate of the error standard deviation (called the standard error). This is the standard measure of accuracy.
The coefficient of multiple determination measures how much of the variability in the data is captured by the approximation.
The adjusted coefficient of multiple determination accounts for the fitting bias.
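The three measures named on this slide can be computed directly from the residuals of a least-squares fit. The following is a minimal sketch in base MATLAB, assuming a straight-line model and synthetic data; the data and the variable names (nb, sigma_hat, R2, R2adj) are illustrative and do not come from the slides.

```matlab
% Sketch: standard error, R^2 and adjusted R^2 for a linear fit.
rng(0);                          % fixed seed so the run is repeatable
x = (1:30)';                     % 30 data points
y = x + randn(30,1);             % assumed test data: y = x plus unit noise

X = [ones(size(x)), x];          % model matrix for y = b1 + b2*x
b = X \ y;                       % least-squares coefficients
r = y - X*b;                     % residuals

ny = numel(y);                   % number of data points
nb = numel(b);                   % number of fitted coefficients

SSe = sum(r.^2);                 % error (residual) sum of squares
SSt = sum((y - mean(y)).^2);     % total variation about the mean

sigma_hat = sqrt(SSe/(ny - nb)); % standard error (unbiased estimate)
R2    = 1 - SSe/SSt;             % coefficient of multiple determination
R2adj = 1 - (SSe/(ny - nb))/(SSt/(ny - 1));   % adjusted for fitting bias

fprintf('sigma_hat = %.4f, R2 = %.4f, R2adj = %.4f\n', sigma_hat, R2, R2adj);
```

Dividing by ny - nb rather than ny is what corrects for the fitting bias: the fitted coefficients absorb part of the noise, so the raw residuals understate the error.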

3
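Curve fit
noise=randn(1,30); x=1:1:30; y=x+noise   % straight line plus unit normal noise
3.908 2.825 4.379 2.942 4.5314 5.7275 8.098 … 25.84 27.47 27.00 30.96
[p,s]=polyfit(x,y,1);                    % least-squares straight-line fit
yfit=polyval(p,x);                       % fitted values at the data points
plot(x,y,'+',x,x,'r',x,yfit,'b')         % data (+), true line (red), fit (blue)
With dense data, the functional form is clear. The fit serves to filter out noise.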

4
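Example with y=0.1*x
noise=randn(1,30); x=1:1:30; y=0.1*x+noise;
xx=[ones(30,1),x'];                      % column of ones for the intercept
[B,BINT,R,RINT,STATS] = regress(y',xx)
STATS = 0.3016 12.0896 0.0017 1.7498
The four STATS entries are, in order, the R-square statistic, the F statistic, its p-value, and the estimate of the error variance.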

5
Estimating error in coefficients
Some coefficients are estimated more accurately than others.
The standard error in coefficient $b_i$ is $se(b_i) = \hat{\sigma}\sqrt{[(X^TX)^{-1}]_{ii}}$, where $X$ is the matrix of the fit and $\hat{\sigma}$ is the standard error of the fit.
The t-statistic is the ratio of the coefficient to its standard error; we would like its magnitude to be at least 2.
Coefficients that are poorly estimated may be dropped to improve the accuracy of predictions.
Dropping one coefficient changes the t-statistics of the others, so we need to iterate in dropping and adding coefficients.
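The standard errors and t-statistics described above can be sketched in a few lines of base MATLAB. The example assumes synthetic data of the same form as the y=0.1*x example and uses the usual least-squares covariance estimate, the error variance times inv(X'*X); the names sigma2, C, se and t are illustrative.

```matlab
% Sketch: standard errors and t-statistics of regression coefficients.
rng(1);                          % fixed seed so the run is repeatable
x = (1:30)';
y = 0.1*x + randn(30,1);         % assumed data: shallow slope, unit noise

X = [ones(size(x)), x];          % model matrix for y = b1 + b2*x
b = X \ y;                       % least-squares coefficients
r = y - X*b;                     % residuals
dof = numel(y) - numel(b);       % degrees of freedom
sigma2 = sum(r.^2)/dof;          % estimate of the error variance

C  = sigma2 * inv(X'*X);         % estimated covariance matrix of b
se = sqrt(diag(C));              % standard error of each coefficient
t  = b ./ se;                    % t-statistic of each coefficient

names = {'Intercept', 'Slope'};
for i = 1:numel(b)
    fprintf('%-10s b = %8.4f   se = %7.4f   t = %7.2f\n', ...
            names{i}, b(i), se(i), t(i));
end
```

A coefficient whose t-statistic is well below 2 in magnitude is a candidate for dropping; after dropping it, the fit and the remaining t-statistics have to be recomputed.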

6
Regression in Excel (add-in data analysis)

Rand       Rand-0.5    x    y          fit        error
0.764742    0.264742    1    1.264742   1.03539    0.03539
0.258649   -0.24135     2    1.758649   2.031192   0.031192
0.735026    0.235026    3    3.235026   3.026994   0.026994
0.411036   -0.08896     4    3.911036   4.022797   0.022797
...
0.674921    0.174921   24   24.17492   23.93884   -0.06116
0.69481     0.19481    25   25.19481   24.93465   -0.06535
0.647964    0.147964   26   26.14796   25.93045   -0.06955
0.407839   -0.09216    27   26.90784   26.92625   -0.07375
0.211674   -0.28833    28   27.71167   27.92205   -0.07795
0.405013   -0.09499    29   28.90501   28.91786   -0.08214
0.242633   -0.25737    30   29.74263   29.91366   -0.08634

7
Regression output
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.999381
R Square            0.998763
Adjusted R Square   0.998719
Standard Error      0.313962
Observations        30

               Coefficients   Standard Error   t Stat       P-value    Lower 95%   Upper 95%
Intercept      0.039587       0.117570         0.336711     0.738845   -0.201245   0.280419
X Variable 1   0.995802       0.006623         150.364623   2.93E-42    0.982237   1.009368

8
Output with y=0.1x
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.969193
R Square            0.939334
Adjusted R Square   0.937168
Standard Error      0.251021
Observations        30

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%
Intercept      -0.19083       0.094            -2.03012   0.051942   -0.3834     0.0017
X Variable 1    0.11025       0.005295         20.82177   1.41E-18    0.0994     0.1211

9
Example 3.2.1
Given the data

X   -2     0    1      2
Y   -1.5   0    1.25   1.75

use Microsoft Excel to fit linear and quadratic polynomials.
Compare the standard errors and t-statistics of the coefficients.
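For readers working in MATLAB rather than Excel, the same comparison can be sketched with polyfit, whose second output supports the covariance estimate (Rinv*Rinv')*normr^2/df described in the polyfit documentation. The four data points below are taken as read from the flattened slide text, which is an assumption about how the table was laid out. This is one possible approach, not the procedure used in the slides.

```matlab
% Sketch: linear vs. quadratic fit of the Example 3.2.1 data with polyfit.
x = [-2 0 1 2];                  % data as read from the slide (assumed parse)
y = [-1.5 0 1.25 1.75];

for n = 1:2                      % n = 1 linear, n = 2 quadratic
    [p, S] = polyfit(x, y, n);   % p holds coefficients, highest power first
    Rinv = inv(S.R);             % inverse of the triangular QR factor
    C  = (Rinv*Rinv') * S.normr^2 / S.df;   % covariance estimate of p
    se = sqrt(diag(C))';         % standard errors, same order as p
    t  = p ./ se;                % t-statistics
    fprintf('degree %d: standard error of fit = %.4f\n', ...
            n, S.normr/sqrt(S.df));
    fprintf('  coefficient %8.4f   se %7.4f   t %8.2f\n', [p; se; t]);
end
```

With only four points the quadratic fit leaves a single degree of freedom, so its standard errors rest on very little data and should be read with caution.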

10
Linear fit

11
Quadratic fit

12
Graphical comparison.

13
Cross validation
Error estimates based on model assumptions are vulnerable. For polynomial response surface approximations, the assumptions are rarely satisfied.
Cross validation divides the data into $n_g$ groups.
Fit the approximation to $n_g - 1$ groups, and use the last group to estimate the error. Repeat for each group.
When each group consists of one point, the error is called PRESS (prediction error sum of squares).
Calculate the error at each point and then present the r.m.s. error.
It can be shown that the cross-validation error at point $i$ is $e_i/(1 - E_{ii})$, where $e_i$ is the ordinary residual and $E = X(X^TX)^{-1}X^T$.
This formula can be used only if the fit is not ill-conditioned.
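The leave-one-out procedure can be sketched as follows in base MATLAB, assuming a straight-line model and synthetic data; the names e_xv, PRESS, press_rms and e_short are illustrative. The last lines compare the explicit loop with the standard leave-one-out identity, in which the cross-validation error at a point equals the ordinary residual divided by one minus the corresponding diagonal entry of X*inv(X'*X)*X'.

```matlab
% Sketch: leave-one-out cross validation (PRESS) for a linear fit.
rng(2);                          % fixed seed so the run is repeatable
x = (1:30)';
y = x + randn(30,1);             % assumed test data: y = x plus unit noise
X = [ones(size(x)), x];          % model matrix for y = b1 + b2*x
ny = numel(y);

e_xv = zeros(ny,1);              % cross-validation error at each point
for i = 1:ny
    keep = true(ny,1);
    keep(i) = false;             % leave point i out
    b_i = X(keep,:) \ y(keep);   % fit to the remaining ny-1 points
    e_xv(i) = y(i) - X(i,:)*b_i; % prediction error at the left-out point
end

PRESS     = sum(e_xv.^2);        % prediction error sum of squares
press_rms = sqrt(PRESS/ny);      % r.m.s. cross-validation error

% Shortcut from a single fit: e_xv(i) = e(i)/(1 - h(i))
b = X \ y;
e = y - X*b;                     % ordinary residuals
h = diag(X * ((X'*X) \ X'));     % diagonal of X*inv(X'*X)*X'
e_short = e ./ (1 - h);

fprintf('PRESS rms = %.4f, max difference from shortcut = %.2e\n', ...
        press_rms, max(abs(e_xv - e_short)));
```

The shortcut avoids refitting ny times, but it relies on solving with X'*X, which is why it becomes unreliable when that matrix is ill-conditioned.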

14
Questions
The pairs (0,0), (1,1), (2,1) represent strain (millistrains) and stress (ksi) measurements.
Estimate Young's modulus using the three commonly used error norms.
Estimate the error in Young's modulus using cross validation.
