Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Estimation of constant-CV regression models Alan H. Feiveson NASA – Johnson Space Center Houston, TX SNASUG 2008 Chicago, IL.

Similar presentations


Presentation on theme: "1 Estimation of constant-CV regression models Alan H. Feiveson NASA – Johnson Space Center Houston, TX SNASUG 2008 Chicago, IL."— Presentation transcript:

1 1 Estimation of constant-CV regression models Alan H. Feiveson NASA – Johnson Space Center Houston, TX SNASUG 2008 Chicago, IL

2 2 y i =  0 +  1 x i + e i V( e i ) =  2 Variance Models with Simple Linear Regression y i =  0 +  1 x i + e i y i =  0 +  1 x i + e i V( e i ) =  2 (  0 +  1 x i ) 2 y = X  + Zu

3 3.clear. set obs 100. gen x=10*uniform(). gen mu = 1+.5*x. replace y=mu+.10*mu*invnorm(uniform()) Example:  0 = 1.0,  1 = 0.5,  = 0.10 y i =  0 +  1 x i + e i V(e i ) =  2 (  0 +  1 x i ) 2 Problem: Estimate  0,  1, and 

4 4 Variance Stabilization y i =  0 +  1 x i + e i V(e i ) =  2 (  0 +  1 x i ) 2 But E(log y i ) = g(     1, x i )

5 5 Source | SS df MS Number of obs = 100 -------------+------------------------------ F( 2, 97) = 630.65 Model | 15.9635043 2 7.98175216 Prob > F = 0.0000 Residual | 1.22767564 97.01265645 R-squared = 0.9286 -------------+------------------------------ Adj R-squared = 0.9271 Total | 17.19118 99.173648282 Root MSE =.1125 ------------------------------------------------------------------------------ z | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- x |.2707448.018873 14.35 0.000.2332872.3082025 x2 | -.0110946.0017622 -6.30 0.000 -.0145921 -.0075972 _cons |.1783796.0441957 4.04 0.000.0906634.2660958 ------------------------------------------------------------------------------. gen z = log(y). gen x2 = x*x. reg z x x2. predict z_hat Approximate g(     1, x i ) by polynomial in x, then do OLS regression of log y on x:  0 = 1.0,  1 = 0.5,  = 0.10

6 6 But what about  0 and  1 ?

7 7 reg y x predict muh reg y x [w=1/muh^2].local rmse=e(rmse).gen wt = 1/muh^2.summ wt.local wbar=r(mean).local sigh = sqrt(`wbar’)*`rmse’ Alternative: Iteratively re-weighted regression

8 8 ITERATION 0. reg y x Source | SS df MS Number of obs = 100 -------------+------------------------------ F( 1, 98) = 832.95 Model | 172.307709 1 172.307709 Prob > F = 0.0000 Residual | 20.272763 98.206864928 R-squared = 0.8947 -------------+------------------------------ Adj R-squared = 0.8937 Total | 192.580472 99 1.94525729 Root MSE =.45482 ------------------------------------------------------------------------------ y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- x |.518296.0179585 28.86 0.000.482658.5539339 _cons |.9530661.1014154 9.40 0.000.7518106 1.154322 ------------------------------------------------------------------------------. gen wt = 1/(_b[_cons] + _b[x]*x)^2

9 9 ITERATION 1. reg y x [w=wt] (analytic weights assumed) (sum of wgt is 1.3046e+01) Source | SS df MS Number of obs = 100 -------------+------------------------------ F( 1, 98) = 1278.10 Model | 123.561274 1 123.561274 Prob > F = 0.0000 Residual | 9.47421712 98.096675685 R-squared = 0.9288 -------------+------------------------------ Adj R-squared = 0.9281 Total | 133.035492 99 1.34379284 Root MSE =.31093 ------------------------------------------------------------------------------ y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- x |.510555.014281 35.75 0.000.4822147.5388952 _cons |.9877915.0533903 18.50 0.000.8818402 1.093743 ------------------------------------------------------------------------------. replace wt = 1/(_b[_cons] + _b[x]*x)^2 (100 real changes made)

10 10 ITERATION 2. reg y x [w=wt] (analytic weights assumed) (sum of wgt is 1.2842e+01) Source | SS df MS Number of obs = 100 -------------+------------------------------ F( 1, 98) = 1271.77 Model | 124.885941 1 124.885941 Prob > F = 0.0000 Residual | 9.62343467 98.098198313 R-squared = 0.9285 -------------+------------------------------ Adj R-squared = 0.9277 Total | 134.509376 99 1.35868057 Root MSE =.31337 ------------------------------------------------------------------------------ y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- x |.510746.0143219 35.66 0.000.4823247.5391674 _cons |.9870233.0540361 18.27 0.000.8797904 1.094256 ------------------------------------------------------------------------------. replace wt = 1/(_b[_cons] + _b[x]*x)^2 (100 real changes made)

11 11 ITERATION 3. noi reg y x [w=wt] (analytic weights assumed) (sum of wgt is 1.2846e+01) Source | SS df MS Number of obs = 100 -------------+------------------------------ F( 1, 98) = 1271.92 Model | 124.855967 1 124.855967 Prob > F = 0.0000 Residual | 9.6200215 98.098163485 R-squared = 0.9285 -------------+------------------------------ Adj R-squared = 0.9277 Total | 134.475988 99 1.35834331 Root MSE =.31331 ------------------------------------------------------------------------------ y | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- x |.5107417.0143209 35.66 0.000.4823223.5391612 _cons |.9870406.0540213 18.27 0.000.8798371 1.094244 ------------------------------------------------------------------------------. summ wt Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- wt | 100.1284623.1209472.0269917.5841107. local wbar=r(mean). noi di e(rmse)*sqrt(`wbar').11229563  0 = 1.0,  1 = 0.5,  = 0.10

12 12 Can we do this using -xtmixed- ?. xtmixed y x ||???: x How do we get –xtmixed- to estimate a non-constant residual variance? Degenerate dependency of random effects (u 0i = u 1i ). Coefficients of random intercept and slope (c 0 and c 1 ) need to be constrained. y i =  0 +  1 x i +  (  0 +  1 x i )u i =  0 +  1 x i + c 0 u 0i + c 1 x i u 1i where u 0i = u 1i and c 1 /c 0 =  1 /  0

13 13 y i =  0 +  1 x i + c 0 u 0i + c 1 x i u 1i Can we do this using -xtmixed- ? set obs 1000 gen x = 5*uniform() gen mu = 3+1.4*x gen u0=invnorm(uniform()) gen u1=invnorm(uniform()) gen y = mu + 0.05*u0 + 0.50*x*u1 How do we get –xtmixed- to estimate a non-constant residual variance? Ex. 1: Ignore dependency of u’s and constraints on c’s: gen ord=_n xtmixed y x ||ord: x,noc

14 14. xtmixed y x ||ord: x,noc nolog Mixed-effects REML regression Number of obs = 1000 Group variable: ord Number of groups = 1000 Obs per group: min = 1 avg = 1.0 max = 1 Wald chi2(1) = 5745.72 Log restricted-likelihood = -1444.8061 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ y | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x | 1.404595.0185301 75.80 0.000 1.368276 1.440913 _cons | 3.0182.0116741 258.54 0.000 2.99532 3.041081 ------------------------------------------------------------------------------ Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval] -----------------------------+------------------------------------------------ ord: Identity | sd(x) |.5058616.0118386.4831824.5296053 -----------------------------+------------------------------------------------ sd(Residual) |.0495162.0143015.0281125.0872159 ------------------------------------------------------------------------------ LR test vs. linear regression: chibar2(01) = 718.85 Prob >= chibar2 = 0.0000  0 = 3.0,  1 = 1.4, c 0 = 0.05, c 1 = 0.50

15 15 y i =  0 +  1 x i + c 1 z i u 1i Can we do this using -xtmixed- ? set obs 1000 gen x = 5*uniform() gen z = 3 + 1.4*x gen u1=invnorm(uniform()) gen y = 3 + 1.4*x + 0.50*z*u1 gen ord=_n xtmixed y x ||ord: z,noc How do we get –xtmixed- to estimate a non-constant residual variance? Ex. 2: No random intercept, covariate known:

16 16 Can we do this using -xtmixed- ? Performing EM optimization: Performing gradient-based optimization: Iteration 0: log restricted-likelihood = -2506.6255 numerical derivatives are approximate flat or discontinuous region encountered Iteration 1: log restricted-likelihood = -2503.1832 numerical derivatives are approximate Garbage! y i =  0 +  1 x i + c 1 z i u 1i How do we get –xtmixed- to estimate a non-constant residual variance? Ex. 2: No random intercept, covariate known: xtmixed y x ||ord: z,noc

17 17 expand 3 sort ord gen yf=y +.001*invnorm(uniform()) xtmixed yf x ||ord: z,noc nolog Can we do this using -xtmixed- ? How do we get –xtmixed- to estimate a non-constant residual variance? Ex. 2: No random intercept, covariate known: y i =  0 +  1 x i + c 1 z i u 1i

18 18. xtmixed yf x ||ord: z,noc nolog Mixed-effects REML regression Number of obs = 3000 Group variable: ord Number of groups = 1000 Obs per group: min = 3 avg = 3.0 max = 3 Wald chi2(1) = 484.13 Log restricted-likelihood = 7952.0717 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ yf | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x | 1.393598.063337 22.00 0.000 1.26946 1.517736 _cons | 3.071291.1252851 24.51 0.000 2.825737 3.316846 ------------------------------------------------------------------------------ Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval] -----------------------------+------------------------------------------------ ord: Identity | sd(z) |.4814862.0107771.46082.5030792 -----------------------------+------------------------------------------------ sd(Residual) |.0009896.0000156.0009594.0010207 ------------------------------------------------------------------------------ LR test vs. linear regression: chibar2(01) = 31334.04 Prob >= chibar2 = 0.0000  0 = 3.0,  1 = 1.4, c 1 = 0.50

19 19 Can we do this using -xtmixed- ? Degenerate dependency of random effects (u 0i = u 1i ). Coefficients of random intercept and slope (c 0 and c 1 ) need to be constrained. y i =  0 +  1 x i +  (  0 +  1 x i )u i =  0 +  1 x i + c 0 u 0i + c 1 x i u 1i where u 0i = u 1i and c 1 /c 0 =  1 /  0 Ex 3: No random intercept (unknown covariate)

20 20 Can we do this using -xtmixed- ? Degenerate dependency of random effects (u 0i = u 1i ). Coefficients of random intercept and slope (c 0 and c 1 ) need to be constrained. y i =  0 +  1 x i +  (  0 +  1 x i )u i =  0 +  1 x i + c 0 u 0i + c 1 x i u 1i where u 0i = u 1i and c 1 /c 0 =  1 /  0 Recast model with one error term and pretend z i =  0 +  1 x i is known. Then iterate. Ex 3: No random intercept (unknown covariate)

21 21 y i =  0 +  1 x i +  (  0 +  1 x i )u i =  0 +  1 x i + c 1 z i u 1i Can we do this using -xtmixed- ? 1.Expand and introduce artificial “residual” error term.expand 3.gen yf=y +.001*invnorm(uniform())

22 22 1.Expand and introduce artificial “residual” error term. 2. Estimate z i by OLS or other “easy” method. Can we do this using -xtmixed- ?.expand 3.gen yf=y +.001*invnorm(uniform()).reg y x.predict zh y i =  0 +  1 x i +  (  0 +  1 x i )u i =  0 +  1 x i + c 1 z i u 1i

23 23 1.Expand and introduce artificial “residual” error term. 2. Estimate z i by OLS or other “easy” method. 3. Fit model pretending prediction zh i is actual z i..expand 3.gen yf=y +.001*invnorm(uniform()).reg y x.predict zh.xtmixed yf x ||ord: zh,noc nolog y i =  0 +  1 x i + c 1 z i u 1i  z i =  0 +  1 x i is unknown] Can we do this using -xtmixed- ?

24 24.expand 3.gen yf=y +.001*invnorm(uniform()).reg y x.predict zh.xtmixed yf x ||ord: zh,noc nolog.drop zh.predict zh Can we do this using -xtmixed- ? y i =  0 +  1 x i + c 1 z i u 1i  z i =  0 +  1 x i is unknown] 1.Expand and introduce artificial “residual” error term. 2. Estimate z i by OLS or other “easy” method. 3. Fit model pretending prediction zh i is actual z i. 4. Iterate.

25 25. xtmixed yf x ||ord: zh,noc Mixed-effects REML regression Number of obs = 3000 Group variable: ord Number of groups = 1000 Obs per group: min = 3 avg = 3.0 max = 3 Wald chi2(1) = 484.01 Log restricted-likelihood = 7952.4049 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ yf | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x | 1.39339.0633354 22.00 0.000 1.269255 1.517525 _cons | 3.071699.1261407 24.35 0.000 2.824467 3.31893 ------------------------------------------------------------------------------ Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval] -----------------------------+------------------------------------------------ ord: Identity | sd(zh) |.4764546.0106645.4560044.4978219 -----------------------------+------------------------------------------------ sd(Residual) |.0009896.0000156.0009594.0010207 ------------------------------------------------------------------------------ LR test vs. linear regression: chibar2(01) = 31334.71 Prob >= chibar2 = 0.0000  0 = 3.0,  1 = 1.4, c 1 = 0.50

26 26 args NS nr be0 be1 c1 c2 drop _all set obs `NS' gen id=_n gen u1 = invnorm(uniform()) expand `nr' sort id gen u2=invnorm(uniform()) gen x = 10*uniform() gen z = `be0' + `be1'*x gen zi = z + `c1'*z*u1 gen y = zi + `c2'*zi*u2 y ij =  0 +  1 x ij + c 1 (  0 +  1 x ij )u 1i + c 2 [  0 +  1 x ij + c 1 (  0 +  1 x ij )u 1i ]u 2i 2-level model E(y ij | x ij ) E(y ij | x ij, u 1i ) [“z”] [“z i ”]

27 27 2-level model (example)

28 28 //[gen y = zi + `c2'*zi*e] gen obs=_n expand 3 sort obs gen yf = y +.001*invnorm(uniform()) xtmixed y x ||id: x,noc nolog predict zh0 predict uh1i_0,reffects level(id) gen zhi_0 = zh0 + uh1i_0 xtmixed yf x ||id: zh0,noc ||obs: zhi_0,noc nolog predict zh1 predict uh1i_1,reffects level(id) gen zhi_1 = zh1 + uh1i_1 xtmixed yf x ||id: zh1,noc ||obs: zhi_1,noc nolog predict zh2 predict zhi_2,reffects level(id) gen zhi_2 = zh2 + uh1i_2 noi xtmixed yf x ||id: zh2,noc ||obs: zhi_2,noc nolog 0 1 2 3

29 29. run nasug_2008_sim1 20 5 1.0 1.0.2.05 6 1.21211063.05062864.21216237.05076224.21213417.05075685.21213363.05075686.21213354.05075685.21213353.05075685 Mixed-effects REML regression Number of obs = 300 ----------------------------------------------------------- | No. of Observations per Group Group Variable | Groups Minimum Average Maximum ----------------+------------------------------------------ id | 20 15 15.0 15 obs | 100 3 3.0 3 -----------------------------------------------------------

30 30. run nasug_2008_sim1 20 5 1.0 1.0.2.05 6 1 Wald chi2(1) = 421.17 Log restricted-likelihood = 1038.0836 Prob > chi2 = 0.0000 ------------------------------------------------------------------------------ yf | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x | 1.087757.0530034 20.52 0.000.9838727 1.191642 _cons | 1.039358.0535761 19.40 0.000.9343512 1.144366 ------------------------------------------------------------------------------ Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval] -----------------------------+------------------------------------------------ id: Identity | sd(zh) |.2121335.0348399.1537487.2926895 -----------------------------+------------------------------------------------ obs: Identity | sd(zhi) |.0507568.0040389.0434271.0593237 -----------------------------+------------------------------------------------ sd(Residual) |.0009429.0000471.0008549.00104 ------------------------------------------------------------------------------ LR test vs. linear regression: chi2(2) = 2866.48 Prob > chi2 = 0.0000

31 31  c2c2  c1c1 Bayesian Estimation (WINBUGS)

32 32 WINBUGS STATA (xtmixed) node mean sd 2.5%median 97.5% start sample be11.0770.04988 0.98471.076 1.169 10001 10000 be01.030.05257 0.93171.03 1.131 10001 10000 c10.2170.03455 0.16180.2127 0.2958 10001 10000 c20.0640.00465 0.05560.0637 0.07353 10001 10000 yf | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x | 1.087757.0530034 20.52 0.000.9838727 1.191642 _cons| 1.039358.0535761 19.40 0.000.9343512 1.144366 ------------------------------------------------------------------------------ Random-effects Parameters | Estimate Std. Err. [95% Conf. Interval] -----------------------------+------------------------------------------------ id: Identity | sd(xb) |.2121335.0348399.1537487.2926895 -----------------------------+------------------------------------------------ obs: Identity | s d(muhi) |.0507568.0040389.0434271.0593237 -----------------------------+------------------------------------------------


Download ppt "1 Estimation of constant-CV regression models Alan H. Feiveson NASA – Johnson Space Center Houston, TX SNASUG 2008 Chicago, IL."

Similar presentations


Ads by Google