The data set used in this section is xerop.dat. We thank Al Sommer, Keith West, Joanne Katz, Scott Zeger and Patrick Heagerty for allowing us to make the data available.
The do-file is ichs.do.
The programs we use are xtgee, gllamm, gllasim and gllapred. You can find the last three programs and download them by issuing the command ssc describe gllamm and following instructions; see here for more information on installing the GLLAMM programs. For more information on these commands see http://www.gllamm.org.
Read and prepare the data
infile id resp cons age xero cosine sine female height stunted time age1 season time2 using xerop.dat list id time age xero cosine sine female height stunted in 1/20, clean id time resp age xero cosine sine female height stunted 1. 121013 1 0 31 0 -1 0 0 -3 0 2. 121013 2 0 34 0 0 -1 0 -3 0 3. 121013 3 0 37 0 1 0 0 -2 0 4. 121013 4 0 40 0 0 1 0 -2 0 5. 121013 5 1 43 0 -1 0 0 -2 0 6. 121013 6 0 46 0 0 -1 0 -3 0 7. 121113 1 0 -9 0 -1 0 1 2 0 8. 121113 2 0 -6 0 0 -1 1 0 0 9. 121113 3 0 -3 0 1 0 1 -1 0 10. 121113 4 0 0 0 0 1 1 -2 0 11. 121113 5 1 3 0 -1 0 1 -3 0 12. 121113 6 0 6 0 0 -1 1 -3 0 13. 121114 1 0 -26 0 -1 0 0 8 0 14. 121114 2 0 -23 0 0 -1 0 5 0 15. 121114 3 0 -20 0 1 0 0 3 0 16. 121114 4 1 -17 0 0 1 0 0 0 17. 121114 5 1 -14 0 -1 0 0 0 0 18. 121114 6 0 -11 0 0 -1 0 0 0 19. 121140 1 0 -19 0 -1 0 1 5 0 20. 121140 2 0 -16 1 0 -1 1 4 0
Create indicator for last observation for each person for later use
sort id time qui by id: gen last=_n==_N
Generalized estimating equations (GEE) using xtgee for Table 9.1, saving estimates using _estimates hold for later use (Figure 9.1). Iteration logs are not shown.
xtgee resp age xero cosine sine female height stunted, i(id) corr(exch) l(logit) f(binom) robust _estimates hold gee GEE population-averaged model Number of obs = 1200 Group variable: id Number of groups = 275 Link: logit Obs per group: min = 1 Family: binomial avg = 4.4 Correlation: exchangeable max = 6 Wald chi2(7) = 42.98 Scale parameter: 1 Prob > chi2 = 0.0000 (standard errors adjusted for clustering on id) ------------------------------------------------------------------------------ | Semi-robust resp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | -.0318552 .0062079 -5.13 0.000 -.0440225 -.019688 xero | .6211243 .4409799 1.41 0.159 -.2431803 1.485429 female | -.418182 .2367913 -1.77 0.077 -.8822843 .0459203 cosine | -.5680533 .1695157 -3.35 0.001 -.9002979 -.2358086 sine | -.1622827 .1461244 -1.11 0.267 -.4486812 .1241158 height | -.0475909 .0307661 -1.55 0.122 -.1078914 .0127096 stunted | .1492226 .4108152 0.36 0.716 -.6559605 .9544056 _cons | -2.421176 .1781838 -13.59 0.000 -2.77041 -2.071942 ------------------------------------------------------------------------------
GEE with exponentiated coefficients (population averaged odds ratios)
xtgee resp age xero cosine sine female height stunted, i(id) corr(exch) l(logit) f(binom) robust eform GEE population-averaged model Number of obs = 1200 Group variable: id Number of groups = 275 Link: logit Obs per group: min = 1 Family: binomial avg = 4.4 Correlation: exchangeable max = 6 Wald chi2(7) = 42.98 Scale parameter: 1 Prob > chi2 = 0.0000 (standard errors adjusted for clustering on id) ------------------------------------------------------------------------------ | Semi-robust resp | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | .9686468 .0060133 -5.13 0.000 .9569324 .9805046 xero | 1.861019 .820672 1.41 0.159 .7841301 4.41686 female | .6582424 .155866 -1.77 0.077 .4138365 1.046991 cosine | .5666274 .0960522 -3.35 0.001 .4064486 .7899318 sine | .8502008 .1242351 -1.11 0.267 .6384696 1.132147 height | .9535238 .0293362 -1.55 0.122 .8977251 1.012791 stunted | 1.160931 .4769283 0.36 0.716 .5189434 2.597126 ------------------------------------------------------------------------------
Random intercept model logistic regression model for Table 9.1 using gllamm (iteration log not shown).
gllamm resp age xero cosine sine female height stunted, i(id) l(logit) f(binom) nip(12) adapt number of level 1 units = 1200 number of level 2 units = 275 Condition Number = 75.756193 gllamm model log likelihood = -334.64731 ------------------------------------------------------------------------------ resp | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | -.0339926 .0073209 -4.64 0.000 -.0483413 -.0196438 xero | .6243137 .4803223 1.30 0.194 -.3171007 1.565728 female | -.4363833 .2576521 -1.69 0.090 -.9413721 .0686055 cosine | -.5938483 .1741618 -3.41 0.001 -.9351991 -.2524975 sine | -.1648182 .1747285 -0.94 0.346 -.5072797 .1776434 height | -.0480044 .0267659 -1.79 0.073 -.1004646 .0044557 stunted | .202302 .4414687 0.46 0.647 -.6629608 1.067565 _cons | -2.673176 .2237101 -11.95 0.000 -3.11164 -2.234712 ------------------------------------------------------------------------------ Variances and covariances of random effects ------------------------------------------------------------------------------ ***level 2 (id) var(1): .6493042 (.34913418) ------------------------------------------------------------------------------
Exponentiated coefficients (subject-specific odds ratios) using gllamm
gllamm, eform number of level 1 units = 1200 number of level 2 units = 275 Condition Number = 75.756193 gllamm model log likelihood = -334.64731 ------------------------------------------------------------------------------ resp | exp(b) Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- age | .9665787 .0070762 -4.64 0.000 .9528085 .9805479 xero | 1.866964 .8967445 1.30 0.194 .7282574 4.786158 female | .6463699 .1665386 -1.69 0.090 .3900922 1.071014 cosine | .5521982 .0961718 -3.41 0.001 .3925077 .7768582 sine | .8480479 .1481781 -0.94 0.346 .6021313 1.194399 height | .9531296 .0255113 -1.79 0.073 .9044172 1.004466 stunted | 1.224218 .5404538 0.46 0.647 .5153233 2.908288 ------------------------------------------------------------------------------ Variances and covariances of random effects ------------------------------------------------------------------------------ ***level 2 (id) var(1): .6493042 (.34913418) ------------------------------------------------------------------------------
Empirical Bayes predictions using gllapred (to be stored in ebm1)
gllapred eb, u (means and standard deviations will be stored in ebm1 ebs1) Non-adaptive log-likelihood: -334.64731 -334.6473 -334.6473 log-likelihood:-334.64731
Simulate three datasets using gllasim and obtain empirical Bayes predictions for Figure 9.2 (output not shown)
set seed 1322411 matrix aa=e(b) gllasim y, from(aa) gllamm y age xero cosine sine female height stunted, i(id) l(logit) f(binom) nip(12) from(aa) copy adapt gllapred SIM1, u drop y gllasim y, from(aa) gllamm y age xero cosine sine female height stunted, i(id) l(logit) f(binom) nip(12) from(aa) copy adapt gllapred SIM2, u drop y gllasim y, from(aa) gllamm y age xero cosine sine female height stunted, i(id) l(logit) f(binom) nip(12) from(aa) copy adapt gllapred SIM3, u drop y
Make Figure 9.2 and store as ichs2.gph (note that simulated datasets are different due to different seed)
graph box ebm1 SIM1m1 SIM2m1 SIM3m1 if last==1, t2(" ") saving(ichs2, replace) /* */ legend(lab(1 "Data") lab(2 "Sim 1") lab(3 "Sim 2") lab(4 "Sim 3") ) /* */ marker(1, m(oh)) marker(2, m(oh)) marker(3, m(oh)) marker(4, m(oh)) lintensity(255)
Make Figure 9.3 and store as ichs3.gph
sort ebm1 gen rank = sum(last) serrbar ebm1 ebs1 rank if last==1&mod(rank,6)==5, xlab(5 100 200 275) /* */ ytitle("Predicted random intercept") saving(ichs3, replace)
Create small dataset with covariate values for which we want predictions for Figure 9.1
keep in 1/83 replace age = -32+_n-1 replace cosine=-1 replace sine= 0 replace xero=0 replace female=1 summ height replace height=r(mean) replace stunted=0 gen y=1
Obtain marginal and conditional probabilities for random intercept model using gllapred
gllapred mu_m, mu marg from(aa) gen u01=0 gen u8p1 = 0.8 gen u8m1 = -0.8 gllapred mu_0, mu us(u0) gllapred mu_8p, mu us(u8p) gllapred mu_8m, mu us(u8m)
Obtain marginal probabilities for GEE by restoring the stored estimates using _estimates unhold and using Stata's predict command.
_estimates unhold gee predict mu_g, mu label variable mu_g "GEE" label variable mu_0 "u=0" label variable mu_8p "u=+0.8" label variable mu_8m "u=-0.8" label variable mu_m "pop. average"
Make Figure 9.1 (ichs.gph)
sort age twoway (line mu_m age, clpatt(solid)) /* */ (scatter mu_g age if mod(age,2)==1, ms(oh) mc(black)) (line mu_0 age, clpatt(dash)) /* */ (line mu_8p age, clpatt(dash)) (line mu_8m age, clpatt(dash)), /* */ legend(order(1 2 3) label(3 "conditional") position(1) ring(0) ) /* */ ytitle("Probability of infection") /* */ xtitle("Age in months") saving(ichs, replace)
Zeger, S. L. and Karim, M. R. (1991). Generalized linear models with random effects: A Gibbs sampling approach. Journal of the American Statistical Association 86, 79-86.
Diggle, P. J., Heagerty, P. J., Liang, K.-Y. and Zeger, S. L. (2002). Analysis of Longitudinal Data. Oxford: Oxford University Press.
Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2002). Reliable estimation of generalised linear mixed models using adaptive quadrature. The Stata Journal 2, 1-21.
Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized
Latent Variable Modeling: Multilevel, Longitudinal and Structural
Equation Models. Boca Raton, FL: Chapman & Hall/ CRC Press.
Outline
Datasets and do-files