The data set used in this section is kenkel.dat. We would like to thank the Journal of Applied Econometrics for making the data used in Kenkel and Terza (2001) available in the Journal of Applied Econometrics Data Archive.
The do-file is kenkel.do.
The programs used are gllamm and ssm.
Use the commands ssc describe gllamm and ssc describe ssm and follow the instructions to download the programs. For more information on gllamm see http://www.gllamm.org and for more information on ssm see http://www.gllamm.org/wrappers.html.
ssm drinks advice black hieduc, s(advice = black hieduc hlthins regmed heart) /*
  */ adapt q(16) family(poiss) link(log)
The analyses below use gllamm. The first three models can also be estimated using Stata's own commands poisson, xtpoisson and probit, but we use gllamm throughout to make the syntax of the final model easier to understand.
Read data and display first ten records:
insheet using kenkel.dat, clear
list in 1/10, clean

       drinks   advice   black   hlthins   regmed   heart   hieduc
  1.       60        0       0         1        1       0        1
  2.       84        0       0         1        1       0        0
  3.        4        0       0         1        1       0        1
  4.        0        1       0         1        0       0        1
  5.       12        0       0         1        1       0        0
  6.       12        0       0         1        1       0        0
  7.        5        0       0         1        1       0        1
  8.        7        0       0         1        1       0        0
  9.        0        0       0         1        0       0        1
 10.      1.5        1       0         1        1       1        1
The dependent variable for the analysis is the number of alcoholic beverages consumed in the last two weeks. This is calculated as the product of self-reported drinking frequency (the number of days in the past two weeks with any drinking) and drinking intensity (the average number of drinks on a day with any drinking). We round this to the nearest integer to obtain a proper count.
replace drinks=round(drinks,1)
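Collapse the data to one record per distinct combination of response and covariate values, and generate the frequency weight variable wt2 to speed up estimation. The gllamm option weight(wt) looks for weight variables named wt followed by the level number, so wt2 is applied as a level-2 frequency weight and the data are weighted appropriately.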
disp _N
2467

gen one=1
collapse (sum) wt2=one, by(black hlthins regmed heart hieduc drinks advice)
disp _N
737

gen id=_n
gen cons=1
list in 1/10, clean

       drinks   advice   black   hlthins   regmed   heart   hieduc   wt2   id   cons
  1.        0        0       0         0        0       0        0     7    1      1
  2.        0        1       0         0        0       0        0     3    2      1
  3.        1        0       0         0        0       0        0     1    3      1
  4.        1        1       0         0        0       0        0     1    4      1
  5.        2        0       0         0        0       0        0     2    5      1
  6.        2        1       0         0        0       0        0     3    6      1
  7.        3        0       0         0        0       0        0     1    7      1
  8.        4        0       0         0        0       0        0     4    8      1
  9.        5        0       0         0        0       0        0     1    9      1
 10.        6        0       0         0        0       0        0     1   10      1
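The collapse step can be checked on a small example: the frequency weights must sum to the original number of records. The following is a minimal sketch on hypothetical toy data (the five records and the local macro n_before are illustrative assumptions, not part of kenkel.dat or kenkel.do).

```stata
* Toy data (hypothetical): five records, three distinct patterns
clear
input drinks advice
0 0
0 0
0 1
2 0
2 0
end

* Remember the number of records before collapsing
local n_before = _N

* One record per distinct pattern, with wt2 counting the duplicates
gen one = 1
collapse (sum) wt2 = one, by(drinks advice)

* The frequency weights must add up to the original sample size
quietly summarize wt2
assert r(sum) == `n_before'
list, clean
```

Applied to the collapsed kenkel data, the same check would compare the sum of wt2 with 2467.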
Poisson model for drinking for Table 14.7:
gllamm drinks advice cons hieduc black, i(id) weight(wt) family(poisson) link(log) /*
  */ nocons init

number of level 1 units = 2467

Condition Number = 3.8352842

gllamm model

log likelihood = -32939.148

------------------------------------------------------------------------------
      drinks |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      advice |    .473367    .010918    43.36   0.000     .4519681    .4947658
        cons |   2.650541   .0084928   312.09   0.000     2.633896    2.667187
      hieduc |  -.1826093   .0107983   -16.91   0.000    -.2037736   -.1614451
       black |  -.3096866   .0168905   -18.33   0.000    -.3427914   -.2765818
------------------------------------------------------------------------------
Overdispersed Poisson model for drinking for Table 14.7 (iteration log not shown):
gllamm drinks advice cons hieduc black, i(id) weight(wt) family(poisson) link(log) /*
  */ nocons adapt nip(10)

number of level 1 units = 2467
number of level 2 units = 2467

Condition Number = 3.9794068

gllamm model

log likelihood = -8857.8425

------------------------------------------------------------------------------
      drinks |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      advice |   .5954294   .0815787     7.30   0.000      .435538    .7553207
        cons |   1.429521   .0600222    23.82   0.000      1.31188    1.547162
      hieduc |   .0277592   .0740012     0.38   0.708    -.1172806     .172799
       black |  -.2835224   .1090915    -2.60   0.009    -.4973378    -.069707
------------------------------------------------------------------------------

Variances and covariances of random effects
------------------------------------------------------------------------------

***level 2 (id)

    var(1): 2.899832 (.1132476)
------------------------------------------------------------------------------
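As a sketch of what this command fits: dropping init and adding adapt nip(10) introduces a normally distributed random intercept for each id, which is what allows for overdispersion. The notation below (beta, zeta, psi) is our own labelling rather than taken from the source; psi corresponds to var(1) in the output.

```latex
\begin{align*}
y_i \mid \zeta_i &\sim \text{Poisson}(\mu_i), \\
\log \mu_i &= \beta_0 + \beta_1\,\text{advice}_i + \beta_2\,\text{hieduc}_i
              + \beta_3\,\text{black}_i + \zeta_i, \\
\zeta_i &\sim N(0, \psi), \\
E(y_i) &= \exp(\mathbf{x}_i'\boldsymbol{\beta} + \psi/2), \\
\text{Var}(y_i) &= E(y_i) + E(y_i)^2\,\{\exp(\psi) - 1\}.
\end{align*}
```

The last line shows that the marginal variance exceeds the mean whenever psi is positive, which is the overdispersion the ordinary Poisson model cannot accommodate.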
Probit model for advice for Table 14.7 (iteration log not shown):
gllamm advice cons hieduc black hlthins regmed heart, i(id) weight(wt) family(binomial) /*
  */ link(probit) nocons init

number of level 1 units = 2467

Condition Number = 6.4702064

gllamm model

log likelihood = -1419.9041

------------------------------------------------------------------------------
      advice |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        cons |  -.4785403   .0849039    -5.64   0.000     -.644949   -.3121317
      hieduc |  -.2520195   .0560241    -4.50   0.000    -.3618247   -.1422143
       black |   .3031406   .0780889     3.88   0.000     .1500891    .4561921
     hlthins |  -.2708712   .0704249    -3.85   0.000    -.4089013    -.132841
      regmed |   .1801328   .0738763     2.44   0.015     .0353379    .3249278
       heart |   .1661613   .0757854     2.19   0.028     .0176246     .314698
------------------------------------------------------------------------------
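Since the Poisson and probit models can also be estimated with Stata's own commands, a cross-check along the following lines may be useful. This is a sketch only: it assumes the collapsed data with the frequency weight wt2 are still in memory, and it is not part of kenkel.do. The random-intercept model is left out because it would need xtpoisson and a different data setup.

```stata
* Assumes the collapsed data (drinks, advice, covariates, wt2) are in memory.

* Ordinary Poisson model for drinking; Stata adds the constant itself,
* so cons is not listed
poisson drinks advice hieduc black [fweight=wt2]

* Probit model for advice
probit advice hieduc black hlthins regmed heart [fweight=wt2]
```

The coefficients should be close to the corresponding gllamm estimates above, with the constant reported as _cons rather than cons.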
Prepare data for the endogenous treatment model for Table 14.7:
Stack drinking and advice into a single variable resp and create a variable type, equal to 1 for drinking and 2 for advice:
rename drinks resp1
gen resp2 = advice
reshape long resp, i(id) j(type)

(note: j = 1 2)

Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                      737   ->    1474
Number of variables                  11   ->      11
j variable (2 values)                     ->   type
xij variables:
                            resp1 resp2   ->   resp
-----------------------------------------------------------------------------
Create dummies d1 for type=1 (drinking) and d2 for type=2 (advice).
tab type, gen(d)
sort id type
list in 1/10, clean

       id   type   advice   black   hlthins   regmed   heart   hieduc   wt2   cons   resp   d1   d2
  1.    1      1        0       0         0        0       0        0     7      1      0    1    0
  2.    1      2        0       0         0        0       0        0     7      1      0    0    1
  3.    2      1        1       0         0        0       0        0     3      1      0    1    0
  4.    2      2        1       0         0        0       0        0     3      1      1    0    1
  5.    3      1        0       0         0        0       0        0     1      1      1    1    0
  6.    3      2        0       0         0        0       0        0     1      1      0    0    1
  7.    4      1        1       0         0        0       0        0     1      1      1    1    0
  8.    4      2        1       0         0        0       0        0     1      1      1    0    1
  9.    5      1        0       0         0        0       0        0     2      1      2    1    0
 10.    5      2        0       0         0        0       0        0     2      1      0    0    1
Create interactions between d1 and covariates in the drinking model:
gen d1_advice = d1*advice
gen d1_hieduc = d1*hieduc
gen d1_black = d1*black
Create interactions between d2 and covariates in advice model (use foreach to save typing):
foreach var in hieduc black hlthins regmed heart {
    gen d2_`var' = d2*`var'
}
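The stacking, dummy and interaction steps above can be sketched and checked on a small example. The three-person data set below is hypothetical and only mirrors the variable names used in this section; the assert statements are our own additions and not part of kenkel.do.

```stata
* Toy data (hypothetical): one record per id, as after the collapse step
clear
input id drinks advice hieduc
1 0 0 1
2 0 1 0
3 4 1 1
end

* Stack the two responses: type = 1 is drinking, type = 2 is advice
rename drinks resp1
gen resp2 = advice
reshape long resp, i(id) j(type)

* Dummies d1 (drinking) and d2 (advice), then one interaction of each kind
tab type, gen(d)
gen d1_advice = d1*advice
gen d2_hieduc = d2*hieduc

* Each id contributes exactly two rows, one per response
bysort id: assert _N == 2
* In the advice rows the response equals the advice covariate
assert resp == advice if type == 2
* Exactly one dummy is switched on in every row
assert d1 + d2 == 1
* Interactions vanish in the rows belonging to the other response
assert d1_advice == 0 if type == 2
assert d2_hieduc == 0 if type == 1
list, clean
```

Because each interaction is zero in the rows of the other response, a single linear predictor can carry separate coefficients for the drinking and advice equations.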
Endogenous treatment model for Table 14.7.
eq fac: d1 d2
gllamm resp d1_advice d1 d1_hieduc d1_black d2 d2_hieduc d2_black d2_hlthins /*
  */ d2_regmed d2_heart, nocons i(id) weight(wt) family(poisson binom) /*
  */ link(log probit) fv(type) lv(type) eq(fac) adapt nip(15)

number of level 1 units = 4934
number of level 2 units = 2467

Condition Number = 23.071575

gllamm model

log likelihood = -10254.241

------------------------------------------------------------------------------
        resp |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   d1_advice |  -2.412288   .2308401   -10.45   0.000    -2.864727    -1.95985
          d1 |   2.323782   .0932252    24.93   0.000     2.141064      2.5065
   d1_hieduc |  -.2842432    .098614    -2.88   0.004     -.477523   -.0909633
    d1_black |   .1103129   .1439681     0.77   0.444    -.1718595    .3924853
          d2 |  -1.117855   .1629308    -6.86   0.000    -1.437194   -.7985166
   d2_hieduc |  -.3988385    .104094    -3.83   0.000    -.6028591   -.1948179
    d2_black |   .6040407   .1528167     3.95   0.000     .3045254     .903556
  d2_hlthins |  -.3270483   .0957591    -3.42   0.001    -.5147327   -.1393639
   d2_regmed |   .3851485   .1000492     3.85   0.000     .1890557    .5812413
    d2_heart |   .5077518   .1105428     4.59   0.000     .2910919    .7244118
------------------------------------------------------------------------------

Variances and covariances of random effects
------------------------------------------------------------------------------

***level 2 (id)

    var(1): 2.4751449 (.69051946)

    loadings for random effect 1
    d2: 1 (fixed)
    d1: 1.4303603 (.15172818)
------------------------------------------------------------------------------
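Read off the command and output, the model appears to be a Poisson regression for drinking and a probit regression for advice that share a single random effect, with the loading fixed at 1 in the advice equation and estimated in the drinking equation. The symbols below (beta, delta, gamma, lambda, zeta, psi) are our own notation and are not taken from the source.

```latex
\begin{align*}
y_i \mid \zeta_i &\sim \text{Poisson}(\mu_i), \\
\log \mu_i &= \beta_0 + \delta\,\text{advice}_i + \beta_1\,\text{hieduc}_i
              + \beta_2\,\text{black}_i + \lambda\,\zeta_i, \\
\Pr(\text{advice}_i = 1 \mid \zeta_i) &= \Phi(\gamma_0 + \gamma_1\,\text{hieduc}_i
              + \gamma_2\,\text{black}_i + \gamma_3\,\text{hlthins}_i
              + \gamma_4\,\text{regmed}_i + \gamma_5\,\text{heart}_i + \zeta_i), \\
\zeta_i &\sim N(0, \psi).
\end{align*}
```

In this notation delta is the coefficient of d1_advice, lambda is the loading reported for d1 and psi is var(1). Because the advice equation conditions on the random effect, its coefficients are on a different scale from those of the ordinary probit model; dividing them by the square root of 1 + psi gives roughly comparable marginal effects.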
There are some small discrepancies between these estimates and Table 14.7.
Kenkel, D. S. and Terza, J. V. (2001). The effect of physician advice on alcohol consumption: Count regression with an endogenous treatment effect. Journal of Applied Econometrics 16, 165-184.
Miranda, A. and Rabe-Hesketh, S. (2005). Maximum likelihood estimation of endogenous switching and sample selection models for binary, count, and ordinal variables. Submitted for publication.
Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2002). Reliable estimation of generalised linear mixed models using adaptive quadrature. The Stata Journal 2, 1-21.
Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Boca Raton, FL: Chapman & Hall/CRC Press.