The data set used in this section is mislevy.dat. Below we assume that this has been saved in the current directory.
The do-file is mislevy.do.
The programs we use are gllamm and gllapred. You can locate and download them by issuing the commands findit gllamm and findit gllapred. For more information, see http://www.gllamm.org.
Read and prepare the data
insheet using mislevy.dat, clear
list, clean

       y1   y2   y3   y4   cwm   cwf   cbm   cbf
  1.    0    0    0    0    23    20    27    29
  2.    0    0    0    1     5     8     5     8
  3.    0    0    1    0    12    14    15     7
  4.    0    0    1    1     2     2     3     3
  5.    0    1    0    0    16    20    16    14
  6.    0    1    0    1     3     5     5     5
  7.    0    1    1    0     6    11     4     6
  8.    0    1    1    1     1     7     3     0
  9.    1    0    0    0    22    23    15    14
 10.    1    0    0    1     6     8    10    10
 11.    1    0    1    0     7     9     8    11
 12.    1    0    1    1    19     6     1     2
 13.    1    1    0    0    21    18     7    19
 14.    1    1    0    1    11    15     9     5
 15.    1    1    1    0    23    20    10     8
 16.    1    1    1    1    86    42     2     4
Stack variables cwm, cwf, cbm and cbf into a single frequency variable wt2 and create dummies w for white and m for male
gen i=_n
reshape long cw cb,i(i) j(male) string
replace i=_n
reshape long c, i(i) j(white) string
drop i
encode white, gen(w)
encode male, gen(m)
replace w=w-1
replace m=m-1
rename c wt2
list in 1/10, clean nolab

       white   male   y1   y2   y3   y4   wt2   w   m
  1.       b      f    0    0    0    0    29   0   0
  2.       w      f    0    0    0    0    20   1   0
  3.       b      m    0    0    0    0    27   0   1
  4.       w      m    0    0    0    0    23   1   1
  5.       b      f    0    0    0    1     8   0   0
  6.       w      f    0    0    0    1     8   1   0
  7.       b      m    0    0    0    1     5   0   1
  8.       w      m    0    0    0    1     5   1   1
  9.       b      f    0    0    1    0     7   0   0
 10.       w      f    0    0    1    0    14   1   0
Calculate tot, the sizes of the four groups defined by w and m
egen tot = sum(wt2), by(w m)
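As a cross-check on the group sizes, the frequencies listed above can be summed outside Stata. The following is a minimal Python sketch, not part of the original do-file; the dictionary name `counts` is an illustrative assumption, and the numbers are copied from the listing of mislevy.dat. The grand total should agree with the number of level 2 units that gllamm reports later (776), and four responses per person give the number of level 1 units (3104).

```python
# Frequencies copied from the listing of mislevy.dat, one list per group,
# ordered by response pattern (y1 y2 y3 y4) from 0000 to 1111.
counts = {
    "cwm": [23, 5, 12, 2, 16, 3, 6, 1, 22, 6, 7, 19, 21, 11, 23, 86],
    "cwf": [20, 8, 14, 2, 20, 5, 11, 7, 23, 8, 9, 6, 18, 15, 20, 42],
    "cbm": [27, 5, 15, 3, 16, 5, 4, 3, 15, 10, 8, 1, 7, 9, 10, 2],
    "cbf": [29, 8, 7, 3, 14, 5, 6, 0, 14, 10, 11, 2, 19, 5, 8, 4],
}

# Group sizes, playing the role of the Stata variable tot.
tot = {group: sum(freqs) for group, freqs in counts.items()}

n_persons = sum(tot.values())   # weighted number of persons
n_responses = 4 * n_persons     # four item responses per person

print(tot)
print(n_persons, n_responses)
```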
Stack responses y1 to y4 into a single vector and create variable item
gen patt=_n
reshape long y, i(patt) j(item)
list in 1/8, clean nolab

       patt   item   white   male   y   wt2   w   m
  1.      1      1       b      f   0    29   0   0
  2.      1      2       b      f   0    29   0   0
  3.      1      3       b      f   0    29   0   0
  4.      1      4       b      f   0    29   0   0
  5.      2      1       w      f   0    20   1   0
  6.      2      2       w      f   0    20   1   0
  7.      2      3       w      f   0    20   1   0
  8.      2      4       w      f   0    20   1   0
Create dummy variables d1 to d4 for items 1 to 4
qui tab item, gen(d)
Estimate the one-parameter logistic IRT model (Table 9.5)
gllamm y d1 d2 d3 d4, i(patt) l(logit) f(binom) weight(wt) nocons adapt

number of level 1 units = 3104
number of level 2 units = 776

Condition Number = 1.6838733

gllamm model

log likelihood = -2004.9379

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d1 |   .5775969   .0969974     5.95   0.000     .3874854    .7677083
          d2 |   .2382793   .0950415     2.51   0.012     .0520014    .4245572
          d3 |  -.2247582   .0949752    -2.37   0.018    -.4109062   -.0386102
          d4 |  -.5938583   .0971079    -6.12   0.000    -.7841863   -.4035303
------------------------------------------------------------------------------

Variances and covariances of random effects
------------------------------------------------------------------------------

***level 2 (patt)

    var(1): 1.6285398 (.20840709)
------------------------------------------------------------------------------
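For orientation, the model fitted by this command can be written as below. The symbols are assumptions introduced here for illustration and do not appear on this page: i indexes items, j indexes persons, the beta coefficients correspond to d1 to d4 in the output, and psi corresponds to var(1).

```latex
\operatorname{logit}\,\Pr(y_{ij} = 1 \mid \theta_j) = \beta_i + \theta_j,
\qquad \theta_j \sim N(0, \psi)
```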
Plot the item characteristic curves (Figure 9.5)
matrix list e(b)

e(b)[1,5]
            y:          y:          y:          y:      patt1:
           d1          d2          d3          d4       _cons
y1  .57759686   .23827933  -.22475821  -.59385829   1.2761425

twoway (function y=1/(1+exp(-[y]d1 -x*[patt1]_cons)), range(-2.5 2.5)) /*
*/ (function y=1/(1+exp(-[y]d2 -x*[patt1]_cons)), range(-2.5 2.5) clpatt(dot)) /*
*/ (function y=1/(1+exp(-[y]d3 -x*[patt1]_cons)), range(-2.5 2.5) clpatt(dash)) /*
*/ (function y=1/(1+exp(-[y]d4 -x*[patt1]_cons)), range(-2.5 2.5) clpatt(longdash)), /*
*/ legend( label(1 "Item 1") label(2 "Item 2") label(3 "Item 3") label(4 "Item 4") ) /*
*/ xtitle(Ability) ytitle(Probability of correct answer)
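The curves drawn by the twoway command can also be evaluated at single points. The sketch below is an illustration in Python rather than part of the do-file; it assumes, as the plotting code does, that x is ability in standard deviation units and that [patt1]_cons (1.2761425) is the standard deviation of the random effect, whose square matches var(1) in the output. The names `intercepts`, `sd` and `icc` are hypothetical.

```python
import math

# Estimates copied from e(b) for the one-parameter model.
intercepts = {"d1": 0.57759686, "d2": 0.23827933, "d3": -0.22475821, "d4": -0.59385829}
sd = 1.2761425  # [patt1]_cons, the standard deviation of ability


def icc(item, x):
    """Probability of a correct answer to `item` at standardised ability x."""
    return 1.0 / (1.0 + math.exp(-(intercepts[item] + sd * x)))


# Evaluate each curve at the ends and the middle of the plotted range.
for item in intercepts:
    print(item, [round(icc(item, x), 3) for x in (-2.5, 0.0, 2.5)])
```

Because every item shares the same slope `sd`, the four curves are parallel on the logit scale and differ only in their intercepts.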
Estimate the two-parameter logistic IRT model (Table 9.5)
eq load: d1-d4
gllamm y d1-d4, i(patt) eqs(load) l(logit) f(binom) weight(wt) nocons adapt nip(12)

number of level 1 units = 3104
number of level 2 units = 776

Condition Number = 5.4532607

gllamm model

log likelihood = -2002.7391

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d1 |   .6453275   .1206319     5.35   0.000     .4088932    .8817617
          d2 |   .2194106   .0890237     2.46   0.014     .0449274    .3938939
          d3 |  -.2156426   .0908816    -2.37   0.018    -.3937672    -.037518
          d4 |  -.6251801    .109345    -5.72   0.000    -.8394923   -.4108678
------------------------------------------------------------------------------

Variances and covariances of random effects
------------------------------------------------------------------------------

***level 2 (patt)

    var(1): 2.6007398 (.87041805)

    loadings for random effect 1
    d1: 1 (fixed)
    d2: .64650121 (.15490565)
    d3: .69484866 (.16864938)
    d4: .89729467 (.21828051)
------------------------------------------------------------------------------
The variance and factor loading estimates differ a little from those in Table 9.5.
Plot item characteristic curves (Figure 9.5)
matrix list e(b)

e(b)[1,8]
            y:          y:          y:          y:    pat1_1l:    pat1_1l:    pat1_1l:
           d1          d2          d3          d4          d2          d3          d4
y1  .64532747   .21941063  -.21564261  -.62518007   .64650121   .69484866   .89729467

      pat1_1:
           d1
y1  1.6126809

twoway (function y=1/(1+exp(-[y]d1 -x*[pat1_1]d1)), range(-2.5 2.5)) /*
*/ (function y=1/(1+exp(-[y]d2 -x*[pat1_1]d1*[pat1_1l]d2)), range(-2.5 2.5) clpatt(dot)) /*
*/ (function y=1/(1+exp(-[y]d3 -x*[pat1_1]d1*[pat1_1l]d3)), range(-2.5 2.5) clpatt(dash)) /*
*/ (function y=1/(1+exp(-[y]d4 -x*[pat1_1]d1*[pat1_1l]d4)), range(-2.5 2.5) clpatt(longdash)), /*
*/ legend( label(1 "Item 1") label(2 "Item 2") label(3 "Item 3") label(4 "Item 4") ) /*
*/ xtitle(Ability) ytitle(Probability of correct answer)
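In the two-parameter plot each item has its own slope, formed as the standard deviation [pat1_1]d1 times the item's loading. The Python sketch below, which is an illustration and not part of the do-file, makes those slopes explicit using the estimates in e(b); the names `loadings`, `slope` and `icc` are hypothetical, and the loading of item 1 is set to 1 because the output reports it as fixed.

```python
import math

# Estimates copied from e(b) for the two-parameter model.
intercepts = {"d1": 0.64532747, "d2": 0.21941063, "d3": -0.21564261, "d4": -0.62518007}
loadings = {"d1": 1.0, "d2": 0.64650121, "d3": 0.69484866, "d4": 0.89729467}
sd = 1.6126809  # [pat1_1]d1, the standard deviation of ability


def slope(item):
    """Slope of the curve with respect to standardised ability."""
    return sd * loadings[item]


def icc(item, x):
    """Probability of a correct answer to `item` at standardised ability x."""
    return 1.0 / (1.0 + math.exp(-(intercepts[item] + slope(item) * x)))


for item in intercepts:
    print(item, round(slope(item), 4), [round(icc(item, x), 3) for x in (-2.5, 0.0, 2.5)])
```

Unlike the one-parameter curves, these curves are no longer parallel on the logit scale and can therefore cross.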
Estimate two-parameter IRT model with non-zero mean ability, setting the item difficulty of item 1 to zero (Table 9.6)
gen cons=1
eq load: d1-d4
eq f1: cons
gllamm y d2-d4, i(patt) eqs(load) l(logit) f(binom) weight(wt) /*
*/ geqs(f1) nocons adapt nip(12)

number of level 1 units = 3104
number of level 2 units = 776

Condition Number = 6.9163564

gllamm model

log likelihood = -2002.7391

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d2 |   -.197791   .1174163    -1.68   0.092    -.4279228    .0323408
          d3 |  -.6640437   .1352743    -4.91   0.000    -.9291764    -.398911
          d4 |  -1.204219   .1846912    -6.52   0.000    -1.566207   -.8422309
------------------------------------------------------------------------------

Variances and covariances of random effects
------------------------------------------------------------------------------

***level 2 (patt)

    var(1): 2.6008217 (.87044376)

    loadings for random effect 1
    d1: 1 (fixed)
    d2: .64648882 (.15490305)
    d3: .69483564 (.1686465)
    d4: .89727244 (.21827359)

Regressions of latent variables on covariates
------------------------------------------------------------------------------

    random effect 1 has 1 covariates:
    cons: .64533407 (.12063317)
------------------------------------------------------------------------------
The estimates differ a little from those in Table 9.6.
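The log likelihood (-2002.7391) is identical to that of the zero-mean two-parameter model, which suggests that the two fits are the same model in different parameterizations. If so, each zero-mean intercept minus its loading times the mean ability should reproduce the intercept reported here, and the result for item 1 should be close to zero. The Python sketch below checks this arithmetic with the reported estimates; it is an illustration, not part of the do-file, and all variable names are hypothetical.

```python
# Intercepts from the zero-mean two-parameter fit.
old_intercepts = {"d1": 0.64532747, "d2": 0.21941063, "d3": -0.21564261, "d4": -0.62518007}

# Loadings and mean ability from the fit with item 1's intercept set to zero.
loadings = {"d1": 1.0, "d2": 0.64648882, "d3": 0.69483564, "d4": 0.89727244}
mean_ability = 0.64533407

# Intercepts reported by the fit with free mean ability.
reported = {"d2": -0.197791, "d3": -0.6640437, "d4": -1.204219}

# Moving the mean out of the intercepts: new = old - loading * mean.
implied = {
    item: old_intercepts[item] - loadings[item] * mean_ability
    for item in old_intercepts
}

for item, value in implied.items():
    print(item, round(value, 6), reported.get(item, 0.0))
```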
Empirical Bayes predictions: EAP ability scores
gllapred IRT, fac

(means and standard deviations will be stored in IRTm1 IRTs1)
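An EAP (expected a posteriori) score is the mean of the posterior distribution of ability given a person's responses. The Python sketch below approximates that mean on a plain equally spaced grid, using the estimates of the two-parameter model with free mean ability. It is an illustration under stated assumptions and does not reproduce gllapred's own numerical method: the function name `eap`, the grid size and the grid width are all hypothetical choices, so small differences from IRTm1 are to be expected.

```python
import math

# Estimates copied from the two-parameter model with free mean ability.
intercepts = [0.0, -0.197791, -0.6640437, -1.204219]  # item 1 is fixed at zero
loadings = [1.0, 0.64648882, 0.69483564, 0.89727244]
mean, var = 0.64533407, 2.6008217


def eap(pattern, n_points=4001, width=8.0):
    """Posterior mean of ability for a response pattern such as [0, 0, 0, 0]."""
    sd = math.sqrt(var)
    lo = mean - width * sd
    step = 2.0 * width * sd / (n_points - 1)
    num = den = 0.0
    for k in range(n_points):
        theta = lo + k * step
        # Unnormalised normal prior density at theta.
        weight = math.exp(-0.5 * ((theta - mean) / sd) ** 2)
        # Multiply by the likelihood of each observed response.
        for y, d, lam in zip(pattern, intercepts, loadings):
            p = 1.0 / (1.0 + math.exp(-(d + lam * theta)))
            weight *= p if y == 1 else 1.0 - p
        num += theta * weight
        den += weight
    return num / den


print(eap([0, 0, 0, 0]), eap([1, 1, 1, 1]))
```

The normalising constant of the prior cancels in the ratio, which is why the unnormalised density is sufficient.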
Estimate a MIMIC model where ability depends on sex (dummy f), race (dummy b) and their interaction (Table 9.6)
gen f=1-m
gen b=1-w
gen b_f = b*f
eq f1: cons f b b_f
matrix a=e(b)
gllamm y d2-d4, i(patt) eqs(load) l(logit) f(binom) weight(wt) /*
*/ geqs(f1) from(a) nocons adapt nip(12)

number of level 1 units = 3104
number of level 2 units = 776

Condition Number = 10.407055

gllamm model

log likelihood = -1956.2333

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          d2 |  -.2114786   .1159544    -1.82   0.068    -.4387451     .015788
          d3 |  -.7145968   .1379683    -5.18   0.000    -.9850096   -.4441839
          d4 |  -1.159195   .1601593    -7.24   0.000    -1.473101   -.8452882
------------------------------------------------------------------------------

Variances and covariances of random effects
------------------------------------------------------------------------------

***level 2 (patt)

    var(1): 1.9689171 (.60516363)

    loadings for random effect 1
    d1: 1 (fixed)
    d2: .67658546 (.14380046)
    d3: .77264707 (.16674369)
    d4: .86240868 (.17342379)

Regressions of latent variables on covariates
------------------------------------------------------------------------------

    random effect 1 has 4 covariates:
    cons: 1.4345373 (.21553973)
    f: -.6200911 (.20526977)
    b: -1.6843558 (.31129702)
    b_f: .67057381 (.32116754)
------------------------------------------------------------------------------
The estimates differ a little from those in Table 9.6.
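The regression of ability on cons, f, b and b_f implies a mean ability for each of the four groups. The Python sketch below adds up the reported coefficients for each combination of the dummies; it is an illustration, not part of the do-file, and the names `coef` and `group_mean` are hypothetical.

```python
# Coefficients copied from "Regressions of latent variables on covariates".
coef = {"cons": 1.4345373, "f": -0.6200911, "b": -1.6843558, "b_f": 0.67057381}


def group_mean(f, b):
    """Model-implied mean ability for dummies f (female) and b (black)."""
    return coef["cons"] + coef["f"] * f + coef["b"] * b + coef["b_f"] * f * b


for label, f, b in [
    ("white male", 0, 0),
    ("white female", 1, 0),
    ("black male", 0, 1),
    ("black female", 1, 1),
]:
    print(label, round(group_mean(f, b), 4))
```

The interaction coefficient b_f enters only when both dummies equal 1, so it affects the mean for black females alone.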
Empirical Bayes predictions: EAP ability scores
gllapred MIMIC, fac

(means and standard deviations will be stored in MIMICm1 MIMICs1)
Look at ability scores for each response and covariate pattern (Table 9.7)
drop d1-d4
reshape wide y, i(patt) j(item)
sort y1-y4 b f
list y1-y4 b f IRTm1 MIMICm1, nolab clean

       y1   y2   y3   y4   b   f        IRTm1      MIMICm1
  1.    0    0    0    0   0   0   -1.2242641   -.55473277
  2.    0    0    0    0   0   1   -1.2242641   -.87608943
  3.    0    0    0    0   1   0   -1.2242641   -1.4707461
  4.    0    0    0    0   1   1   -1.2242641   -1.4410752
  5.    0    0    0    1   0   0      -.14876    .26704248
  6.    0    0    0    1   0   1      -.14876   -.02615521
  7.    0    0    0    1   1   0      -.14876   -.54782301
  8.    0    0    0    1   1   1      -.14876   -.52233467
  9.    0    0    1    0   0   0   -.37675812    .18398721
 10.    0    0    1    0   0   1   -.37675812   -.11085131
 11.    0    0    1    0   1   0   -.37675812   -.63777142
 12.    0    0    1    0   1   1   -.37675812    -.6119613
 13.    0    0    1    1   0   0    .60364936    .97836482
 14.    0    0    1    1   0   1    .60364936    .68778691
 15.    0    0    1    1   1   0    .60364936    .19041653
 16.    0    0    1    1   1   1    .60364936    .21416676
 17.    0    1    0    0   0   0   -.43218061    .09469579
 18.    0    1    0    0   0   1   -.43218061   -.20221901
 19.    0    1    0    0   1   0   -.43218061   -.73534825
 20.    0    1    0    0   1   1   -.43218061   -.70916483
 21.    0    1    0    1   0   0    .55196233     .8893892
 22.    0    1    0    1   0   1    .55196233    .59958378
 23.    0    1    0    1   1   0    .55196233    .10115863
 24.    0    1    0    1   1   1    .55196233    .12502831
 25.    0    1    1    0   0   0    .33515561    .80656055
 26.    0    1    1    0   0   1    .33515561    .51719789
 27.    0    1    1    0   1   0    .33515561    .01729505
 28.    0    1    1    0   1   1    .33515561     .0413004
 29.    0    1    1    1   0   0    1.3035676    1.6228636
 30.    0    1    1    1   0   1    1.3035676    1.3180372
 31.    0    1    1    1   1   0    1.3035676    .81295145
 32.    0    1    1    1   1   1    1.3035676    .83659008
 33.    1    0    0    0   0   0    -.0351866     .3938274
 34.    1    0    0    0   0   1    -.0351866    .10259226
 35.    1    0    0    0   1   0    -.0351866   -.41203827
 36.    1    0    0    0   1   1    -.0351866   -.38699316
 37.    1    0    0    1   0   0    .93089004    1.1908602
 38.    1    0    0    1   0   1    .93089004    .89722329
 39.    1    0    0    1   1   0    .93089004    .40020559
 40.    1    0    0    1   1   1    .93089004    .42377722
 41.    1    0    1    0   0   0    .71352262    1.1065918
 42.    1    0    1    0   0   1    .71352262    .81436966
 43.    1    0    1    0   1   0    .71352262    .31757081
 44.    1    0    1    0   1   1    .71352262    .34119563
 45.    1    0    1    1   0   0    1.7045824    1.9484446
 46.    1    0    1    1   0   1    1.7045824    1.6312163
 47.    1    0    1    1   1   0    1.7045824     1.113084
 48.    1    0    1    1   1   1    1.7045824    1.1371117
 49.    1    1    0    0   0   0    .66179647    1.0169609
 50.    1    1    0    0   0   1    .66179647    .72595346
 51.    1    1    0    0   1   0    .66179647    .22887171
 52.    1    1    0    0   1   1    .66179647    .25257845
 53.    1    1    0    1   0   0    1.6485285    1.8501881
 54.    1    1    0    1   0   1    1.6485285    1.5370316
 55.    1    1    0    1   1   0    1.6485285    1.0234149
 56.    1    1    0    1   1   1    1.6485285    1.0472972
 57.    1    1    1    0   0   0    1.4181578    1.7596042
 58.    1    1    1    0   0   1    1.4181578    1.4499547
 59.    1    1    1    0   1   0    1.4181578     .9400666
 60.    1    1    1    0   1   1    1.4181578    .96383584
 61.    1    1    1    1   0   0     2.506591    2.6876937
 62.    1    1    1    1   0   1     2.506591    2.3322937
 63.    1    1    1    1   1   0     2.506591    1.7665629
 64.    1    1    1    1   1   1     2.506591    1.7923462
The scores differ a little from those in Table 9.7.
Mislevy, R. J. (1985). Estimation of latent group effects. Journal of the American Statistical Association 80, 993-997.
Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Boca Raton, FL: Chapman & Hall/CRC Press.