FAQs: Fitting binary IRT models in gllamm

www.gllamm.org

How do I fit IRT models for binary responses?

Title		Fitting binary IRT models in gllamm
Author		Minjeong Jeon, University of California, Berkeley
Date		July 2012

Item response data arise from tests or questionnaires and consist of responses y_ij to items i = 1, ..., I by persons j = 1, ..., J. It is assumed that a continuous latent trait θ_j, such as ability in the case of test items, explains the item responses via a model such as

logit[P(y_ij=1|θ_j)] = θ_j - β_i
where β_i is the item difficulty. The model has one parameter per item and is called a one-parameter logistic item response model. A discrimination parameter λ_i is sometimes introduced to allow the effect of the latent trait on the log-odds of a correct response to differ between items, giving the two-parameter logistic item response model

logit[P(y_ij=1|θ_j)] = λ_iθ_j - β_i
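
As a small worked example of the two formulas, the response probabilities can be computed with Stata's invlogit() function. The values used here (ability 1, difficulty 0.5, discrimination 2) are made up purely for illustration.

```stata
* Hypothetical values: theta = 1, beta = 0.5, lambda = 2
* One-parameter model: P(y = 1) = invlogit(theta - beta)
display invlogit(1 - 0.5)
* Two-parameter model: P(y = 1) = invlogit(lambda*theta - beta)
display invlogit(2*1 - 0.5)
```

The first probability is about 0.62 and the second about 0.82. With a discrimination above 1, the same ability advantage translates into a larger probability of a correct response.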

Data preparation

The data must be in long form, with all responses y_ij for persons j and items i stacked in one variable. The data also need a person identifier, an item identifier, and item indicator (or dummy) variables. For instance, if two persons each answered two items, the data should look like this:

pid   item  y   i1  i2
1       1   0   1   0
1       2   1   0   1
2       1   1   1   0
2       2   0   0   1
...

pid is the person identifier and item is the item identifier. y represents item responses and i1 and i2 are two item indicator (or dummy) variables.
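
If the data start out in wide form (one row per person), the layout described above can be produced with reshape and tabulate. The following is a minimal sketch that assumes the responses are stored in hypothetical variables y1 and y2; the values are invented for illustration.

```stata
* Hypothetical wide-form data: one row per person, one variable per item
clear
input pid y1 y2
1 0 1
2 1 0
end

* Stack the responses into one variable y; item becomes the item identifier
reshape long y, i(pid) j(item)

* Create one indicator variable per item (i1 and i2)
tabulate item, generate(i)

list pid item y i1 i2, sepby(pid)
```

With more items, the same commands apply unchanged; tabulate then creates one indicator per distinct value of item.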

One-parameter IRT models in gllamm

Suppose there are 5 binary items in the data. The syntax for fitting a one-parameter IRT model is

gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant adapt

i() specifies the person identifier.

The family() and link() options specify the conditional distribution of the responses and the link function. The logit link is used here with the binomial family to specify a one-parameter logistic item response model. For binary responses, one can choose the logit, probit, or cll (complementary log-log) link.

The noconstant option omits the constant from the model so that all 5 item dummy variables can be used as predictors.

The adapt option requests adaptive quadrature instead of the default ordinary Gauss-Hermite quadrature.
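
To try the command without real data, one can simulate responses from the one-parameter model and fit it. This is only a sketch: the sample size, seed, difficulties, and the variable names theta_true, beta, and eb are assumptions made for illustration, and gllamm must be installed first (for example with ssc install gllamm).

```stata
* Simulate 500 persons answering 5 items (all values are illustrative)
clear
set seed 12345
set obs 500
generate pid = _n
generate theta_true = rnormal()          // simulated latent trait
expand 5                                 // five rows per person
bysort pid: generate item = _n
generate beta = (item - 3)/2             // hypothetical difficulties -1, -.5, 0, .5, 1
generate y = runiform() < invlogit(theta_true - beta)
tabulate item, generate(i)               // item dummies i1-i5

* Fit the one-parameter model
gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant adapt

* Empirical Bayes predictions of the latent trait:
* posterior means go to ebm1, posterior standard deviations to ebs1
gllapred eb, u
```

Because the item dummies enter the linear predictor with a positive sign, their estimated coefficients correspond to -β_i in the notation used above, so an item with a large negative coefficient is a difficult one.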

Two-parameter IRT models in gllamm

To fit two-parameter IRT models, we need to specify an equation for the discrimination parameters, which are the factor loadings for the latent variable (the person random effect or ability). The syntax for fitting a two-parameter IRT model is

eq load: i1-i5
gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant eqs(load) nip(6) adapt

The first line uses the eq command to define an equation named load that contains the item dummies i1 to i5. The eqs(load) option then tells gllamm to multiply the latent variable by this linear combination of i1 to i5, so that the coefficients of the dummies in the equation are the discrimination parameters.

To speed up estimation, we can reduce the number of quadrature points with the nip() option, as in nip(6) above. The default is nip(8). Keep in mind that using fewer quadrature points can make the estimates less accurate.

Lastly, note that in this model formulation, the discrimination parameter for the first item is constrained to 1 for model identification.
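
One way to judge whether the discrimination parameters are needed is to fit both models to the same data and compare them with a likelihood-ratio test. The sketch below simulates data from a two-parameter model; the seed, sample size, parameter values, and the stored-estimate names onepl and twopl are assumptions for illustration, and it presumes that lrtest can be applied to stored gllamm results in the installed version.

```stata
* Simulate 500 persons answering 5 items with unequal discriminations
clear
set seed 2012
set obs 500
generate pid = _n
generate theta_true = rnormal()
expand 5
bysort pid: generate item = _n
generate beta = (item - 3)/2             // hypothetical difficulties
generate lambda = 1 + (item - 1)/4       // hypothetical discriminations 1, 1.25, ..., 2
generate y = runiform() < invlogit(lambda*theta_true - beta)
tabulate item, generate(i)

* One-parameter model (all discriminations equal)
gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant adapt
estimates store onepl

* Two-parameter model (free discriminations for items 2 to 5)
eq load: i1-i5
gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant eqs(load) adapt
estimates store twopl

* Likelihood-ratio test of the one-parameter against the two-parameter model
lrtest onepl twopl
```

Since the first discrimination parameter is fixed at 1, the two-parameter model adds four free parameters here, which is the degrees of freedom one would expect the test to report.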

Three-parameter IRT models in gllamm

There is no standard way of fitting a three-parameter IRT model in gllamm, but it is possible to fit the model if the guessing parameters are known or via a profile likelihood approach. (See Rabe-Hesketh, S. and Skrondal, A. (2007). Multilevel and latent variable modelling with composite links and exploded likelihoods. Psychometrika 72, 123-140.)

Examples and documentation

Standard one- and two-parameter IRT models
- Section 4.1 on One parameter and two parameter item-response models in Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2004). GLLAMM Manual. U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 160.
  - Law School Admission Test (LSAT) data
- Zheng, X. and Rabe-Hesketh, S. (2007). Estimating parameters of dichotomous and ordinal item response models using gllamm. The Stata Journal 7, 313-333.
  - Datasets and do-files: Use these commands in Stata:
    net sj 7-3 st0129
    net get st0129
- Hardouin, J. B. (2007). Rasch analysis: Estimation and tests with raschtest. The Stata Journal 7 (1), 22-44. (raschtest uses gllamm)
  - Datasets and do-files: Use these commands in Stata:
    net sj 7-1 st0119
    net install st0119
    net get st0119

Item response models with item and person predictors
- De Boeck, P. and Wilson, M. (Eds.) (2004). Explanatory Item Response Models: A Generalized Linear and Nonlinear Approach. New York: Springer.
  - Chapter 2. Descriptive and Explanatory Item Response Models
- Exercise 10.4 on Verbal aggression data in the book Rabe-Hesketh, S. and Skrondal, A. (2012). Multilevel and Longitudinal Modeling Using Stata (Third Edition). Volume II: Categorical Responses, Counts, and Survival. College Station, TX: Stata Press.
  - Download Chapter 10
  - Datasets for the book
- Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Chapman & Hall/CRC.
  - Section 9.4 Arithmetic reasoning: Item response models

References

Embretson, S. E. and Reise, S. P. (2000). Item Response Theory for Psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.
Rabe-Hesketh, S. and Skrondal, A. (forthcoming). GLLAMM software. In van der Linden, W. J. and Hambleton, R. K. (Eds.), Handbook of Item Response Theory: Models, Statistical Tools, and Applications. Boca Raton, FL: Chapman & Hall/CRC Press, volume 3, chapter 30.
Rabe-Hesketh, S. and Skrondal, A. (2008). Classical latent variable models for medical research. Statistical Methods in Medical Research 17, 5-32.
Zheng, X. and Rabe-Hesketh, S. (2007). Estimating parameters of dichotomous and ordinal item response models using gllamm. The Stata Journal 7, 313-333.