How do I fit IRT models for binary responses?
Title: Fitting binary IRT models in gllamm
Author: Minjeong Jeon, University of California, Berkeley
Date: July 2012
Item response data arise from tests or questionnaires and consist of responses $y_{ij}$ to items $i = 1, \dots, I$ by persons $j = 1, \dots, J$.
It is assumed that a continuous latent trait $\theta_j$, such as ability in the case of test items, explains the item responses via a model such as

$$\operatorname{logit}[P(y_{ij}=1 \mid \theta_j)] = \theta_j - \beta_i$$

where $\beta_i$ is the item difficulty. The model has one parameter per item and is called a one-parameter logistic item response model. A discrimination parameter $\lambda_i$ is sometimes introduced to allow the effect of the latent trait on the log-odds of a correct response to differ between items,

$$\operatorname{logit}[P(y_{ij}=1 \mid \theta_j)] = \lambda_i\theta_j - \beta_i$$

This is the two-parameter logistic item response model.
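As a quick numerical check of the two formulas, the response probability can be computed with Stata's invlogit() function. The values used below (latent trait 1, difficulty 0.5, discrimination 2) are made up purely for illustration and do not come from any data set.

```stata
* Hypothetical values, chosen only for illustration
scalar theta  = 1
scalar beta   = 0.5
scalar lambda = 2

* One-parameter model: P(y=1) = invlogit(theta - beta), about 0.622
display invlogit(theta - beta)

* Two-parameter model: P(y=1) = invlogit(lambda*theta - beta), about 0.818
display invlogit(lambda*theta - beta)
```

With a discrimination above 1, the same person has a higher probability of a correct response on this item because the latent trait is weighted more heavily.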
Data preparation
The data must be in long form, with all responses $y_{ij}$ for persons $j$ and items $i$ stacked in one variable. The data also require person and item identifiers and item indicator (or dummy) variables.
For instance, if two students each answered two items, the data should look like this:
pid  item  y  i1  i2
1    1     0  1   0
1    2     1  0   1
2    1     1  1   0
2    2     0  0   1
...
pid is the person identifier and item is the item identifier.
y contains the item responses, and i1 and i2 are the two item indicator (or dummy) variables.
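If the data start out in wide form (one row per person), they can be brought into this layout with reshape and tabulate. The following is a minimal sketch; the wide variable names y1 and y2 and the two-person data set are assumptions made for illustration.

```stata
* Hypothetical wide data: one row per person, responses in y1 and y2
clear
input pid y1 y2
1 0 1
2 1 0
end

* Stack the responses into one variable y, with item as the item identifier
reshape long y, i(pid) j(item)

* Create one indicator (dummy) variable per item: i1, i2
tabulate item, generate(i)

list pid item y i1 i2, sepby(pid)
```

The listing should reproduce the four rows shown above. With five items, the same commands create i1 to i5.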
One-parameter IRT models in gllamm
Suppose there are 5 binary items in the data.
The syntax for fitting a one-parameter IRT model is
gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant adapt
i() specifies the person identifier.
The family() and link() options specify the conditional distribution of the responses and the link function.
I used the logit link for the example binary data to specify a one-parameter logistic item response model.
For binary responses, one can choose the logit, probit, or cll (complementary log-log) link.
The noconstant option omits the constant from the model so that all five item dummy variables can be used as predictors.
The adapt option requests adaptive quadrature.
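Once the model has been fitted, the item and person estimates are usually of interest. The sketch below assumes the one-parameter model above has just been run; the prefix theta is an arbitrary name chosen here. Note that gllamm adds the fixed part of the model to the latent variable, so under the parameterization θj - βi the coefficient of each item dummy is the negative of the item difficulty.

```stata
* List the estimated parameter vector together with its coefficient names
matrix list e(b)

* Empirical Bayes (posterior mean) predictions of the latent trait;
* gllapred creates thetam1 (posterior means) and thetas1 (posterior SDs)
gllapred theta, u
```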
Two-parameter IRT models in gllamm
To fit two-parameter IRT models, we need to specify an equation for the discrimination parameters,
which are the factor loadings of the latent variable (the person random effect or ability).
The syntax for fitting a two-parameter IRT model is
eq load: i1-i5
gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant eqs(load) nip(6) adapt
Note that the first line uses the eq command to define an equation named load containing the item dummies i1 to i5.
The eqs() option then specifies that these variables form the linear combination that multiplies the latent variable in the model, so that their coefficients are the discrimination parameters.
To speed up estimation, we may reduce the number of quadrature points with the nip() option, as in nip(6) above.
The default is nip(8). Keep in mind that reducing the number of quadrature points may cost some precision in the estimates.
Lastly, note that in this model formulation, the discrimination parameter for the first item is constrained to 1 for model identification.
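One possible workflow, sketched below under the assumption that the data contain y, pid, and i1 to i5 as described, is to compare the one- and two-parameter models with a likelihood-ratio test and then refit the two-parameter model with more quadrature points to check that the estimates are stable. The stored names onepl and twopl, the matrix name a, and the choice of nip(12) are arbitrary; both models use the default number of quadrature points so that their log likelihoods are comparable.

```stata
* One-parameter model (equal discriminations)
gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant adapt
estimates store onepl

* Two-parameter model (item-specific discriminations)
eq load: i1-i5
gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant eqs(load) adapt
estimates store twopl

* Keep the two-parameter estimates to use as starting values later
matrix a = e(b)

* Likelihood-ratio test of equal discriminations
lrtest onepl twopl

* Refit with more quadrature points, starting from the previous estimates
gllamm y i1-i5, i(pid) family(bin) link(logit) noconstant eqs(load) nip(12) from(a) adapt
```

If the estimates and log likelihood barely change between the two fits, the smaller number of quadrature points was probably adequate.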
Three-parameter IRT models in gllamm
There is no standard way of fitting a three-parameter IRT model in gllamm,
but it is possible to fit the model if the guessing parameters are known or
via a profile likelihood approach (see Rabe-Hesketh, S. and Skrondal, A. (2007).
Multilevel and latent variable modelling with composite links and exploded likelihoods.
Psychometrika 72, 123-140).
Examples and documentation
- Standard one- and two-parameter IRT models
- Section 4.1 on one-parameter and two-parameter item response models in Rabe-Hesketh, S., Skrondal, A. and Pickles, A. (2004). GLLAMM Manual. U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 160.
- Item response models with item and person predictors
- De Boeck, P. and Wilson, M. (Eds.) (2004). Explanatory Item Response Models: A Generalized Linear and Nonlinear Approach. New York: Springer.
- Exercise 10.4 on verbal aggression data in Rabe-Hesketh, S. and Skrondal, A. (2012). Multilevel and Longitudinal Modeling Using Stata (Third Edition). Volume II: Categorical Responses, Counts, and Survival. College Station, TX: Stata Press.
- Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized Latent Variable Modeling: Multilevel, Longitudinal and Structural Equation Models. Chapman & Hall/CRC.
References
- Embretson, S. E. and Reise, S. P. (2000). Item Response Theory for Psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.
- Rabe-Hesketh, S. and Skrondal, A. (forthcoming). GLLAMM software. In van der Linden, W. J. and Hambleton, R. K. Handbook of Item Response Theory: Models, Statistical Tools, and Applications. Boca Raton, FL: Chapman & Hall/CRC Press, volume 3, chapter 30.
- Rabe-Hesketh, S. and Skrondal, A. (2008). Classical latent variable models for medical research. Statistical Methods in Medical Research 17, 5-32.
- Zheng, X. and Rabe-Hesketh, S. (2007). Estimating parameters of dichotomous and ordinal item response models using gllamm. The Stata Journal 7, 313-333.