Measurement equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Anxiety short forms in ethnically diverse groups

Jeanne A. Teresi, Katja Ocepek-Welikson, Marjorie Kleinman, Mildred Ramirez, Giyeon Kim


This is the first study of the measurement equivalence of the Patient Reported Outcomes Measurement Information System® (PROMIS®) Anxiety short forms in a large ethnically diverse sample.  The psychometric properties and differential item functioning (DIF) were examined across different racial/ethnic, educational, age, gender and language groups.   


These data are from individuals selected from cancer registries in the United States. For the analyses of race/ethnicity, the reference group was non-Hispanic Whites (n = 2,263); the studied groups were non-Hispanic Blacks (n = 1,117), Hispanics (n = 1,043) and Asians/Pacific Islanders (n = 907). Within the Hispanic subsample, 335 interviews were conducted in Spanish and 703 in English.  The 11 anxiety items were from the PROMIS emotional disturbance item bank.

DIF hypotheses were generated by content experts who rated whether or not they expected DIF to be present, and the direction of the DIF with respect to several comparison groups. The primary method used for DIF detection was the Wald test for examination of group differences in item response theory (IRT) item parameters, accompanied by magnitude measures.  Expected item scores were examined as measures of magnitude.  The method used for quantification of the difference in the average expected item scores was the non-compensatory DIF (NCDIF) index.  DIF impact was examined using expected scale score functions. Additionally, precision and reliabilities were examined using several methods.
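As an illustration of the magnitude measures described above, the following is a minimal sketch of how an expected item score and an NCDIF value might be computed. It assumes a graded response model with items scored 1 to 5, and it takes NCDIF as the mean squared difference between the focal-group and reference-group expected item scores, averaged over focal-group trait estimates. The function names and all parameter values are hypothetical and are not taken from the study.

```python
import math


def expected_item_score(theta, a, thresholds, lowest=1):
    """Expected score on one item under a graded response model.

    theta      : latent trait level (e.g. anxiety)
    a          : item discrimination
    thresholds : ordered category thresholds b_1 < b_2 < ...
    lowest     : score of the lowest category (assumed 1 here)
    """
    # P(X >= k | theta) for each threshold; their sum is the expected
    # number of category steps passed above the lowest category.
    steps = sum(1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds)
    return lowest + steps


def ncdif(focal_thetas, ref_params, focal_params):
    """Mean squared difference in expected item scores over the focal group.

    ref_params and focal_params are (a, thresholds) tuples estimated
    separately in the reference and focal groups.
    """
    diffs = [
        expected_item_score(t, *focal_params) - expected_item_score(t, *ref_params)
        for t in focal_thetas
    ]
    return sum(d * d for d in diffs) / len(diffs)


# Hypothetical parameters for a single five-category item.
ref_item = (2.0, [-0.5, 0.4, 1.2, 2.1])
focal_item = (1.8, [-0.3, 0.6, 1.3, 2.2])

# Hypothetical trait estimates for focal-group respondents.
focal_sample = [-1.5, -0.5, 0.0, 0.5, 1.0, 2.0]

print(round(ncdif(focal_sample, ref_item, focal_item), 4))
```

Summing the expected item scores over all items at a given trait level gives the expected scale score, which is the quantity compared between groups when scale-level DIF impact is examined.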


Although no items were hypothesized to show DIF for Asians/Pacific Islanders, every item evidenced DIF by at least one method. Two items showed DIF of higher magnitude for Asians/Pacific Islanders vs. Whites:  “Many situations made me worry” and “I felt anxious”.  However, the magnitude of DIF was small and the NCDIF statistics were not above threshold. The impact of DIF was negligible.  For education, six items were identified with consistent DIF across methods: fearful, anxious, worried, hard to focus, uneasy and tense.  However, the NCDIF was not above threshold and the impact of DIF on the scale was trivial.  No items showed high magnitude DIF for gender.  Two items showed slightly higher magnitude for age (although not above the cutoff): worried and fearful.  The scale level impact was trivial.  Only one item showed DIF with the Wald test after the Bonferroni correction for the language comparisons: “I felt fearful”.  Two additional items were flagged in sensitivity analyses after Bonferroni correction: anxious and many situations made me worry.  The latter item also showed DIF of higher magnitude, with an NCDIF value (0.144) above threshold.  Individual impact was relatively small.

Conclusions:  Although many items from the PROMIS short-form anxiety measures were flagged with DIF, item level magnitude was low and scale level DIF impact was minimal; however, three items (anxious, worried, and many situations made me worry) might be singled out for further study. It is concluded that the PROMIS anxiety short form evidenced good psychometric properties, was relatively invariant across the groups studied, and performed well among ethnically diverse subgroups of Blacks, Hispanics, non-Hispanic Whites and Asians/Pacific Islanders.  In general, more research with the Asians/Pacific Islanders group is needed.  Further study of subgroups within these broad categories is recommended.


anxiety; PROMIS; item response theory; differential item functioning; ethnic diversity

