INTERNATIONAL TEST COMMISSION
International Test Commission
Guidelines for Translating and
Adapting Tests
Version 2010
© 2010, International Test Commission.
Please cite as:
International Test Commission (2010). International Test Commission Guidelines for Translating and
Adapting Tests. [http://www.intestcom.org]
International Test Commission Guidelines for Translating and Adapting Tests
In 1992 the International Test Commission (ITC) began a project to prepare
guidelines for translating and adapting tests and psychological instruments, and
establishing score equivalence across language and/or cultural groups. Several
organizations assisted the ITC in preparing the guidelines: European Association of
Psychological Assessment, European Test Publishers Group, International Association
for Cross-Cultural Psychology, International Association of Applied Psychology,
International Association for the Evaluation of Educational Achievement, International
Language Testing Association and International Union of Psychological Science. A
committee of 12 representatives from these organizations worked for several years to
prepare 22 guidelines, which were subsequently field-tested (see, for example,
Hambleton, 2001; Hambleton, Merenda, & Spielberger, 2005; Hambleton, Yu, & Slater,
1999; Tanzer & Sim, 1999). The guidelines were then approved by the ITC for
distribution to national psychological societies, test publishers, and researchers. The
guidelines, organized into four categories, appear below:
Context
C.1 Effects of cultural differences which are not relevant or important to the main
purposes of the study should be minimized to the extent possible.
C.2 The amount of overlap in the construct measured by the test or instrument in the
populations of interest should be assessed.
Test Development and Adaptation
D.1 Test developers/publishers should ensure that the adaptation process takes full
account of linguistic and cultural differences among the populations for whom
adapted versions of the test or instrument are intended.
D.2 Test developers/publishers should provide evidence that the language used in the
directions, rubrics, and items themselves, as well as in the handbook, is
appropriate for all cultural and language populations for whom the test or
instrument is intended.
D.3 Test developers/publishers should provide evidence that the chosen testing
techniques, item formats, test conventions, and procedures are familiar to all
intended populations.
D.4 Test developers/publishers should provide evidence that item content and
stimulus materials are familiar to all intended populations.
D.5 Test developers/publishers should implement systematic judgmental evidence,
both linguistic and psychological, to improve the accuracy of the adaptation
process and compile evidence on the equivalence of all language versions.
D.6 Test developers/publishers should ensure that the data collection design permits
the use of appropriate statistical techniques to establish item equivalence between
the different language versions of the test or instrument.
D.7 Test developers/publishers should apply appropriate statistical techniques to (1)
establish the equivalence of the different versions of the test or instrument, and (2)
identify problematic components or aspects of the test or instrument which may
be inadequate to one or more of the intended populations.
D.8 Test developers/publishers should provide information on the evaluation of
validity in all target populations for whom the adapted versions are intended.
D.9 Test developers/publishers should provide statistical evidence of the equivalence
of questions for all intended populations.
D.10 Non-equivalent questions between versions intended for different populations
should not be used in preparing a common scale or in comparing these
populations. However, they may be useful in enhancing content validity of scores
reported for each population separately.
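As an illustration of the kind of statistical technique referred to in D.6, D.7, and D.9, the following is a minimal sketch of a Mantel-Haenszel check for differential item functioning on a single dichotomously scored item. The guidelines do not prescribe this or any other specific method; it is shown only as one commonly used option. The function name, the group labels "ref" and "focal", and the sample records are hypothetical. Examinees from the two language versions are matched on total score, and the common odds ratio is also expressed on the ETS delta scale (-2.35 times its natural logarithm).

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(responses):
    """Mantel-Haenszel DIF statistic for one dichotomous item.

    responses: iterable of (group, total_score, item_correct) tuples, where
    group is "ref" (source-language version) or "focal" (adapted version).
    These labels are illustrative assumptions, not part of the guidelines.

    Returns (alpha, delta): the common odds ratio and its value on the
    delta scale, or None if the statistic cannot be computed.
    """
    # One 2x2 table per matching score level:
    # [ref correct, ref incorrect, focal correct, focal incorrect]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in responses:
        cell = strata[score]
        if group == "ref":
            cell[0 if correct else 1] += 1
        else:
            cell[2 if correct else 3] += 1

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n

    if numerator == 0 or denominator == 0:
        return None  # odds ratio undefined or not log-transformable

    alpha = numerator / denominator
    delta = -2.35 * math.log(alpha)
    return alpha, delta


# Hypothetical records: (group, total score used for matching, item correct?)
sample = (
    [("ref", 10, True)] * 3 + [("ref", 10, False)] * 1
    + [("focal", 10, True)] * 1 + [("focal", 10, False)] * 3
)
print(mantel_haenszel_dif(sample))
```

An odds ratio near 1 (delta near 0) suggests that matched examinees in the two groups find the item about equally difficult. Values far from these would flag the item for the judgmental review described in D.5 rather than settle the question on their own.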
Administration
A.1 Test developers and administrators should try to anticipate the types of problems
that can be expected, and take appropriate actions to remedy these problems
through the preparation of appropriate materials and instructions.
A.2 Test administrators should be sensitive to a number of factors related to the
stimulus materials, administration procedures, and response modes that can
moderate the validity of the inferences drawn from the scores.
A.3 Those aspects of the environment that influence the administration of a test or
instrument should be made as similar as possible across populations of interest.
A.4 Test administration instructions should be in the source and target languages to
minimize the influence of unwanted sources of variation across populations.
A.5 The test manual should specify all aspects of the administration that require
scrutiny in a new cultural context.
A.6 The administrator should be unobtrusive and the administrator-examinee
interaction should be minimized. Explicit rules that are described in the manual
for administration should be followed.
Documentation/Score Interpretations
I.1 When a test or instrument is adapted for use in another population, documentation
of the changes should be provided, along with evidence of the equivalence.
I.2 Score differences among samples of populations administered the test or
instrument should not be taken at face value. The researcher has the
responsibility to substantiate the differences with other empirical evidence.
I.3 Comparisons across populations can only be made at the level of invariance that
has been established for the scale on which scores are reported.
I.4 The test developer should provide specific information on the ways in which the
socio-cultural and ecological contexts of the populations might affect
performance, and should suggest procedures to account for these effects in the
interpretation of results.
The guidelines and suggestions for implementing them can be found in
Hambleton, Merenda, and Spielberger (2005), Muniz and Hambleton (1997), van de
Vijver and Hambleton (1996), and van de Vijver and Tanzer (1997). The best reference
for citing the guidelines is Hambleton, Merenda, and Spielberger (2005, chapter 1).
These guidelines have become a frame of reference for many psychologists working in
the test translation and adaptation area, and more general adoption of the guidelines can
be expected in the coming years as the guidelines are more widely disseminated and the
standards for translating and adapting tests are increased.
From a practical point of view, two major contexts can be distinguished for
applying the ITC guidelines: (1) the translation/adaptation of existing tests and
instruments, and (2) the development of new tests and instruments for international use.
The first context refers to the situation where tests and instruments that have originally
been developed in a particular language for use in some national context are to be made
appropriate for use in one or more other languages and/or national contexts. Often in
such cases the aim of the translation/adaptation process is to produce a test or instrument
with psychometric qualities comparable to those of the original. Even for non-verbal tests,
adaptations are necessary not only of the accompanying verbal materials for
administration and score interpretation but also of graphic materials in the test proper to
avoid cultural bias. Growing recognition of multiculturalism has raised awareness of the
need to provide for multiple language versions of tests and instruments intended for use
within a single national context.
The second context refers to the development of tests and instruments that from
their conception are intended for international comparisons. The advantage here is that
versions for use in different languages and/or different national contexts can be
developed in parallel, i.e., there is no need to maintain a pre-existing set of psychometric
qualities. The problem here often lies in the sheer size of the operation: the large number
of versions that need to be developed and the many people involved in the development
process.
References
Hambleton, R. K. (2001). The next generation of the ITC test translation and adaptation
guidelines. European Journal of Psychological Assessment, 17(3), 164-172.
Hambleton, R. K., Merenda, P., & Spielberger, C. (Eds.). (2005). Adapting educational
and psychological tests for cross-cultural assessment. Hillsdale, NJ: Lawrence S.
Erlbaum Publishers.
Hambleton, R. K., Yu, J., & Slater, S. C. (1999). Field-test of the ITC Guidelines for
Adapting Psychological Tests. European Journal of Psychological Assessment,
15, 270-276.
Muniz, J., & Hambleton, R. K. (1997). Directions for the translation and adaptation of
tests. Papeles del Psicologo, August, 63-70.
Tanzer, N. K., & Sim, C. O. E. (1999). Adapting instruments for use in multiple
languages and cultures: A review of the ITC guidelines for test adaptations.
European Journal of Psychological Assessment, 15, 258-269.
van de Vijver, F. J. R., & Hambleton, R. K. (1996). Translating tests: Some practical
guidelines. European Psychologist, 1, 89-99.
van de Vijver, F. J. R., & Tanzer, N. K. (1997). Bias and equivalence in cross-cultural
assessment: An overview. European Review of Applied Psychology, 47(4), 263-279.
Current Version: January 17, 2010