Bulletin of the World Health Organization

Developing the World Health Organization Disability Assessment Schedule 2.0

T Bedirhan Üstün a, Somnath Chatterji a, Nenad Kostanjsek a, Jürgen Rehm b, Cille Kennedy c, Joanne Epping-Jordan d, Shekhar Saxena a, Michael von Korff e, Charles Pull f & in collaboration with WHO/NIH Joint Project

a. World Health Organization, avenue Appia 20, 1211 Geneva 27, Switzerland.
b. University of Toronto, Toronto, Canada.
c. National Institutes of Health (NIH), Department of Health and Human Services, Washington, United States of America (USA).
d. Independent health consultant (formerly with the World Health Organization), Geneva, Switzerland.
e. Group Health Cooperative, Seattle, USA.
f. Centre Hospitalier de Luxembourg, Luxembourg.

Correspondence to T Bedirhan Üstün (e-mail: ustunb@who.int).

(Submitted: 13 August 2009 – Revised version received: 27 April 2010 – Accepted: 30 April 2010 – Published online: 20 May 2010.)

Bulletin of the World Health Organization 2010;88:815-823. doi: 10.2471/BLT.09.067231

Introduction

Information on disability is an important component of health information, as it shows how well an individual is able to function in general areas of life. Along with traditional indicators of a population's health status, such as mortality and morbidity rates, disability has become important in measuring disease burden, in evaluating the effectiveness of health interventions and in planning health policy. Defining and measuring disability, however, has been challenging. The World Health Organization (WHO) has tried to address the problem by establishing an international classification scheme known as the International Classification of Functioning, Disability and Health (ICF).1 Nevertheless, all standard instruments for measuring disability and health need to be linked conceptually and operationally to the ICF to allow comparisons across different cultures and populations.

To address this need for a standardized cross-cultural measurement of health status and in response to calls for improving the scope and cultural adaptability of the original World Health Organization Disability Assessment Schedule (WHODAS),2–9 WHO developed a second version (WHODAS 2.0) as a general measure of functioning and disability in major life domains. This paper reports on the development strategy and the metric properties of the WHODAS 2.0.

Conceptual framework for WHODAS 2.0

The WHODAS 2.0 is grounded in the conceptual framework of the ICF and captures an individual's level of functioning in six major life domains: (i) cognition (understanding and communication); (ii) mobility (ability to move and get around); (iii) self-care (ability to attend to personal hygiene, dressing and eating, and to live alone); (iv) getting along (ability to interact with other people); (v) life activities (ability to carry out responsibilities at home, work and school); (vi) participation in society (ability to engage in community, civil and recreational activities). All domains were developed from a comprehensive set of ICF items and made to correspond directly with ICF’s “activity and participation” dimension (Table 1), which is applicable to any health condition. For all six domains, the WHODAS 2.0 provides a profile and a summary measure of functioning and disability that is reliable and applicable across cultures in adult populations.

The WHODAS 2.0 is used for many purposes. It can be used for conducting population surveys,10–15 for registers16 and for monitoring individual patient outcomes in clinical practice and in clinical trials of treatment effects.17–27

Methods

The WHODAS 2.0 was constructed through a process involving extensive review and field-testing, as described in the following sections.

Existing measures

In preparation for the development of the WHODAS 2.0, we conducted a review of existing measurement instruments and of the literature on the conceptual aspects and measurement of functioning and disability. The instruments we chose included various measures of disability, handicap, quality of life and other aspects of health, such as the ability to perform the activities of daily living (including instrumental ones), as well as global and specific measures of well-being (including subjective well-being).28,29 We compiled information from more than 300 instruments in a database showing a common pool of items, along with the origin and known psychometric properties of each instrument. An Instrument Development Task Force composed of international experts reviewed the database and pooled the items in it using the ICF as the common framework.

Research study and field testing

Since the WHODAS 2.0 was developed primarily to allow cross-cultural comparisons, it was based on an extensive cross-cultural study spanning 19 countries around the world.30 The items included in the WHODAS 2.0 were selected after exploring how health status is assessed in different cultures through a process that involved linguistic analysis of health-related terms, interviews with key informants and focus group discussions, as well as qualitative methods (e.g. pile sorting and concept mapping).

The development of the WHODAS 2.0 also involved field testing across countries in two waves (Appendix A, available at: http://www.who.int/icidh/whodas/). Wave 1 focused on 96 items proposed for inclusion in the instrument being developed. In these initial field testing studies, empirical feedback was obtained on the metric qualities of the proposed items, possible redundancy, screener performance in predicting the results of the full instrument, rating scales and the suitability of different disability recall time frames (e.g. 1 week, 1 month, 3 months, 1 year or lifetime). The studies also included cognitive interviews to determine how well the respondents understood the questions and reacted to the contents of the instrument. The second wave of field testing studies involved checking the reliability of a shortened, 36-item version of the WHODAS 2.0 by means of a standard statistical procedure, in line with classic test and item response theory (IRT).

For each wave of field testing, the overall study design required the presence of four different groups at each site, all having an equal number of subjects. The groups were composed of: (i) members of the general population in apparent good health; (ii) people with physical disorders; (iii) people with mental or emotional disorders; and (iv) people with problems related to alcohol or drug use. Subjects 18 years of age or older, divided equally into males and females, were recruited at each site.

Statistical analysis

Reliability was assessed by having a different interviewer repeat the interviews one week later, on average. The results were expressed in terms of kappa and intraclass correlation coefficients. Internal consistency was assessed by calculating Cronbach's alpha (α) coefficient for patients at baseline. Pearson's correlation coefficient (r) was used to determine concurrent validity between the WHODAS 2.0 and other generic health status and disability measures. Principal components analysis was used to assess the construct validity of the scales. All items in the WHODAS 2.0 were tested against the Partial Credit Model for ordinality. The paired t-test was used for assessing the responsiveness of WHODAS 2.0 scores to clinical intervention.

Results

General application

The WHODAS 2.0 was found to perform well in widely different cultures, among different subgroups of the general population, among people with physical disorders and among those with mental health problems or addictions. Respondents found the questionnaire meaningful, relevant and interesting. The WHODAS 2.0 has already been translated into 27 languages following a rigorous WHO translation and back-translation protocol. Linguistic analysis and expert opinion survey results showed the content to be comparable and equivalent in different cultures, as was later confirmed by psychometric tests. The interview time was 5 minutes for the 12-item version and 20 minutes for the 36-item version. The 96-item version was found to require an interview time of 63−94 minutes.

In cognitive interviews, most respondents preferred the 30-day time frame and many pointed out problems in remembering with longer time frames. Regarding the concept of “difficulty”, some respondents reported reasons other than health, including having too little time, too little money or too much to do – all of which were outside the definition of limitation in functioning due to a health condition.

Item reduction

Using the field trials data, we reduced to 34 the 96 items proposed for inclusion in the WHODAS 2.0 in accordance with classic test theory and item response theory. We also added two more items – one about sexual activity and another about the impact of the health condition on the family – based on suggestions from field interviewers and on the results of the expert opinion survey. A repeat survey confirmed the face validity of the resulting 36-item version. Scores in the six selected domains explained more than 95% of the variance in the total score on the 96-item version. Repeated factor analysis showed the same structure for all domains.

36-item factor structure

In all cultures and populations tested, factor analysis of the WHODAS 2.0 revealed a robust factor structure on two levels: a first level consisting of a general disability factor, and a second level composed of the six WHODAS 2.0 domains representing different life areas (Fig. 1). On confirmatory factor analysis, the factor structure was similar across the different study sites and populations tested. The results of independent wave 2 field testing essentially replicated this factor structure as well.

Fig. 1. Factor structure of the World Health Organization Disability Assessment Schedule 2.0, 36-item version, in formative field studies
ICC, intraclass correlation coefficient.

Internal consistency

Internal consistency, a measure of the correlation between items in a proposed scale, was very good for WHODAS 2.0 domains. Cronbach's α coefficients for the different domains were as follows: cognition (6 items), 0.86; mobility (5 items), 0.90; self-care (4 items), 0.79; getting along (5 items), 0.84; life activities for home (4 items), 0.98; life activities for work (4 items), 0.96; and participation in society (8 items), 0.84. Total internal consistency of the WHODAS 2.0 was 0.96 for 36 items.
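
As an illustration of how a coefficient of this kind is obtained, the following minimal sketch computes Cronbach's α from item-level responses using the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of the total score). The function name and the example data are hypothetical and are not taken from the field studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a scale.

    rows: one list per respondent, each holding that respondent's
    scores on the k items of the scale (no missing values).
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("each respondent needs the same k >= 2 item scores")
    # Sample variance of each item across respondents.
    item_variances = [variance(col) for col in zip(*rows)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of four people to a four-item domain (coded 1-5).
example = [
    [1, 1, 2, 1],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(example)
```

Values close to 1 indicate that the items vary together, which is the pattern reported for the WHODAS 2.0 domains above.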

IRT characteristics

WHODAS 2.0 showed very good IRT characteristics, indicative of the comparability of the assessment across different populations. In wave 1 of field testing, good response functions were one of the criteria for selecting items.

In field testing wave 2, items in the 36-item version fulfilled the Rasch characteristics. All items were compatible with specific objective measurements using a Partial Credit Model.
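
For readers unfamiliar with the Partial Credit Model, the sketch below shows its standard form for a single item: the probability of each response category is proportional to the exponential of the cumulative sum of (θ − δ_j), where θ is the person's level of disability and δ_j are the item's step thresholds. The threshold values used here are hypothetical and are not estimates from the field studies.

```python
import math


def pcm_probabilities(theta, thresholds):
    """Category probabilities for one item under the Partial Credit Model.

    theta: person location on the latent trait.
    thresholds: step parameters delta_1..delta_m; an item with five
    response categories (none to extreme) has four thresholds.
    Returns a list of m + 1 probabilities, one per category.
    """
    # Category 0 has an empty sum (0); category x adds (theta - delta_j)
    # for every step j up to x.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical thresholds for a five-category item.
probs = pcm_probabilities(theta=0.5, thresholds=[-1.0, 0.0, 1.0, 2.0])
```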

Test-retest reliability

The WHODAS 2.0 showed good test–retest reliability, a measure of the instrument's stability in repeated applications. Results of the reliability analysis are shown in Fig. 2 at the item, domain and general instrument levels. The intraclass correlation coefficient ranged from 0.69 to 0.89 at the item level and from 0.93 to 0.96 at the domain level, and it was 0.98 overall. More detailed analyses by country, region and demographic and other variables are reported separately.31

Fig. 2. Test–retest reliability of the World Health Organization Disability Assessment Schedule 2.0, 36-item version (a)
ICC, intraclass correlation coefficient. (a) Field testing wave 2 (n total = 1565; n for the ICC depends on the domain, e.g. on how many subjects responded to all items at both time points: D1, 1448; D2, 1529; D3, 1430; D4, 1222; D5(1), 1399; D5(2), only with remunerated work, 808; D6, 1431).
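
The form of the intraclass correlation coefficient used in the field studies is not stated here. As one possible illustration, the sketch below computes a one-way random-effects ICC for a test and a retest score per subject, (MSB − MSW) / (MSB + (k − 1) × MSW) with k = 2. The choice of this form, the function name and the example data are all assumptions.

```python
def icc_one_way(pairs):
    """One-way random-effects ICC for test-retest data.

    pairs: list of (first_score, second_score) tuples, one per subject.
    """
    n = len(pairs)
    k = 2  # two measurements per subject
    if n < 2:
        raise ValueError("at least two subjects are required")
    grand_mean = sum(a + b for a, b in pairs) / (n * k)
    subject_means = [(a + b) / k for a, b in pairs]
    # Between-subject mean square.
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    msw = sum(
        (a - m) ** 2 + (b - m) ** 2 for (a, b), m in zip(pairs, subject_means)
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical domain scores at the first and second interview.
retest = [(10, 12), (25, 24), (40, 43), (55, 52), (70, 71)]
icc = icc_one_way(retest)
```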

Concurrent validity

Concurrent validity, a measure of how well the WHODAS 2.0 results correlate with the results of other instruments that measure the same disability constructs, is summarized in Table 2. The table shows the correlation coefficients for relevant domains in comparisons with other instruments that are widely known, such as the WHO Quality of Life measure (WHOQOL),32 the London Handicap Scale (LHS),33 the Functional Independence Measure (FIM)34 and the Short Form Health Survey (SF).35–37 As expected, the highest correlation coefficients were found for specific domains measuring similar constructs, such as the FIM and WHODAS 2.0 mobility domains. Most other coefficients were between 0.45 and 0.65, which suggests not only that the WHODAS 2.0 and other recognized tests have similar constructs, but also that the WHODAS 2.0 is measuring something different. In addition, the WHODAS 2.0 score correlated with the number of days in which household tasks were reduced (r = 0.52) and with the number of absences from work lasting half a day or more (r = 0.63). The overall score on the WHODAS 2.0 was highly correlated with the overall score on the LHS (r = 0.75), the WHOQOL (r = 0.68) and the FIM (r = 0.68). It was less strongly correlated with SF mental health component scores (r = 0.17) because the SF measures signs of depression rather than functioning per se. What is important is that WHODAS 2.0 domain scores correlate highly with scores on comparable instruments designed to measure disability in specific areas (e.g. the FIM motor scale, r = 0.67 and the SF-36 Physical Component Score, r = 0.66). The correlation coefficients obtained indicate that the WHODAS 2.0 is measuring what it aims to measure (i.e. day-to-day functioning across a range of activity domains).

Subgroup analysis

WHODAS 2.0 is able to differentiate between special types of disabilities in patients belonging to different clinical subgroups. Domain and total scores are shown in Fig. 3. Results for disability domain profiles for different populations were all in the expected direction. For example, the group with physical health problems showed higher scores in “getting around”, whereas groups with mental health problems and drug problems showed higher scores in “getting along with people”. This confirms that the instrument has face validity. People drawn from the general population got lower scores in all domains and a lower general score than people in specific treatment subgroups. Individuals on treatment for mental problems or addictions reported more difficulty with cognitive activities and with getting along than patients on treatment for physical problems, who showed greater difficulty (i.e. scored higher) getting around and performing self-care. Participation in community activities was most difficult for drug users.

Fig. 3. World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0): domain profile by subgroup

Screening properties

In wave 2 field trials, the 12-item short version of the WHODAS 2.0 explained 81% of the variance of the 36-item version. For each domain, the 12-item version included two sentinel items with good screening properties that identified over 90% of all individuals with even mild disabilities when tested on all 36 items.

Scoring WHODAS 2.0

Multiple ways to score WHODAS 2.0 were compared in terms of their information value and practicality in daily use. As a result, two ways to compute the summary scores, namely simple and complex scoring, were found useful. In simple scoring, the scores assigned to each of the items (none, 1; mild, 2; moderate, 3; severe, 4; and extreme, 5) are summed up without recoding or collapsing response categories. Simple scoring is practical to do by hand and may be preferable for busy clinical settings or interviews. The simple scoring of WHODAS 2.0 is only specific to the sample at hand and should not be assumed to be comparable across populations. The psychometric properties of the WHODAS 2.0, namely its one-dimensional structure with high internal consistency, make it possible to add the scores.38 In complex scoring, also known as item response theory-based scoring,39 multiple levels of difficulty for each WHODAS 2.0 item are allowed for. Complex scoring makes more fine-grained analyses possible, since the information for the response categories is used in full for comparative analysis across populations or subpopulations. With item response theory-based scoring for WHODAS 2.0, each item response (none, mild, moderate, severe and extreme) is treated separately and the summary score is generated with a computer by differentially weighting the items and the levels of severity.

In addition to the total scores, WHODAS 2.0 also makes it possible to compute domain-specific scores for cognition, mobility, self-care, getting along, life activities (at home and at work) and social participation. We used SPSS software, version 10 (SPSS Inc., Chicago, United States of America), to compute the summary score. Both the program and the domain scores are available at: http://www.who.int/icidh/whodas/
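
Simple scoring as described above can be sketched in a few lines. The example below sums the item codes (none, 1, to extreme, 5) by domain and overall, using the item counts per domain reported under internal consistency. It is an illustration only, not the SPSS program mentioned above: the assumption that the 36 responses are ordered domain by domain, the function name and the absence of any missing-data handling are all hypothetical simplifications.

```python
# Number of items per domain in the 36-item version, as reported above.
DOMAIN_ITEM_COUNTS = {
    "cognition": 6,
    "mobility": 5,
    "self_care": 4,
    "getting_along": 5,
    "life_activities_home": 4,
    "life_activities_work": 4,
    "participation": 8,
}

# Item codes used in simple scoring.
RESPONSE_CODES = {"none": 1, "mild": 2, "moderate": 3, "severe": 4, "extreme": 5}


def simple_scores(responses):
    """Sum item codes per domain and overall (simple scoring).

    responses: 36 response labels, assumed to be ordered domain by
    domain in the order of DOMAIN_ITEM_COUNTS (an assumption made for
    this sketch, not the official item order).
    """
    expected = sum(DOMAIN_ITEM_COUNTS.values())
    if len(responses) != expected:
        raise ValueError(f"expected {expected} responses, got {len(responses)}")
    codes = [RESPONSE_CODES[label] for label in responses]
    scores = {}
    start = 0
    for domain, count in DOMAIN_ITEM_COUNTS.items():
        scores[domain] = sum(codes[start:start + count])
        start += count
    scores["total"] = sum(codes)
    return scores


# Hypothetical respondent: mild difficulty on every mobility item only.
answers = ["none"] * 6 + ["mild"] * 5 + ["none"] * 25
result = simple_scores(answers)
```

Item response theory-based (complex) scoring weights items and severity levels differentially and is not reproduced by this sketch.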

WHODAS 2.0 provides standard scores for the general population derived from large international samples against which individuals or groups can be compared, as was done in the reliability and validity study conducted in wave 2 of the WHODAS 2.0 development process and in the WHO Multi-Country Survey Study.12 Fig. 4 gives the population standard scores for IRT-based scoring of the 36-item WHODAS 2.0. Accordingly, an individual with 22 positive item responses would represent the 80th percentile. Summary scores and population percentiles for item response theory-based scoring of the 12-item WHODAS 2.0 are also available. Details are available in the WHODAS 2.0 training manual40 and at: http://www.who.int/icidh/whodas/

Fig. 4. Population distribution of scores on the World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0), 36-item version
IRT, item response theory.

Responsiveness

When the standardized response mean (that is, the change in mean score divided by the standard deviation of the change in score) was used as a measure of effect size, the WHODAS 2.0 was found to be at least as sensitive to change as comparable functioning scales. For example, Fig. 5 shows WHODAS 2.0 responsiveness as noted in the case of treatment for depression in patients from four different countries. Effect sizes for the WHODAS 2.0, which ranged from 0.44 to 1.07, are comparable to those obtained with established functioning scales. Similar effect sizes (0.44–1.38) were obtained for interventions targeting individuals with schizophrenia, osteoarthritis, back pain and alcohol dependence.41
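
The effect size defined above (mean change divided by the standard deviation of the change) can be computed as in the following minimal sketch. Change is taken here as baseline minus follow-up, so that a positive value indicates a reduction in disability; this sign convention, the function name and the example scores are assumptions made for illustration.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the standard deviation of the change.

    baseline, follow_up: scores for the same subjects, in the same order.
    Change is baseline minus follow-up, so improvement (a lower
    disability score after treatment) gives a positive value.
    """
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow-up must have the same length")
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical total scores before and after treatment.
before = [40, 50, 60, 70]
after = [30, 45, 50, 55]
effect_size = standardized_response_mean(before, after)
```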

Fig. 5. Responsiveness (sensitivity to change) of the World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0), 36-item version (SF 36), as noted in the case of treatment for depression
LHS, London Handicap Scale; MCS, mental component summary. (a) The effect size represents the change in mean value divided by one standard deviation.

Discussion

Stringent tests performed during WHODAS 2.0 development have shown that the WHODAS 2.0 can be used across cultures, sexes and age groups, as well as for different types of diseases and health conditions. The instrument covers key life activities well. The 12-item version of the WHODAS 2.0 can be administered in less than 5 minutes and the 36-item version in less than 20 minutes during interviews, and in 5 to 10 minutes when self-administered or administered by proxy. Scores are easily obtained and interpreted. They represent multidimensional disability based on the ICF, and the underlying factor structure is robust. Details and instructions on how to administer different versions of the WHODAS 2.0 and compute its scores can be found in the WHODAS 2.0 training manual.40

WHODAS 2.0 has good psychometric qualities, including good reliability and item-response characteristics, and its robust factor structure remains the same across cultures and in different patient populations. It shows concurrent validity when compared with other measures of disability or health status or with clinician ratings. These findings have been replicated across different countries and in a wide range of patient and general population samples. Thus, the WHODAS 2.0 can be used to assess individual patients as well as to explore differences between groups.

Field trials of the use of WHODAS 2.0 in health services research have focused on responsiveness, that is, on how well WHODAS 2.0 can detect changes following treatment under specific conditions. We use the WHODAS 2.0 to predict disability-related outcomes such as health care utilization, costs and work productivity, and we have compared its predictive validity to that of other disability measures.

In the Multi-country Survey Study that was conducted in 12 WHO Member States, WHODAS 2.0 was administered to randomly selected adults from the general population in face-to-face interviews.12 These surveys have been used to formulate a descriptive system of disability weights (i.e. utilities) for use in summary measures of population health, such as disability-adjusted life years. Econometric methods, such as time trade-off or person trade-off tools, have proved useful in eliciting disability weights. However, the application of these methods in general population surveys is problematic. Descriptive methods, such as application of WHODAS 2.0, are not only easier to apply but also yield more reliable indices for disability weights.

The WHODAS 2.0 has several limitations. It covers mainly the activities and participation domains of the ICF, so bodily impairments and environmental factors are not included. This design decision was made during the initial phase of development. However, work is under way to develop an additional module for bodily impairments.40 Furthermore, the WHODAS 2.0 is only applicable to adult populations. After the ICF for children and youth (ICF-CY) was published in 2007, plans were initiated to develop a version of the WHODAS 2.0 for children and youth.42

The WHODAS 2.0 framework can be applied in different formats for uses such as clinical interviews or telephone interviews. The feasibility and reliability of these applications are currently being determined. Computer-adaptive testing, a novel method for shortening the application, will enhance the feasibility of using the WHODAS 2.0 in different studies. Population standard norms will be continuously improved during future applications of the WHODAS 2.0. Similarly, item banking for different clinical intervention trials will enable comparative effectiveness studies.

In summary, WHODAS 2.0 has the potential to serve as a reliable and valid tool for assessing functioning and disability across countries, populations and diseases. It provides data that are culturally meaningful and comparable. Thus, it can be used as a common metric for assessing the level of functioning in individuals with different health conditions as well as in the general population. The WHODAS 2.0 can be used in surveys and in clinical research settings and it can generate information of use in evaluating health needs and the effectiveness of interventions to reduce disability and improve health.


Acknowledgements

The Task Force on Assessment Instruments also included: Elizabeth Badley, Cille Kennedy, Ronald Kessler, Michael Von Korff, Martin Prince, Karen Ritchie, Ritu Sadana, Gregory Simon, Robert Trotter and Durk Wiersma.

The following are the WHO collaborative investigators involved in the WHO/NIH Joint Project: Gavin Andrews (Australia); Thomas Kugener (Austria); Kruy Kim Hourn (Cambodia); Yao Guizhong (China); Jesús Saiz (Cuba); Venos Malvreas (Greece); R Srinivasan Murty (India, Bangalore); R Thara (India, Chennai); Hemraj Pal (India, Delhi); Matilde Leonardi, Ugo Nocentini (Italy); Miyako Tazaki (Japan); Elia Karam (Lebanon); Charles Pull (Luxembourg); Hans Wyirand Hoek (Netherlands); AO Odejide (Nigeria); José Luis Segura García (Peru); Radu Vrasti (Romania); José Luis Vásquez Barquero (Spain); Adel Chaker (Tunisia); Berna Ulug (Turkey); Nick Glozier (United Kingdom); Patrick Doyle, Katherine McGonagle, Michael von Korff (United States of America). A full list of collaborators is available at: http://www.who.int/icidh/whodas/

Funding:

The WHODAS 2.0 development was funded through the WHO/National Institutes of Health (NIH) Joint Project on Assessment and Classification of Disability (MH 35883–17).

Competing interests:

None declared.

References
