

Evaluating the Effectiveness of Stroke Rehabilitation: Choosing a Discriminative Measure
Kim A. Brock, PhD, Patricia A. Goldie, PhD, Kenneth M. Greenwood, PhD
ABSTRACT. Brock KA, Goldie PA, Greenwood KM. Evaluating the effectiveness of stroke rehabilitation: choosing a discriminative measure. Arch Phys Med Rehabil 2002;83:92-9.

Objective: To evaluate the discriminative ability of several measures of physical disability used to determine quality of outcome for poststroke rehabilitation.

Design: A comparative study, using Rasch analysis, of the discriminative ability of functional status and mobility measures in rehabilitation patients with stroke.

Setting: A 26-bed rehabilitation unit, on site of a tertiary teaching hospital in Melbourne, Australia.

Participants: A consecutive sample of 106 patients with acute stroke admitted for rehabilitation.

Interventions: Not applicable.

Main Outcome Measures: Rasch analysis of the motor subscale of the FIM™ instrument, Motor Assessment Scale, Functional Ambulation Classification, gait velocity, and gait endurance.

Results: The more difficult items of the FIM motor scale adequately discriminated among higher functioning patients. The gait velocity measure further distinguished 9% of the sample, who functioned at a higher level than could be indicated by FIM motor subscale. The other measures did not add levels of discrimination to that provided by the FIM motor. Ability estimates provided by Rasch analysis of the FIM motor scale were a more accurate indication of ability than raw scores. Raw scores underestimated change in ability observed at higher levels of ability.

Conclusion: Rasch estimates of the FIM motor subscale provide a discriminative measure for evaluating outcomes and change in ability achieved in stroke rehabilitation.

Key Words: Cerebrovascular disorders; Disability evaluation; Outcome assessment (health care); Rehabilitation.

© 2002 by the American Congress of Rehabilitation Medicine and the American Academy of Physical Medicine and Rehabilitation

From the Physiotherapy Department, St Vincent’s Hospital, Fitzroy, Melbourne, Australia (Brock); and Schools of Physiotherapy (Goldie) and Psychological Science (Greenwood), La Trobe University, Bundoora, Victoria, Australia.
Accepted in revised form March 7, 2001.
Presented at the Joint Annual Scientific Meeting of the Australasian Faculty of Rehabilitation Medicine, the British Society of Rehabilitation Medicine, and the International Association of Health Professionals in Rehabilitation, May 27-30, 1998, Sydney, Australia.
No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit upon the author(s) or upon any organization with which the author(s) is/are associated.
Reprint requests to Kim Brock, PhD, Physiotherapy Dept, St Vincent’s Hospital, Victoria Pde, Fitzroy, 3065 Melbourne, Australia, e-mail: brockka@svhm.org.au.
0003-9993/02/8301-6469$35.00/0
doi:10.1053/apmr.2002.27348

The search for appropriate outcome measures has been a focus of rehabilitation research for decades. Introducing benchmarking processes into health care, which facilitates comparisons of resource use and outcomes, has intensified discussion about valid methods for outcome evaluation. Without evidence of outcomes, measures of resource use may be used to justify spending reductions when “the true need is to maximize health benefits for the money spent.”1 Benchmarking processes for analyzing costs of delivering health care are more advanced than processes for determining outcomes of health care. Although this difference exists, changes in service delivery may be driven mainly by cost factors, with insufficient regard to the quality of outcomes associated with the changes.

A major issue in pursuing outcome-based benchmarking is the adequacy of the tool used to measure outcome. Tools used for this purpose must be reliable, valid, and discriminative. Examining discriminative ability is important to ensure that a chosen outcome measure is able to differentiate within the patient group and to identify meaningful differences in patients’ abilities.2 A discriminative scale will have items that span the range of patient ability, with appropriately spaced intervals. Failure to differentiate among patients at higher levels of function may produce inaccurate evaluations of the effectiveness of rehabilitation units.

Functional status measures such as the FIM™ instrument,3 which assess the assistance required for a person to complete daily living tasks, are the most widely used outcome measures. Concerns have been expressed about possible ceiling effects of these ordinal disability scales.4-8 Dodds et al4 argue that the FIM does not take into account elements such as speed, ease, and quality of task performance. Johnston et al9 argue that ordinal functional status measures are responsive to improvement in a limited part of the range only, with both floor and ceiling effects. For patients with less disability, Johnston suggests that measures of speed and endurance may be important. In light of these comments, it may be useful to compare the discrimination of the FIM motor subscale with other measures in the physical domain that include aspects such as speed and quality of task performance.

The present study considered outcome measures that reflect mobility function. In the rehabilitation of motor function, much emphasis is placed on the quality and ease of walking and of motor tasks such as moving from sitting to standing. This study investigates whether the motor subscale of the FIM adequately represents mobility function after rehabilitation.

The appropriateness of mathematically manipulating ordinal values of functional status measures, such as summing scores and calculating change values, has also come under question in the literature.10-12 Item response theory, using Rasch analysis, has been used to investigate the properties of ordinal scales, including the FIM. With Rasch analysis, ordinal scales are converted into interval measures by estimating the increase in ability for each level of the scale, for each of the items.12,13 Several studies have shown the validity of developing Rasch estimates of ability from the FIM.14-16 Linacre et al15 showed that the steps between scores on the FIM motor scale are not equal in level of difficulty. A change in ability at the top end of the scale was of greater magnitude when converted into Rasch estimates than a similar change in ability toward the middle of the scale. A 10-point change in raw data scores from 80 to 90, when converted to Rasch estimates, indicates 4 times as much

Arch Phys Med Rehabil Vol 83, January 2002


CHOOSING A DISCRIMINATIVE MEASURE, Brock 93

change as a change in raw scores from 40 to 50. This finding indicates that summed raw scores cannot be considered to be an interval scale for the FIM motor subscale. However, the rehabilitation field seems slow to adopt the use of Rasch estimates, and summed raw scores remain the more frequently reported indicator of outcome.

The purpose of the present study was to compare several outcome measures, using Rasch analysis, to distinguish which measure or combination of measures is the most discriminative for assessing physical disability at the end of intensive rehabilitation after stroke.

METHODS

Measures of Physical Ability

We examined 3 ordinal scales: the 13 items that make up the motor subscale of the FIM3 (FIM motor), the Motor Assessment Scale17 (MAS), and the Functional Ambulation Classification18 (FAC). The FIM motor assesses the level of assistance required to perform various activities of daily living. Extensive investigations of the FIM’s reliability and validity have provided evidence of its interrater and test-retest reliability,19 internal consistency,20,21 concurrent validity,22,23 and predictive validity.23-26

The MAS17 assesses 5 functional motor tasks (supine to side lie, supine to sitting, sitting, sit to stand, walking) by using criteria that address the quality of performance as well as the level of assistance required. The instrument’s quality aspects include symmetry, control, timing of movement, and use of the affected side. The upper limb components of the MAS were not considered in the present study. Interrater reliability,17,27,28 concurrent validity,28,29 and predictive validity30 have been shown for the MAS.

The FAC18 addresses walking ability relevant to community ambulation, such as ability to walk on rough ground, ramps, and over curbs. Limited reliability and validity evaluations have been undertaken, including interrater reliability31 and concurrent validity with other gait measures.18,31,32

We included 2 interval-scaled measures of walking ability, gait velocity and endurance, in the present study to compare the discriminative ability of ordinal scales and specific interval measures. Gait velocity is a widely used measure of function at the end of rehabilitation. Few stroke patients achieve normal gait velocity, with a recent study finding that 83% of stroke patients were still impaired in terms of gait speed at 3 months poststroke.33 Studies have provided evidence of interrater reliability,18,34,35 test-retest reliability,18,35,36 concurrent validity,18,31,32,37-41 and predictive validity.42 Previous studies36,43,44 have shown gait velocity to be a discriminative measure.

Few functional status measures include an assessment of endurance beyond the 50-meter requirement of the FIM. An audit of stroke outcomes undertaken by Hill et al45 showed that only 15% of patients could walk more than 500 meters at discharge. To date, there are no simple tests of endurance with established reliability and validity. We included a standardized test of endurance in the present study, requiring the subject to walk laps of a 50-meter circuit.

If the areas of function represented by the 3 ordinal scales and 2 interval measures are shown to be measuring the same domain (physical disability), then Rasch analysis could be used to compare difficulty levels of items in the disability scales and the gait measures. This comparison would then permit one to identify the measures that most adequately differentiate between various levels of ability at the higher end of the spectrum of functional ability, observed at the end of rehabilitation. A key advantage to using Rasch analysis in the evaluation of rating scales is the potential to include measures that do not cover the whole spectrum of ability but are more sensitive to a portion of the range.46 This provides a method for integrating measures, inclusive of measures that are not as discriminative but cover the range of ability, and measures that discriminate well, but only in a portion of the range.

Few studies have used Rasch analysis to undertake comparative analysis of the discriminative ability of scales. Fisher et al47 used it to compare the discriminative abilities of the FIM motor and 22 items from the motor competency components of the Patient Evaluation Conference System, including items assessing housework and meal preparation. Grimby et al48 combined the physical activities of the FIM with the Instrumental Activity Measure (IAM). The IAM assesses domestic tasks, such as cleaning and shopping, and community mobility.

The aim of using Rasch analysis to investigate outcome measures in the present study was to establish which measure, or combination of measures, would yield a single ability score for each patient that most accurately reflects mobility outcome. This score could represent either functional level at outcome or change in function between admission and discharge. Having established a discriminative measure, comparisons of outcome in relation to resource use could then be undertaken with greater confidence.

At the time of the study (1993–1995), the funding of rehabilitation services in Victoria was provided by block grants, allowing rehabilitation teams to set individual goals and programs for each patient, without being restricted by specific requirements of funding agencies. This approach permitted wide variation in length of stay (LOS) in the rehabilitation unit at St Vincent’s Hospital, to achieve high levels of outcome, with a large proportion of discharges to independent living in the community. During the study, few alternatives to inpatient care were available for intensive rehabilitation services in Victoria. Almost all patients in the present study had stopped participating in intensive rehabilitation at discharge; some patients had prolonged inpatient stays to achieve sufficient function to return home. Therefore, in this study, discharge from inpatient rehabilitation adequately represents the end point of intensive rehabilitation.

Subjects

All patients admitted to the St Vincent’s Hospital Rehabilitation Unit from 1993 to 1995 with a primary diagnosis of stroke, infarct, or hemorrhage were included in the study. Patients were excluded if they suffered another major incident at onset of stroke (eg, fracture, amputation), or had cerebrovascular accidents related to trauma. Patients were selected for admission to the rehabilitation unit according to criteria that included potential to return to some form of independent living and the ability to cope with an intensive rehabilitation program. The study was approved by the ethics committees of St Vincent’s Hospital and the La Trobe University Faculty of Health Sciences.

One hundred six patients were included in the study. The patients had an average age ± standard deviation of 68.7 ± 11.3 years (range, 36–91yr). Sixty-seven (63%) were men. Forty-nine patients (46%) had right-hemisphere lesions, 48 (45%) left-hemisphere lesions, and 9 patients (9%) had bilateral lesions. Eighteen patients (17%) had hemorrhagic strokes and 87 patients (82%) had infarcts. The pathology was unknown for 1 patient. Before admission, 27% of patients lived alone, 49% lived with a spouse, 17% with their family, and 7% lived in special accommodation houses or hostels. Table 1 shows median time from onset of stroke to admission to reha-




bilitation, LOS in rehabilitation, and median admission and discharge FIM motor scores. At the end of rehabilitation, 89% of patients returned home, 7% to semi-independent living, and 4% to nursing home care.

Table 1: Hospital LOS and Functional Status of Cohort

                                          Median   25th Percentile   75th Percentile   Range
LOS (d)
  Onset to admission                      11       8                 15.5              0–52
  Rehabilitation (d)                      28       16                68                3–274
Functional status (summed raw scores)
  Admission FIM motor                     67       51                80                18–91
  Discharge FIM motor                     86       81.5              89                32–91

Instruments

The FIM and MAS were assessed at admission and discharge. Additional mobility variables of gait speed, endurance, and rating on the FAC were assessed at discharge.

Procedure

Patients were assessed at admission and at discharge from rehabilitation. The FIM motor scores were evaluated according to the FIM guidelines by rehabilitation team members considering the patient’s performance over a 24-hour period. During the study period, training and examination of FIM reliability was performed at least twice yearly, and at least 10 staff on the Rehabilitation Unit achieved satisfactory reliability, as measured by the Uniform Data System for Medical Rehabilitation examination process, at any 1 time.

The MAS was assessed by 1 of 3 physiotherapist raters, all of whom had undergone reliability testing on the MAS through the School of Physiotherapy, University of Sydney, achieving satisfactory reliability scores of above 80%. Walking speed was assessed by the treating physiotherapist as unassisted, self-selected walking speed over the central 6 meters of a 10-meter walkway. The use of aids and splints customarily worn was allowed and recorded. The endurance test required the patient to walk laps of a 50-meter circuit outdoors. The patients were asked to “walk as many laps as you can, without tiring yourself, to a maximum of 10.” The FAC was rated by the treating physiotherapist.

Statistical Analysis

Discriminative ability was evaluated by examining the spread of scores for ceiling effects and through Rasch analysis. The use of Rasch analysis is dependent on the underlying variations in behavior being dominated by 1 dimension.12 Before subjecting the data to Rasch analysis, we applied principal components analysis49,50 to establish the unidimensionality of the measures used, for the measures individually and for combined data sets.

At the end of intensive rehabilitation, we calculated item estimates (level of difficulty of items) and case estimates (level of ability of persons tested) by using Rasch analysis for each functional measure. These estimates are measured as logits, mathematically defined units that are constant from 1 end of a continuum to the other. Comparing estimates allowed us to compare both the difficulty level of items on a scale and the ability level of patients. Rasch analysis also provides fit statistics, which indicate how the observed ratings vary from those predicted by the model. These fit statistics show the estimate’s internal validity and indicate the unidimensionality of the model.51

The stability of the measures over time was investigated to determine whether the relative level of difficulty of the items remained constant when assessed at different points in time.15 This attribute is essential to the validity of a measure used for quantifying change. The stability of FIM motor scores has been shown in several studies.15,16,52,53 The stability of the measures used in the present study was examined by using intraclass correlation coefficients (ICCs), as recommended by Chang and Chan.53

Rasch analysis does not provide case estimates for perfect scores. If a considerable portion of the sample achieves a perfect score, one must determine the level of ability this score indicates. Otherwise, this group’s achievements are not considered when assessing change. Case estimates for perfect scores on the FIM motor and MAS were established by adding a more difficult item from another measure to the functional scale and performing another Rasch analysis.

The items on the functional status measures that discriminated among higher level performances were then compared by entering them into a further Rasch analysis, the purpose of which was to identify the functional status measure, or combination of measures, that most adequately discriminated among higher level performances.

Data analysis was performed on a personal computer, using the computer software packages SPSS, version 6.0,a and QUEST.b

RESULTS

Spread of Scores

The measures were examined for ceiling effects at discharge. Table 2 shows the proportion of patients achieving the highest score for each of the measures and on the hardest item of the measure. The FAC had the strongest ceiling effect (table 2). The endurance test revealed that 39% of subjects were able to walk 500 meters on outdoor surfaces without a rest. In contrast, the FIM motor subscale had 16% and the MAS had 25% of persons scoring the highest score.

Table 2: Percentage of Patients Achieving the Highest Score for Each Measure

                           Total (%)   Most Difficult Item (%)
FIM motor                  16          29
MAS                        25          35
Gait velocity (grouped)    9
Endurance                  39
FAC                        46

Gait velocity has no ceiling effect. However, to facilitate comparison, the gait velocity variable was grouped into 8 classes, with the fastest walkers being those who walked at more than 70 m/min. This cutoff was chosen on the basis of a study54 investigating the gait velocity required to safely cross signalled road intersections in Melbourne. Patients with a self-selected gait velocity of 70 m/min have sufficient gait velocity to safely cross almost all signalled intersections in Melbourne, at their normal walking speed. Nine percent of patients achieved this optimum speed. This test showed the least ceiling effect.

Although the FAC and endurance showed large ceiling effects, these single-item tests may yield useful information in combination with other tests. The validity of combining gait




endurance, the FAC, and gait velocity into a variable representing gait function was explored.

Principal Components Analysis

Principal components analysis of FIM motor scores showed that the scale closely approached a unidimensional scale: 2 factors had eigenvalues above 1.0 (8.6, 1.2). The items of “bowel” function and “eating” loaded most strongly onto the second factor. Principal components analysis of MAS scores revealed 1 factor with an eigenvalue above 1.0. We entered the 3 items assessing gait (gait velocity, gait endurance, FAC score) into principal components analysis to establish their unidimensionality. One factor was revealed with an eigenvalue above 1.0. Because all the measures met the criteria of unidimensionality to a reasonable degree, the measures were further examined with Rasch analysis.

Rasch Analysis

Fig 2. Summed raw scores and Rasch estimates of FIM motor scores at discharge. (Rasch estimates and summed raw scores were rescaled to range between 0 and 100.)
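
The rescaling mentioned in the caption of figure 2 can be illustrated with a short sketch. The article does not state the formula used, so the linear min-max transformation below is an assumption, and the function name `rescale_to_100` and the sample logit values are hypothetical.

```python
from typing import List


def rescale_to_100(values: List[float]) -> List[float]:
    """Linearly map values so the minimum becomes 0 and the maximum 100.

    Assumption: a plain min-max transformation; the article does not say
    which rescaling formula was applied.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must span a nonzero range")
    return [100.0 * (v - lo) / (hi - lo) for v in values]


# Hypothetical Rasch case estimates in logits (not data from the study).
logits = [-2.0, 0.0, 1.0, 2.0]
print(rescale_to_100(logits))
```

Because such a transformation is linear, it preserves the relative spacing of the estimates, so differences along the rescaled range remain comparable.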
FIM motor. Rasch analysis was performed on the discharge FIM motor scores. One item, “bladder” function, showed serious misfit, as indicated by the infit statistic (2.61). Figure 1 shows the estimated difficulty of each level of the 13 items. Item levels that are scored similarly in the FIM motor may have markedly different weightings in the Rasch analysis.

Fig 1. Item threshold estimates for FIM motor scores at discharge. Abbreviations: Eat, eating; Groom, grooming; Bath, bathing; Dress up, dressing the upper body; Dress low, dressing the lower body; Toilet, toileting; Bed tr, bed transfers; Toilet tr, toilet transfers; Tub tr, tub transfers.

The FIM motor item estimates obtained in this study were compared with those quoted in the study by Linacre et al15 to examine how closely our small sample related to the very large sample of Linacre. Correlation of combined admission and discharge estimates yielded a Pearson’s r correlation of .87, indicating substantial similarity between the item estimates.

Sixteen percent of patients had a perfect score on the FIM motor. Review of case estimates revealed only 5 cases with infit t values beyond 2.0 or −2.0 that seriously misfit the model. The mean case estimate was 2.34 ± 1.56, indicating that the sample was biased toward cases with higher scores, as would be expected in the discharge scores of a selected rehabilitation sample. The item separation reliability51 was .85. Although the Rasch modeling shows some areas of concern, the overall model fit justified our continuing with the analysis.

Calculation of the ICC of each item’s level of difficulty at 2 weeks poststroke and at discharge revealed an ICC of .87. Chang and Chan53 stipulated a cutoff of an ICC value of .90 as an indication of the appropriateness of producing generalized item difficulty estimates from scores at both occasions. Although the present value falls short of that .90 threshold, it is close to the specified level. Because 3 previous studies16,52,53 found the FIM motor’s stability to be acceptable, we combined the admission and discharge scores in the present study. The item thresholds developed on the combined data set are used later in discussion of Rasch transformation of the FIM motor.

As previously discussed, Rasch techniques do not provide an estimate for perfect or zero scores. This is a problem for comparison of summed raw scores and Rasch estimates and for the calculation of change. A value for perfect scores was ascertained by adding another variable to the Rasch analysis. The only variable in the study that had less ceiling effect than the FIM motor was gait velocity (grouped as previously discussed). We performed a further Rasch analysis with the FIM motor variables and gait velocity. Before this, we assessed the unidimensionality of the combined FIM motor and gait velocity by using principal components analysis. Two factors were revealed with eigenvalues of 7.1 and 1.7, with the gait and transfer items loading heavily on the first factor and “eating” and “grooming” loading heavily onto the second factor. The fit statistics of the Rasch analysis showed an infit mean square value of 1.4 for gait velocity. Although the fit of gait velocity to the FIM scale was not optimum, it was considered reasonable to continue this analysis because it was the most accurate means of obtaining an estimate for perfect scores on the FIM motor.

The difference in Rasch case estimates between cases with raw scores of 90 and 91 revealed that the step size between 90 and 91 in raw scores was equivalent to 11% of the range of Rasch estimates. This finding indicates that a large improvement in ability is required to move from a score of 90 to 91, compared with an improvement of 1 point at other points of the scale. Calculating a Rasch estimate for perfect FIM scores enables more accurate comparison of Rasch estimates and summed raw scores.

The relationship between the summed raw scores and Rasch estimates at discharge displayed an ogival relationship at the upper end (fig 2). The correlation between summed raw scores and Rasch estimates was .92 at admission and .83 at discharge.

Rasch estimates and summed raw scores were compared as measures of change. The correlation between change as summed raw scores and Rasch estimates was .71 (fig 3). Figure 4 shows that for those subjects with low admission scores, summed raw scores tended to overstate the degree of change in ability, and for subjects with high admission scores, the summed raw scores tended to understate the degree of change in ability.
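
The ogival relationship and the unequal step sizes described above can be illustrated with a deliberately simplified stand-in for Rasch estimation. The sketch assumes that ability is approximated by the log-odds of the share of available raw-score points obtained on a 13 to 91 scale (13 items scored 1 to 7); this is not the procedure implemented in QUEST, and the function name `approx_logit` is hypothetical.

```python
import math


def approx_logit(raw: float, lowest: float = 13.0, highest: float = 91.0) -> float:
    """Log-odds of the share of available raw-score points obtained.

    Assumption: a crude approximation used only to show the curve's shape.
    It is undefined at the extreme scores, mirroring the fact that Rasch
    analysis gives no estimate for perfect or zero scores.
    """
    if not lowest < raw < highest:
        raise ValueError("raw score must lie strictly between the extremes")
    p = (raw - lowest) / (highest - lowest)
    return math.log(p / (1.0 - p))


# The same 10-point raw gain corresponds to different gains in log-odds.
mid_gain = approx_logit(50) - approx_logit(40)
top_gain = approx_logit(90) - approx_logit(80)
print(f"40 to 50: {mid_gain:.2f} logits; 80 to 90: {top_gain:.2f} logits")
```

Under this simplification, a raw gain near the top of the scale corresponds to a much larger gain in log-odds than the same raw gain near the middle, which is the qualitative pattern reported here. The exact figures in the article come from the fitted Rasch model, not from this approximation.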




Fig 3. Summed raw scores and Rasch estimates for change in FIM motor scores.

Fig 4. Summed raw scores and Rasch estimates of change in FIM motor scores for randomly selected cases. (Cases are ordered according to initial FIM motor score.)

Motor Assessment Scale. Rasch analysis was performed on the discharge MAS scores. One item, sitting, had an infit mean square value above 1.3 (1.45). Figure 5 shows the estimated difficulty of each level of the 5 items. As with the FIM motor, levels of items that are scored similarly in the MAS may have markedly different weightings in the Rasch analysis. Review of case estimates revealed only 2 cases with infit t values beyond 2.0 or −2.0. The mean case estimate was 1.94 ± 1.60. The item separation reliability was .70.

Fig 5. Item threshold estimates for the MAS at discharge.

Calculation of the ICC of the estimates at 2 weeks poststroke and discharge revealed an ICC of .77. This ICC is considerably lower than the cutoff of .90, stipulated by Chang and Chan53 as an indication of the appropriateness of producing generalized item difficulty estimates from scores at both occasions. The lack of stability of the thresholds of the items over time precludes combining admission and discharge data into a single data set, as was performed for the FIM motor. This finding, therefore, decreases the validity of the MAS Rasch estimates being used to measure change. For the present analysis, the MAS was used as a measure of ability at discharge only.

To obtain a value for subjects who had a perfect score, we added another variable to the Rasch analysis. Gait velocity was the variable of choice, having less ceiling effect than the MAS. A further Rasch analysis was performed with the MAS and gait velocity. Before this analysis, the unidimensionality of the MAS with gait velocity was explored with principal components analysis. The results were very similar to the results for the MAS alone, with only 1 factor having an eigenvalue above 1.0. A Rasch analysis was performed, adding gait velocity to the MAS. The fit statistics of the Rasch analysis showed an infit mean square value of .87 for gait velocity.

The difference in Rasch case estimates between cases scoring 29 and 30 on the MAS was calculated and converted into a percentage of the available range of Rasch estimates. Thus, it was revealed that the difference between 95 and 100 in raw scores was equivalent to 17% of the range of Rasch estimates. Cases with perfect scores were recoded to reflect this relationship, with a case estimate of 4.89. Rasch estimates were then recoded to range between 0 and 100 to facilitate comparison with summed raw scores.

The correlation between summed raw MAS scores and Rasch estimates at discharge was .92. Although the correlation is relatively high, summed raw scores understate the level of ability for the cases with the highest MAS scores.

Gait measures. Rasch analysis was performed on the gait measures. The infit mean square values of the 3 items indicate that no items misfitted the model. Figure 6 shows the estimated difficulty of each level of the 3 items. Ten cases had perfect scores at discharge. Review of case estimates revealed only 3 cases with infit t values beyond 2.0 or −2.0. The mean case estimate was 1.43 ± 1.83. The item separation reliability was .83.

Fig 6. Item threshold estimates for gait measures.

Comparison of the Discriminative Properties of the Functional Status Measures

The discriminative properties of selected items from the FIM motor, the MAS, and the gait measures were compared by




Table 3: Items Chosen for Comparative Analysis Table 4: Distribution of Scores for FIM Motor Walking and Stairs

Measure Item FIM Walking FIM Stairs n*

FIM motor Stairs 7 7 29


Bathing 7 6 22
Tub transfer 6 6 19
Walking 6 5 6
MAS Walking 5 5 5
Sitting 5 4 0
Gait Velocity 4 or ⬍4 4 or ⬍4 9
Endurance
* For simplification, this table does not include 11 patients who had
more than a 2-point difference between the FIM walking and FIM
stairs item.

using Rasch analysis. Because the sample size of this study precluded including too many items within any 1 analysis, we considered only those items of the measures that had discriminated most effectively at higher levels of ability. We identified the items with high-item thresholds in the Rasch analysis for each measure and entered them into a combined Rasch analysis. The items included in the combined Rasch analysis are shown in table 3.

Principal components analysis of the items revealed 2 factors with eigenvalues above 1.0. These factors had eigenvalues of 6.31 and 1.29, explaining 57.4% and 11.8% of the variance. Although the scale revealed 2 factors greater than 1, scree testing showed that the second factor represented the point at which a line drawn through the points changed direction. Subsequently, we continued with Rasch analysis because the scale closely approaches a unidimensional scale.

Rasch analysis was performed on the combined gait and functional status measures at discharge. Two items were found to misfit the model, MAS sitting (1.44) and endurance (1.39). Two patients had perfect scores at discharge. Review of case estimates revealed 9 cases with infit t values beyond 2.0 or −2.0. The mean case estimate was 2.46 ± 2.35. The item separation reliability was .90. Figure 7 shows the item threshold estimates for each of the items and the number of cases at the upper scoring levels of each scale.

Gait velocity showed the highest level of difficulty, at 4.84 logits (see fig 7). The next most difficult item thresholds, at 3.46 and 3.33 logits, were the next level down for walking velocity and the highest level of the FIM “stairs” item. The FIM motor items of “stairs” and “walking” provided a good spread of item thresholds at higher levels of ability. Table 4 shows the distribution of scores for these 2 FIM motor items, showing the capacity of these items to discriminate within the sample. The FIM motor items of “bathing” and “tub transfer” also differentiated among cases at this level. The MAS “walking” item (highest level threshold at 2.90 logits) discriminated nearly as well as the FIM motor, but the other items of the MAS contributed little to the discriminative ability of the scale. The endurance variable’s highest threshold level was 2.52 logits, well below that of other variables.

DISCUSSION

Examining the spread of scores for the measures, we found wide variation in the ceiling effects. Those for endurance and the FAC showed marked ceiling effects, with 39% and 46% of subjects achieving the highest scores. The ceiling effect of the FIM motor was much less pronounced, with only 16% achieving a perfect score. The MAS was shown to have a mild ceiling effect, with 25% achieving a perfect score at discharge. The gait velocity variable showed minimal ceiling effects with only 9% of patients achieving the highest level.

In stroke rehabilitation, a small proportion of patients recover with very little residual disability. Arguably, a ceiling effect of 16% of cases, to use the FIM motor as an example, is an acceptable level because there may be little need to discriminate among patients making a near perfect recovery. It is also possible that the disability in these cases fell in other domains, such as cognition and/or communication. If that is so, then there is little point in attempting to increase the discriminatory power of a measure of physical disability.
However, examination of ceiling effects alone is insufficient
to show the suitability of a measure for evaluating higher level
outcomes. A measure may have a low ceiling effect, with
inclusion of a difficult item. The “step” size between the most
difficult and the next level down the scale is equally important.
If large gaps exist between the upper levels, the measure may
not discriminate adequately among the population. Rasch anal-
ysis investigates the “size of the steps.”
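As a rough illustration of step size, the sketch below takes the upper item thresholds reported in the results (in logits) and computes the gap between each threshold and the next one down. The function name `step_gaps` and the dictionary labels are ours, introduced only for illustration. Pooling thresholds from different items into one ordered list is a simplifying assumption and is not the analysis performed in the study.

```python
def step_gaps(thresholds):
    """Return (upper_label, lower_label, gap) for adjacent thresholds, hardest first."""
    ordered = sorted(thresholds.items(), key=lambda kv: kv[1], reverse=True)
    gaps = []
    for (upper, hi), (lower, lo) in zip(ordered, ordered[1:]):
        # Round to 2 decimals, the precision at which the thresholds are reported.
        gaps.append((upper, lower, round(hi - lo, 2)))
    return gaps


# Upper thresholds reported in the results, in logits.
# The labels are illustrative names, not identifiers from the study.
reported = {
    "gait velocity (top level)": 4.84,
    "gait velocity (next level down)": 3.46,
    "FIM stairs (top level)": 3.33,
    "MAS walking (top level)": 2.90,
    "endurance (top level)": 2.52,
}

for upper, lower, gap in step_gaps(reported):
    print(f"{upper} -> {lower}: {gap:.2f} logits")
```

On these figures the gap between the top gait velocity threshold and the next level down (4.84 − 3.46 = 1.38 logits) is considerably larger than the gaps below it, which is the kind of upper-level gap that a ceiling percentage alone does not reveal.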
Rasch analysis showed that the FIM motor items represented
varying levels of difficulty. Step size tended to increase with
level of difficulty (see fig 1), and the step between 90 and 91
represented 11% of the scale. This value is similar to that
provided by Fiedler,55 in which the difference between 90 and
91 for summed raw scores represented 10% of the range of
Rasch estimates.
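The 11% figure can be read as a ratio of distances on the logit scale. The expression below is a sketch of that reading, not a formula given in the study: $\hat\theta(r)$ is our notation for the Rasch estimate corresponding to summed raw score $r$, and $\hat\theta_{\min}$ and $\hat\theta_{\max}$ are assumed to be the finite estimates assigned to the lowest and highest raw scores.

```latex
\[
\text{share}_{90 \to 91}
  = \frac{\hat\theta(91) - \hat\theta(90)}{\hat\theta_{\max} - \hat\theta_{\min}}
  \approx 0.11
\]
```

If raw scores were equal-interval, each single-point step on the 13-to-91 range would occupy about 1/78, or roughly 1.3%, of the scale, so a share of 11% for one step indicates how strongly the top of the raw scale is compressed.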
Fig 7. Item threshold estimates for the combined measures. For the higher levels of each item, the number of patients with scores at each level is provided, to facilitate comparing the discriminative abilities of items at the upper end of the spectrum of ability.

The relationship between the summed raw scores and Rasch estimates at discharge was nonlinear, as previously shown by Linacre et al.15 The upper levels of the FIM motor require greater steps in level of ability to achieve the next level than are required at the lower levels. The correlation between summed raw scores and Rasch estimates for

Arch Phys Med Rehabil Vol 83, January 2002


98 CHOOSING A DISCRIMINATIVE MEASURE, Brock

change was .71. Change in the lower part of the disability spectrum tends to be overstated by the summed raw scores and change in the upper part of the disability spectrum tends to be understated by the summed raw scores. Summed raw scores and Rasch estimates are not readily comparable for calculating change. To accurately represent level of ability or change in ability, Rasch estimates are preferable to summed raw scores.

Rasch analysis of the MAS at discharge showed acceptable model fit. For this scale, the difference between the perfect score and the next level down occupied 17% of the range of Rasch case estimates. Summed raw scores and Rasch estimates were compared for MAS at discharge, yielding a correlation of .92. We found a nonlinear relationship for scores in the upper end of the scale. The use of Rasch case estimates, in preference to summed raw scores, is recommended for the MAS at discharge to evaluate accurately ability level.

Comparing item thresholds for admission and discharge scores on the MAS revealed an ICC of .77. This correlation was well below the cutoff of .90 recommended by Chang and Chan.53 This finding suggests that it is inadvisable to calculate change scores from the MAS because the difficulty level of the items did not show stability over time. This is a major problem for the use of this scale. Further development of the scale, identifying and rectifying those item levels that do not maintain a constant level of difficulty at admission and discharge, may improve the stability of the MAS.

Rasch analysis of gait velocity, endurance, and the FAC at discharge showed acceptable model fit. The case estimates developed provide a composite score representing the patient’s walking ability in terms of speed, endurance, level of assistance required, and ability to walk on stairs, inclines, and nonlevel surfaces (FAC).

Comparing items from the FIM, MAS, and the gait measures by using Rasch analysis, we found that gait velocity had the strongest discriminative ability among patients with higher level abilities. However, this single item does not extend to lower levels of ability. The item is not appropriate for patients who cannot walk without assistance at discharge from rehabilitation. The next item to discriminate well among high-level cases was FIM “stairs.” This item requires patients to go up and down 12 to 14 stairs, without use of an aid or stair rail, safely and in a timely manner. That the FIM motor scale achieved the third highest level of difficulty argues against a ceiling effect for this scale. This finding adds weight to the contention by Linacre15 that the ceiling effect of the FIM is apparent rather than real, and that use of Rasch converted scores clarifies the true magnitude of the raw score differences. The 4 FIM motor items of “stairs,” “walking,” “bathing,” and “tub transfer” discriminated well throughout the spectrum of ability at discharge.

Although the MAS item of “walking” was a good discriminator, the other variables of the MAS were less useful. The threshold level of MAS “walking” was lower than the threshold level of the FIM “stairs” item. This is curious because both items require the patient to go up and down 12 stairs without a rail or a walking aid, with a specific time limit applied to the MAS. The main difference between the tests is that the MAS item is tested on a single occasion, whereas the FIM is rated on 24-hour performance, with safety as a consideration. Patients may successfully complete the MAS test under supervision, though their therapist may not rate them as safe on stairs without supervision.

At the time of data collection in this study (1993–1995), most rehabilitation services in Australia were delivered in the inpatient setting. Long hospital LOS for the more severely disabled stroke patients were not uncommon. In recent years, other options for intensive rehabilitation services have been developed, such as “same day” rehabilitation (an intensive outpatient rehabilitation program) and home-based rehabilitation. Although the present data set is composed solely of inpatient data, it is arguable that outcome measurement should be performed at discharge from intensive rehabilitation, no matter how those services are delivered.

CONCLUSION

The FIM motor subscale proved to be the most suitable measure for evaluating mobility outcomes, as either level of ability at discharge, or as change in ability from admission to discharge. Comparison of measures did not support the need to add additional items to the FIM motor to prevent ceiling effects. Rasch analysis of the FIM motor showed that summed raw scores did not show the characteristics of interval measures. Small increments of change in raw scores at the higher levels of the scale, which may be interpreted as indicating little progress, actually denote far more substantial change. For an accurate indication of ability level, Rasch estimates should be used, particularly for ability level at discharge and for the calculation of change scores. The use of the conversion table of summed raw scores to Rasch estimates of ability calculated by Fiedler55 provides a simple and valid means of accurately comparing levels of ability at the end of intensive rehabilitation. The present study supports the use of the FIM motor, with Rasch-transformed item difficulty and patient ability levels as a discriminative outcome measure of physical disability. The use of this outcome evaluation model would permit benchmarking of stroke rehabilitation services according to standards of outcome, as well as according to costs incurred in stroke rehabilitation.

References
1. Hungate RW. Purchaser quality measures: progressing from wants to needs. Jt Comm J Qual Improv 1994;20:381-6.
2. American Education Research Association and National Council on Research in Education. Standards for educational and psychological tests. Washington (DC): American Psychological Association; 1985.
3. Center for Functional Assessment Research and the Uniform Data System for Medical Rehabilitation. Guide for use of the Uniform Data Set for Medical Rehabilitation including the Functional Independence Measure (FIM) version 3.1. Buffalo (NY): State Univ New York; 1990.
4. Dodds TA, Martin DP, Stolov WC, Deyo RA. A validation of the Functional Independence Measurement and its performance among rehabilitation inpatients. Arch Phys Med Rehabil 1993;74:531-6.
5. Nissen MI. Emphasis placed by Cedar Court Physical Rehabilitation Hospital on the functional independence measure as the overall index for successful rehabilitation. Aust Clin Rev 1989;9:36-8.
6. Pollock C, Freemantle N, Sheldon T, Song F. Methodological difficulties in rehabilitation research. Clin Rehabil 1993;7:63-72.
7. Tsuji T, Sonada S, Domen K, Saitoh E, Liu M, Chino N. ADL structure for stroke patients in Japan based on the Functional Independence Measure. Am J Phys Med Rehabil 1995;74:432-8.
8. Wade DT, Skilbeck CE, Hewer RL. Predicting Barthel ADL score at 6 months after an acute stroke. Arch Phys Med Rehabil 1983;64:24-8.
9. Johnston MV, Findley TW, DeLuca J, Katz RT. Research in physical medicine and rehabilitation: XII. Measurement tools with application to brain injury. Am J Phys Med Rehabil 1991;70:40-56.
10. Merbitz C, Morris J, Grip JC. Ordinal scales and foundations of misinference. Arch Phys Med Rehabil 1989;70:308-12.
11. Silverstein B, Fisher WP, Kilgore KM, Harley JP, Harvey RF. Applying psychometric criteria to functional assessment in medical rehabilitation: II. Defining interval measures. Arch Phys Med Rehabil 1992;73:507-18.
12. Wright BD, Linacre JM. Observations are always ordinal; measurements, however, must be interval. Arch Phys Med Rehabil 1989;70:857-60.
13. Fisher AG, Bryze KA, Granger CV, et al. Applications of conjoint measurement to the development of functional assessments. Int J Educ Res 1994;21:579-93.
14. Heinemann AW, Linacre JM, Wright BD, Hamilton BB, Granger C. Relationships between impairment and physical disability as measured by the Functional Independence Measure. Arch Phys Med Rehabil 1993;74:566-73.
15. Linacre JM, Heinemann AW, Wright BD, Granger CV, Hamilton BB. The structure and stability of the Functional Independence Measure. Arch Phys Med Rehabil 1994;75:127-32.
16. Wright BD, Linacre JM, Heinemann AW. Measuring functional status in rehabilitation. Phys Med Rehabil North Am 1993;4:475-91.
17. Carr JH, Shepherd RB, Nordholm L, Lynne D. Investigation of a new motor assessment scale for stroke patients. Phys Ther 1985;65:175-9.
18. Holden MK, Gill KM, Magliozzi MR, Nathan J, Piehj-Baker L. Clinical gait assessment in the neurologically impaired: reliability and meaningfulness. Phys Ther 1986;64:1530-9.
19. Ottenbacher KJ, Hsu Y, Granger CV, Fiedler RC. The reliability of the Functional Independence Measure: a quantitative review. Arch Phys Med Rehabil 1996;77:1226-32.
20. Stineman MG, Jette A, Fiedler R, Granger C. Impairment-specific dimensions within the Functional Independence Measure. Arch Phys Med Rehabil 1997;78:636-43.
21. Stineman MG, Shea JA, Jette A, et al. The Functional Independence Measure: tests of scaling assumptions, structure, and reliability across 20 diverse impairment categories. Arch Phys Med Rehabil 1996;77:1101-8.
22. Granger CV, Cotter AC, Hamilton BB, Fiedler RC. Functional assessment scales: a study of persons after stroke. Arch Phys Med Rehabil 1993;74:133-8.
23. Oczkowski WJ, Barreca S. The Functional Independence Measure: its use to identify rehabilitation needs in stroke survivors. Arch Phys Med Rehabil 1993;74:1291-4.
24. Heinemann AW, Linacre JM, Wright BD, Hamilton BB, Granger C. Prediction of rehabilitation outcomes with disability measures. Arch Phys Med Rehabil 1994;75:133-43.
25. Stineman MG, Escarce JJ, Goin JE, Hamilton BB, Granger DV, Williams SV. A case-mix classification system for medical rehabilitation. Med Care 1994;32:366-79.
26. Stineman MG, Goin JE, Granger CV, Fiedler R, Williams SV. Discharge motor FIM–function related groups. Arch Phys Med Rehabil 1997;78:980-5.
27. Loewen SC, Anderson BA. Reliability of the Modified Motor Assessment Scale and the Barthel Index. Phys Ther 1988;68:1077-81.
28. Poole JL, Whitney SL. Motor assessment scale for stroke patients: concurrent validity and interrater reliability. Arch Phys Med Rehabil 1988;69:195-7.
29. Malouin F, Pichard L, Bonneau C, Durand A, Corriveau D. Evaluating motor recovery early after stroke: comparison of the Fugl-Meyer assessment and the Motor Assessment Scale. Arch Phys Med Rehabil 1994;75:1206-12.
30. Loewen SC, Anderson BA. Predictors of stroke outcome using objective measurement scales. Stroke 1990;21:78-81.
31. Holden MK, Gill KM, Magliozzi MR. Gait assessment for neurologically impaired patients. Phys Ther 1984;66:35-40.
32. Roth EJ, Merbitz CT, Grip JC, et al. The timer-logger-communicator gait monitor: recording temporal gait parameters using a portable computerized device. Int Disabil Stud 1990;12:10-6.
33. Mayo NE, Wood-Dauphinee SA, Gordon C, Higgins J, McEwen S, Salbach N. Disablement following stroke. Disabil Rehabil 1999;21:258-68.
34. Evans M, Goldie P, Hill K. Systematic and random error in repeated measurements of temporal and distance parameters of gait following stroke. Arch Phys Med Rehabil 1997;78:725-9.
35. Wade DT, Wood VA, Heller A, Maggs J, Hewer RL. Walking after stroke. Scand J Rehabil Med 1987;19:25-30.
36. Collen FM, Wade DT, Bradshaw CM. Mobility after stroke: reliability of measures of impairment and disability. Int Disabil Stud 1990;12:6-9.
37. Bohannon RW. Gait performance of hemiparetic stroke patients: selected variables. Arch Phys Med Rehabil 1987;68:777-81.
38. Brandstater ME, de Bruin H, Gowland C, Clark BM. Hemiplegic gait: analysis of temporal variables. Arch Phys Med Rehabil 1983;64:583-7.
39. Corcoran PJ, Jebsen RH, Brengelmann GL, Simons BC. Effects of plastic and metal leg braces on speed and energy cost of hemiparetic ambulation. Arch Phys Med Rehabil 1970;51:69-77.
40. Dettman MA, Linder MT, Sepic SB. Relationships among walking performance, postural stability, and functional assessments of the hemiplegic patient. Am J Phys Med Rehabil 1987;66:77-90.
41. Lehmann JF, Condon SM, Price R, deLateur BJ. Gait abnormalities in hemiplegia: their correction by ankle foot orthoses. Arch Phys Med Rehabil 1987;68:763-71.
42. Friedman PJ. Gait recovery after hemiplegic stroke. Int Disabil Stud 1991;12:119-22.
43. Richards CL, Malouin F, Dumas F, Tardif D. Gait velocity as an outcome measure of locomotor recovery after stroke. In: Craik RL, Oatis CA, editors. Gait analysis: theory and application. St Louis: Mosby; 1995. p 355-64.
44. Goldie PA, Matyas TA, Evans OM. Deficit and change in gait velocity during rehabilitation after stroke. Arch Phys Med Rehabil 1996;77:1074-82.
45. Hill K, Ellis P, Bernhardt J, Maggs P, Hull S. Balance and mobility outcomes for stroke patients: a comprehensive audit. Aust J Physiother 1997;43:173-80.
46. Velozo CA, Kielhofner G, Lai J. The use of Rasch analysis to produce scale-free measurement of functional ability. Am J Occup Ther 1999;53:83-90.
47. Fisher WP, Harvey RF, Taylor P, Kilgore KM, Kelly CK. Rehabits: a common language of functional assessment. Arch Phys Med Rehabil 1995;76:113-22.
48. Grimby G, Andren E, Daving Y, Wright B. Dependence and perceived difficulty in daily activities in community living stroke survivors two years after stroke. Stroke 1998;29:1843-9.
49. Silverstein B, Kilgore KM, Fisher WP, Harley JP, Harvey RF. Applying psychometric criteria to functional assessment in medical rehabilitation: 1. Exploring unidimensionality. Arch Phys Med Rehabil 1991;72:631-7.
50. Weiss DJ, Yoes ME. Item response theory. In: Hambleton RK, Zaal JN, editors. Advances in educational and psychological testing: theory and applications. Boston: Kluwer Academic; 1991. p 69-95.
51. Wright BD, Masters GN. Rating scale analysis. Chicago: Mesa Pr; 1982.
52. Grimby G, Gudjonsson G, Rodhe M, Sunnerhagen KS, Sundh V, Ostensson ML. The Functional Independence Measure in Sweden: experience for outcome measurement in rehabilitation medicine. Scand J Rehabil Med 1996;28:52-62.
53. Chang W-C, Chan C. Rasch analysis for outcomes measures: some methodological considerations. Arch Phys Med Rehabil 1995;76:934-9.
54. McGinley J. Criteria for community ambulation: implications for stroke rehabilitation [dissertation]. Melbourne (Aust): La Trobe Univ; 1991.
55. Fiedler R. The Rasch measurement model. UDS Update 1993;7:1-10.

Suppliers
a. SPSS Inc, 233 S Wacker Dr, 11th Fl, Chicago, IL 60606.
b. Adams R, Khoo S. QUEST: the interactive test analysis system. Melbourne (Aust): Australian Council for Educational Research; 1994.
