
RELIABILITY OF DETECTION OF LUMBAR LATERAL SHIFT

Helen A. Clare, MAppSc,a Roger Adams, PhD,b and Christopher G. Maher, PhDc

ABSTRACT
Background and Purpose: The poor reliability of lateral shift detection has been attributed to lack of rater
training, biologic variation, and test reactivity. This study aimed to remove the potential confounding arising from
biologic variation and test reactivity and to control the level of rater experience/training in making judgments of lateral
shift.
Subjects: One hundred forty-eight raters with 3 levels of clinical physical therapy experience and training in the
McKenzie method participated.
Method: The raters viewed photographic slides of 45 patients with low back pain. Slides were judged on a
numerical scale for presence and direction of a shift. Intrarater reliability was evaluated using the intraclass
correlation coefficient (ICC) and interrater reliability was evaluated using both the ICC and the κ statistic.
Results: Reliability of shift judgments was only moderate for all groups (eg, ICC [2,1] values ranged from 0.48 to
0.64).
Conclusion: Lateral shift judgements have only moderate reliability, even when trained raters judge stable stimuli.
We propose that the photo model employed can be used to explore the source of error in this process. (J
Manipulative Physiol Ther 2003;26:476-80)
Key Indexing Terms: Low Back Pain; Lumbar Spine; Lateral Shift; Reliability of Testing; McKenzie Method

INTRODUCTION
A recent survey of physical therapists in the United
States1 reported that a McKenzie evaluation was
one of the most common evaluations performed for
patients with low back pain (LBP) and that almost half the
therapists viewed the McKenzie method as the most useful
management approach for low back pain. Similar results
have been reported for British and Irish physiotherapists.2,3
The method has received support as an effective LBP
treatment in a systematic review of activity prescription for
back pain4 and also in Danish clinical practice guidelines,5
based on the 2 existing clinical trials.6,7 Subsequent to the
completion of both reviews, Cherkin et al8 published a
clinical trial evaluating 3 approaches (chiropractic manipulation, McKenzie therapy, and an educational booklet) and

a Private practice of physiotherapy, Sydney, Australia, and PhD candidate, School of Physiotherapy, The University of Sydney, Sydney, Australia.
b Senior lecturer, School of Physiotherapy, The University of Sydney, Sydney, Australia.
c Associate Professor, School of Physiotherapy, University of Sydney, Sydney, Australia.
Submit requests for reprints to: Helen Clare, PT, MAppSc, 16
Ayres Road, St Ives NSW 2075, Sydney, Australia (e-mail:
clare@magna.com.au).
Paper submitted June 6, 2002.
Copyright 2003 by National University of Health Sciences.
0161-4754/2003/$30.00 0
doi:10.1067/S0161-4754(03)00104-0


found chiropractic manipulation and McKenzie therapy to
have similar effects and costs. However, both treatments
provided only marginally better outcomes than an educational booklet.8 In this environment, further information
about the use of the basic criteria in the method is needed.
The principal aim of the McKenzie assessment is to first
determine those suitable for treatment with this approach.
Suitable patients must fit one of 3 syndromes: postural,
dysfunction, or derangement.9 The derangement syndrome
is further divided into 7 subsyndromes on the basis of pain
location, the behavior of the pain in response to the application of repeated spinal movements, and on the presence or
absence of deformities including a lateral shift. Because
classification determines the specific treatment used by the
treating clinician, accurate classification is believed essential for the effective management of the LBP patient.
In employing the method, the presence of a lateral shift is
determined by visual inspection at the time the patient's
posture is evaluated. If a lateral shift is deemed to be
present, lateral glide movements are performed to assess whether
these alter the patient's symptoms. Where this is the case, the
shift is classified as relevant and directs the initial treatment approach.9 The initial step of detecting a shift is of
paramount importance, because the relevance of a shift is determined only once a shift has been identified.
A lateral shift is defined as a lateral displacement of the
trunk in relation to the pelvis.9 The prevalence of a lateral

Journal of Manipulative and Physiological Therapeutics


Volume 26, Number 8

shift has proved hard to establish, probably because of the
problems with measurement of this attribute. Porter and
Miller10 suggest that it is an uncommon feature, citing a
prevalence of 5.6%; however, later studies report approximate prevalences of 20%11 and 80%.12
The reliability of therapists in determining the presence
of a lateral shift has been evaluated in 6 studies to date. In
the Kilby et al13 study, 2 physiotherapists with some training in the McKenzie method simultaneously evaluated 41
patients. There was only 55% agreement on the presence or
absence of a lateral shift, a value similar to the findings of
Nelson et al,14 who reported that the detection of lumbar tilt
(lateral shift) had high interobserver error. However, these
studies did not provide κ values, and there was insufficient
data to allow calculation of this statistic.
Riddle and Rothstein11 examined the intertester reliability of assessments of LBP patients made by physical therapists using the McKenzie method. They also aimed to
determine whether training in the McKenzie method influenced reliability. Forty-nine physical therapists from 8 clinics examined 363 patients. Sixteen of the therapists had
attended at least 1 postgraduate course in the McKenzie
method. The paired assessments were completed consecutively, with a time interval between examinations. They
found a high error rate in the determination of the presence
of a lateral shift (60% agreement, κ = 0.26) and concluded
that this was a possible source of error in the determination
of the syndrome classifications.
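
To see why 60% raw agreement can sit alongside such a low chance-corrected value, the standard definition of Cohen's κ can be inverted to recover the chance agreement it implies. The sketch below is illustrative only: it assumes the 0.26 quoted above is a κ value, and the function names are ours rather than anything used in the cited study.

```python
def cohen_kappa(p_observed: float, p_expected: float) -> float:
    """Cohen's kappa: agreement beyond chance, scaled by the maximum possible."""
    return (p_observed - p_expected) / (1.0 - p_expected)


def implied_chance_agreement(p_observed: float, kappa: float) -> float:
    """Invert the kappa formula to recover the chance agreement it implies."""
    return (p_observed - kappa) / (1.0 - kappa)


# Figures quoted in the text: 60% agreement, kappa assumed to be 0.26.
p_e = implied_chance_agreement(0.60, 0.26)
print(round(p_e, 3))
```

Under these assumptions, a little under half of the observed agreement would be expected by chance alone, which is why percent agreement by itself overstates reliability.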
Donahue et al12 attempted to improve the reliability of the
determination of the presence and direction of a lateral shift
by using a simple measuring device, but the reported κ
value for the decisions indicated very poor reliability.
McLean et al15 investigated 3 different techniques for measuring trunk list and concluded that the use of a plumb line
provided the most reliable measures; however, there was no
summary reliability statistic reported to allow comparison to
other studies.
Improved reliability in determining the presence of a
lateral shift (78% agreement, κ = 0.52) was demonstrated
by Razmjou et al16 for therapists observing the same patient
assessment. The 2 physical therapists involved in this study
were both trained extensively in the McKenzie method and
assessed the patients simultaneously in an attempt to reduce
the error related to repeated examinations. They visually
determined whether a lateral shift deformity was present for
each patient.
Based on the research to date, it remains unclear whether
a lateral shift can be detected with acceptable reliability.
The measuring devices used to date do not seem to improve
reliability, and the reliability estimates are in the range poor
to moderate. It is therefore worthwhile to explore the source
of disagreement.
Two hypotheses have been offered for the poor reliability
observed:

Clare, Adams, and Maher


Reliability of Shift Detection

1. The attribute is inherently unstable and changes with
repeated examination.16
2. The attribute is subtle, and clinical experience and
training are necessary to reliably measure a lateral
shift.17
One way to explore the first hypothesis is to use a model
of clinical practice that allows for greater control than
would be possible in the clinic, for example, the use of
photographs as the stimuli to be rated rather than real
patients. This method avoids the potentially confounding
effect of the biologic variation of the shift, allows for an
unlimited number of repetitions of the same stimuli, and
also allows for a much larger panel of raters than is practical
in a traditional clinical reliability study.
To explore the effect of clinical experience and training,
we selected a cross section of raters, including first-year
undergraduate students, graduate physiotherapists with no
formal training in the McKenzie method, and graduate
physiotherapists with a minimum of 70 hours training in the
McKenzie method.
The aims of the study were to investigate:
1. the intrarater/interrater reliability of judgements of lateral shift made from inspection of photographs of
patients with low back pain.
2. whether interrater reliability and discriminability were
influenced by level of education in the McKenzie
method.

METHOD
Project Overview
The design of the experiment required raters to inspect a
set of photographic slides of patients with low back pain and
to judge whether a shift was present. The photographs of the
patients had been taken by the first author on the same day
that she performed a full clinical examination of these
patients. On the same visit, demographic and clinical data
were recorded for each patient.

Subjects
Patients with low back pain. Patients attending a private physiotherapy clinic for low back pain were invited to participate
in the study. The criteria for inclusion were that they were
currently experiencing low back pain with or without radiation to the leg.
All subjects gave written consent prior to participating.
Information was collected from the subjects regarding their
gender, age, weight, height, location of symptoms, duration
of symptoms, working status, previous history of LBP, pain
intensity, frequency, and functional status (Table 1).
Raters. The raters consisted of:
60 first-year undergraduate physical therapy students
with no clinical experience or training in the McKenzie
method.


Journal of Manipulative and Physiological Therapeutics


October 2003

Table 1. Subject characteristics

Characteristic                      Value
Number of subjects                  45
Age (y)                             50.6 (14)
Height (cm)                         164 (12)
Weight (kg)                         73 (15)
Pain intensity (VAS cm)             5.6 (1.7)
Quebec Disability score             47.3 (19)
Female gender                       58%
Past LBP                            87%
Frequency of pain (% constant)      51%
Duration of symptoms
  Acute (<7 days)                   18%
  Subacute (7 days - 7 weeks)       31%
  Chronic (>7 weeks)                51%
Radiation into leg                  56%
Radiation below the knee            27%
Working normal duties               44%

Data for continuous variables are mean values with SDs in parentheses;
categorical variables are percentages.

46 graduate physical therapists with some clinical experience but no formal training in the McKenzie
method.
42 graduate physical therapists who had clinical experience and had completed a minimum of 70 hours of
formal training in the McKenzie method.

Procedure
Investigator HC conducted a complete clinical examination of each patient and then asked the patient to stand
within a doorway with their back toward a camera (Canon
EOS 3000 88, Tokyo, Japan). The camera was placed on a
tripod 3 meters from the subjects, set on auto focus, and
could be triggered from a distance. A photograph was then
taken immediately, and the patient resumed their normal
treatment.
The photographs were converted into slides and duplicates were made, which resulted in 90 slides of the 45
patients. These were randomly positioned in a slide tray so
that the order of the second set of slides varied from the first
set. The slides were shown to the 3 sets of raters. The
instructions given to the raters were that they were to
determine the presence or not of a lumbar lateral shift. They
were read the following:
McKenzie 1981 defines a lateral shift as when the top half of
the patient's body has moved laterally in relation to the bottom
half.

The assessors were provided with a data collection form
and were instructed that for each subject slide they were
required to make 2 determinations. The first determination
consisted of 1 of 3 choices: left lateral shift present, shift
absent, right lateral shift present. The second determination
required them to indicate the level of certainty of the first
determination by rating it either certain or uncertain. The
assessors were instructed not to share their views about each
slide with others. The data sheets were collected, and the
information was entered for analysis.
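
The slide ordering described in this section can be sketched in a few lines. This is a minimal illustration and not the authors' procedure: it assumes the tray held one shuffled pass through the 45 patients followed by a second pass in a different shuffled order, and the function name, the seed, and the integer patient labels are all hypothetical.

```python
import random


def slide_order(n_patients: int = 45, seed: int = 0) -> list[int]:
    """Return a tray order: two shuffled passes through the same patients."""
    if n_patients < 2:
        raise ValueError("need at least 2 patients for the two orders to differ")
    rng = random.Random(seed)  # seeded so the order can be reproduced
    first = list(range(1, n_patients + 1))
    rng.shuffle(first)
    second = first[:]
    # Reshuffle until the second pass differs from the first.
    while second == first:
        rng.shuffle(second)
    return first + second


tray = slide_order()
print(len(tray))  # 90 slides, each patient appearing exactly twice
```

Keeping the two passes separate makes it straightforward to pair each patient's first and second rating when intrarater reliability is computed later.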

Data Analysis
Reliability of detecting a shift. The raters' judgements were converted to a 5-point scale of confidence that the patient had a
right shift: -2 = certain shifted to the left; -1 = uncertain
shifted to the left; 0 = neutral; +1 = uncertain shifted to the
right; +2 = certain shifted to the right. Intrarater reliability
was determined by comparing the judgments of lateral shifts
of the first presentation of the 45 subjects with the second
presentation. This was performed for all raters.
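
The conversion from the two determinations on the data collection form to the 5-point scale can be sketched as below. This is an illustration under stated assumptions: because the scale expresses confidence in a right shift, left shifts are taken to be negative, and the dictionary and function names are hypothetical rather than taken from the study.

```python
# Direction of the first determination: left is negative, right is positive.
DIRECTION = {"left": -1, "absent": 0, "right": 1}
# Certainty of the second determination scales the magnitude.
CERTAINTY = {"uncertain": 1, "certain": 2}


def encode(direction: str, certainty: str) -> int:
    """Map (direction, certainty) to the 5-point scale from -2 to +2."""
    sign = DIRECTION[direction]
    if sign == 0:
        return 0  # "shift absent" is scored as neutral whatever the certainty
    return sign * CERTAINTY[certainty]


scores = [encode("left", "certain"), encode("absent", "uncertain"),
          encode("right", "uncertain")]
print(scores)
```

A rater's two scores for the same patient (first and second presentation) then form one pair for the intrarater comparison.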
Intrarater and interrater reliability were evaluated by calculating the intraclass correlation coefficient (ICC [2,1]) for
each group, using the SPSS Macro ICCSF2.SPS within
SPSS 10.0 (SPSS, Chicago, Ill). This analysis considers the
data as continuous data, whereas others may consider the
data to represent ordinal data. However, the argument is
unnecessary, as Fleiss and Cohen18 have shown that
weighted κ and the ICC are equivalent. To allow comparison with other studies that have evaluated reliability with κ,
we calculated the multirater κ (an unweighted form of κ)
using the MKAPPASC.SPS macro in SPSS 10.0.
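
For readers without the SPSS macro, ICC (2,1) can be computed directly from the mean squares of a two-way layout. The sketch below uses the usual two-way random-effects, single-rating, absolute-agreement definition; it is not the ICCSF2.SPS macro, and the function name and the row/column layout (one row per slide, one column per rater) are our assumptions.

```python
def icc_2_1(ratings: list[list[float]]) -> float:
    """ICC(2,1) from a slides-by-raters table of scores."""
    n = len(ratings)       # number of rated targets (slides)
    k = len(ratings[0])    # number of raters (or presentations)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for targets, raters, and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical scores for 4 slides rated by 3 raters on the -2..+2 scale.
example = [[-2, -1, -2], [0, 0, 1], [2, 1, 2], [1, 0, 0]]
print(round(icc_2_1(example), 2))
```

For intrarater reliability the same function applies with two columns per rater, one for each presentation of the 45 slides.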
The intrarater reliability (ICC [2,1]) for each rater was
determined, and then a group mean value and 95% CI
were determined for each of the 3
groups. The interrater reliability (as determined by the ICC
and κ) was calculated for each of the 3 groups. Ninety-five
percent CIs for each statistic were calculated.
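
A multirater, unweighted agreement statistic of the kind described above can be sketched with Fleiss' kappa for nominal categories. Whether the MKAPPASC.SPS macro implements exactly this formula is an assumption on our part; the function name and the input layout (one row per slide, one category label per rater) are hypothetical.

```python
from collections import Counter


def fleiss_kappa(ratings: list[list[str]]) -> float:
    """Fleiss' kappa; every slide must be rated by the same number of raters."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(row) for row in ratings]
    categories = set().union(*counts)

    # Observed agreement: proportion of agreeing rater pairs per slide.
    per_subject = [
        (sum(c * c for c in cnt.values()) - n_raters)
        / (n_raters * (n_raters - 1))
        for cnt in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement from the overall share of each category.
    shares = [
        sum(cnt[cat] for cnt in counts) / (n_subjects * n_raters)
        for cat in categories
    ]
    p_expected = sum(s * s for s in shares)

    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical judgments: 3 slides, 3 raters, categories left/none/right.
demo = [["left", "left", "none"], ["right", "right", "right"],
        ["none", "left", "none"]]
print(round(fleiss_kappa(demo), 3))
```

Because this statistic ignores the ordering of categories, a left-versus-right disagreement counts the same as a left-versus-none disagreement, which is one reason it tends to be lower than the ICC on the same data.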

RESULTS
Intrarater reliabilities, as expressed by ICC values with
95% CIs, are shown in Table 2. The ICC values ranged from
0.48 to 0.59, which falls within the range of ICC values
described by Fleiss19 as representing fair to good reliability.
The interrater reliabilities, expressed as ICC values, ranged
from 0.49 to 0.64, again in the range representing fair to
good reliability. For both intrarater and interrater reliability,
inspection of the 95% CIs reveals that the McKenzie group
had statistically significantly greater reliability than the
other groups. The κ values (Table 2) ranged from 0.26 to 0.38,
again suggesting fair reliability.20
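
The verbal labels used here follow two published benchmark schemes, which can be written out explicitly. The sketch below encodes the Fleiss bands for the ICC (below 0.40 poor, 0.40 to 0.75 fair to good, above 0.75 excellent) and the Landis and Koch bands for κ, then applies them to the point estimates in Table 2. The function names are ours, and the treatment of values falling exactly on a boundary is an assumption.

```python
def fleiss_icc_band(icc: float) -> str:
    """Fleiss benchmark labels for an ICC point estimate."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"


def landis_koch_band(kappa: float) -> str:
    """Landis and Koch benchmark labels for a kappa point estimate."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Point estimates from Table 2 (intrarater ICC, interrater ICC, kappa).
iccs = [0.56, 0.48, 0.59, 0.53, 0.49, 0.64]
kappas = [0.36, 0.26, 0.38]
print({fleiss_icc_band(v) for v in iccs}, {landis_koch_band(v) for v in kappas})
```

Every ICC in the table lands in the same band, as does every κ, so the benchmark labels alone do not distinguish the three rater groups.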

DISCUSSION
Despite using a simplified model of clinical practice that
removed any potential for reactivity and biologic variation,
the reliability of shift detection remained unacceptably low.
While the McKenzie trained raters were more reliable in
judging a shift than the other 2 groups of raters, the absolute
difference between groups was small and was revealed as
statistically significant because of the high power of the
study. Our study had unusually high power because we used
a model that allowed for a large rater pool (range 42-60
raters), whereas the typical reliability study has 2 raters. The
difference in ICC values was not large in absolute
terms, and even the highest value did not reach the benchmark for
excellent reliability (0.75) suggested by Fleiss.19 Our result
is consistent with other studies11,12 that have noted major
problems with the detection of lumbar shifts.

Table 2. Intrarater and interrater reliability of shift judgements

Raters                                  Intrarater* ICC     Interrater† ICC     Interrater† Kappa
First-year students                     0.56 (0.53-0.59)    0.53 (0.46-0.61)    0.36 (0.35-0.37)
Graduate physical therapists            0.48 (0.43-0.53)    0.49 (0.42-0.51)    0.26 (0.25-0.27)
McKenzie trained physical therapists    0.59 (0.55-0.63)    0.64 (0.57-0.71)    0.38 (0.37-0.39)

ICC, intraclass correlation coefficient.
*Group mean value and 95% CI of the point estimate ICC for each subject.
†Point estimate and 95% CI for a single ICC or Kappa that compares multiple raters.

Subsequent to completing this study, an additional study
evaluating the reliability of shift detection has been
published.21 Interestingly, its authors reported similar values for
agreement on detecting the presence (κ = 0.2) and direction
(κ = 0.4) of a lateral shift between 2 experienced McKenzie
trained physical therapists.
Associates also alerted us to an earlier study that similarly
used slides of patients as the stimuli to be rated.22 In
contrast to all other reliability studies, the authors reported
perfect agreement in judging lateral shifts. We are unable to
offer an explanation for this result.
Without further investigations to determine which cues
are influencing the decision of the raters, we are unable to
provide an explanation for the difficulty in determining the
presence of a lateral shift. However, with the model that we
utilized in this study, we could rigorously evaluate factors
such as training, anthropometric variables of the stimuli,
and visual accuracy of the raters. We would view this
endeavor as similar to the one we embarked on 7 years ago
when we reported similarly low reliability for physical
therapists' judgements of lumbar posteroanterior spinal
stiffness. A series of studies has led us to a greater understanding of the issue, and we have now developed a protocol
that allows physical therapists to accurately judge stiffness.23

CONCLUSION
Despite the task of judging the presence or absence of a
lateral shift being simplified by the removal of biologic
variation and test reactivity, the reliability of the raters in
this study was unacceptable. We recommend that this model
utilizing photographs of LBP patients be used to further
study the features of the lateral shift that influence the raters'
decision as to its presence and direction. Once these have been
established, a protocol could be developed to improve
the reliability of detection of the lateral shift.

ACKNOWLEDGMENTS
This study was approved by the Human Research Ethics
Committee of the University of Sydney.

REFERENCES
1. Battie MC, Cherkin DC, Dunn R, Ciol MA, Wheeler K.
Managing low-back pain: attitudes and treatment preferences
of physical therapists. Phys Ther 1994;74:219-26.
2. Foster NE, Thompson KA, Baxter GD, Allen JM. Management of nonspecific low-back pain by physiotherapists in
Britain and Ireland. Spine 1999;24:1332-42.
3. Hurly DA, Dusior TE, McDonough SM, Moore AP, Linton
SJ, Baxter GD. Biopsychosocial screening questionnaire for
patients with low-back pain: preliminary report of utility in
physiotherapy practice in Northern Ireland. Clin J Pain 2000;
16:214-28.
4. Maher C, Latimer J, Refshauge K. Prescription of activity for
low-back pain: what works? Aust J Physiother 1999;45:121-32.
5. Danish Institute for Health and Technology Assessment. Low-back pain. Frequency, management and prevention from an
HITA perspective. Danish Health Technol Assess 1999;1:1-106.
6. Stankovic R, Johnell O. Conservative treatment of acute low-back pain. A 5-year follow-up study of two methods of treatment. Spine 1995;20:469-72.
7. Nwuga G, Nwuga V. Relative therapeutic efficacy of the
Williams and McKenzie protocols in back pain management.
Physiother Pract 1985;1:99-105.
8. Cherkin D, Deyo R, Battie M, Street J, Barlow W. A comparison of physical therapy, chiropractic manipulation, and
provision of an educational booklet for the treatment of patients with low-back pain. N Engl J Med 1998;339:1021-29.
9. McKenzie RA. The lumbar spine: mechanical diagnosis and
therapy. Waikanae, New Zealand: Spinal Publication Limited;
1981. p. 24-6.
10. Porter RW, Miller CG. Back pain and trunk list. Spine 1986;
11:596-600.
11. Riddle DL, Rothstein JM. Intertester reliability of McKenzie's
classifications of the syndrome types present in patients with
low-back pain. Spine 1993;18:1333-44.
12. Donahue MS, Riddle DL, Sulivan MS. Intertester reliability of
a modified version of McKenzie's lateral shift assessments
obtained on patients with low-back pain. Phys Ther 1996;76:
706-16.


13. Kilby J, Stigant M, Robert A. The reliability of back pain
assessment by physiotherapists, using a McKenzie algorithm. Physiotherapy 1990;76:579-83.
14. Nelson MA, Allen P, Clamp S, DeDombal F. Reliability and
reproducibility of clinical findings in low-back pain. Spine
1979;4:97-101.
15. McLean IP, Gillan MG, Ross JC, Aspden RM, Porter RW. A
comparison of methods for measuring trunk list. Spine 1996;
21:1667-70.
16. Razmjou H, Kramer JF, Yamada R. Intertester reliability of
the McKenzie evaluation in assessing patients with mechanical low-back pain. J Orthop Sports Phys Ther 2000;30:368-89.
17. Donelson R. Letter to editor. Spine 1994;19:1414.
18. Fleiss JL, Cohen J. The equivalence of weighted kappa and the
intraclass correlation coefficient as measures of reliability.
Educ Psychol Meas 1973;71:505-13.


19. Fleiss J. The design and analysis of clinical experiments. New
York: John Wiley; 1986. p. 1-32.
20. Landis J, Koch G. The measurement of observer agreement
for categorical data. Biometrics 1977;33:159-74.
21. Kilpikoski S, Airaksinen MD, Kankaanpaa M, Leminen P,
Videman T, Alen M. Interexaminer reliability of low-back
pain assessment using the McKenzie method. Spine 2002;27:
E207-E214.
22. Tenhula J, Rose S, Delitto A. Association between direction of
lateral lumbar shift, movement tests, and side of symptoms in
patients with low-back pain syndrome. Phys Ther 1990;70:
480-85.
23. Chiradejnant A, Maher C, Latimer J. Objective manual assessment of lumbar PA stiffness is now possible. J Manipulative Physiol Ther 2003;26:34-9.
