DOI 10.1007/s00134-006-0516-8
Christian P. Subbe
Haiyan Gao
David A. Harrison
C. P. Subbe
Wrexham Maelor Hospital,
Department of Medicine,
Wrexham LL13 4TX, UK
H. Gao · D. A. Harrison (✉)
Tavistock House, Intensive Care National
Audit and Research Centre,
Tavistock Square, London WC1H 9HR, UK
e-mail: david.harrison@icnarc.org
Tel.: +44-20-73882856
Fax: +44-20-73883759
ORIGINAL
Reproducibility of physiological
track-and-trigger warning systems
for identifying at-risk patients on the ward
Introduction
Physiological track-and-trigger warning systems are used
to identify, as early as possible, patients on acute wards who are at risk of deterioration. There are three main types in
use [1]:
1. Single- and multiple-parameter systems identify patients by comparing bedside observations with a simple
Methods
Design and data collection
A prospective observational study was conducted at
Wrexham Maelor Hospital, a district general hospital
in North Wales. The study was approved by the local
research ethics committee. Participants were adult patients
from general medical and surgical wards. A number of
wards were selected to satisfy the sample size calculation
(below) and all patients on these wards able to give
informed consent were invited to participate. Patients were
informed about the purpose of the study and received an
information leaflet. Verbal consent was obtained.
Based on assumptions for inter-rater reliability
(kappa = 0.8, proportion of positive results = 0.07) with
four raters, a sample of 93 patients was required to
estimate kappa with a standard error of 0.1. For the intra-rater reliability, with an assumed value of kappa = 0.9,
the required sample size was 44 patients. Sample size
calculations were performed using a custom-designed
module [8].
Data were collected by four members of hospital staff
on 3 days. All four raters were familiar with the scoring
methods in their clinical practice and received an induction
prior to the study. Two investigators prepared the consent
and patient identification data prior to the study.
For inter-rater reliability, data were collected on two
acute medical and two acute surgical wards. A senior doctor (Certificate of Completion of Specialist Training equivalent in Intensive Care Medicine), junior doctor (Senior
House Officer level), registered nurse (E-grade; 5 years of
experience) and student nurse, who had previously worked
as a health care assistant (nursing auxiliary), collected the
data. The order of the raters taking the measurements was
randomized for each ward from a set of possible permutations. Raters were blinded to the results of their colleagues.
For the intra-rater study, the same raters examined a separate set of patients from one medical and one surgical ward, each rater examining the same patients four times at 15-min intervals,
blinded to their previous scores. There were no interventions between the four sets of measurements.
Age and normal blood pressure, derived from an
average of the previous 48 h, were collected first. Raters
then measured the remaining parameters: systolic blood
pressure; temperature; respiratory rate; pulse rate; and
level of consciousness. Blood pressure was measured
electronically (Dinamap, Critikon, Tampa, Fla.) and
checked manually where appropriate. Blood pressure was
measured by all four raters on the first 18 patients, but the
repeated measurement was found to be unacceptable to
the patients. For subsequent patients, blood pressure was
measured only once, noted on the patient's bedside sheet,
and copied by subsequent raters. Temperature was taken
orally (Temp-PlusII, IVAC Corp., San Diego, Calif.),
measured only once, noted and copied by subsequent
raters. All other parameters were measured by each rater
in turn. Pulse rate was counted over 15 s in regular heart
rhythm and 1 min in irregular heart rhythm; respiratory
rate was counted over 30 s. Raters calculated urine output
per kilogram and hour from the output over the last 4 h.
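The count-to-rate conversions described above are simple scalings; a minimal sketch (the function names are ours, not from the study):

```python
def pulse_rate(beats_counted: int, seconds: int) -> float:
    """Scale a pulse count to beats/min (counted over 15 s for regular,
    60 s for irregular heart rhythm)."""
    return beats_counted * 60.0 / seconds

def respiratory_rate(breaths_counted: int, seconds: int = 30) -> float:
    """Scale a breath count (taken over 30 s) to breaths/min."""
    return breaths_counted * 60.0 / seconds

def urine_output_ml_kg_h(output_ml: float, weight_kg: float, hours: float = 4.0) -> float:
    """Urine output over the last 4 h, normalised per kg body weight per hour."""
    return output_ml / (weight_kg * hours)
```

For example, 18 beats counted over 15 s corresponds to 72 beats/min, and 200 ml of urine over 4 h in a 70-kg patient to roughly 0.71 ml/kg per hour.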
Raters scored the observations according to the three
systems. The MET criteria were scored as one if any
criterion was fulfilled and otherwise as zero. The MEWS
and ASSIST were scored according to scoring charts.
Blood pressure in MEWS was scored differently from
the published scoring method, by deviation from the
patient's norm (C. Stenhouse, pers. commun.). Details of
the scoring systems, including the modification to MEWS,
are contained in the Electronic Supplementary Material.
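Chart-based systems such as MEWS and ASSIST assign points per parameter band and sum them, whereas MET is scored one if any single criterion is met. A minimal sketch of this logic, with illustrative integer thresholds that are not the published charts (those are in the Electronic Supplementary Material):

```python
# Illustrative sketch of chart-based aggregate scoring; the bands below are
# invented for demonstration and are NOT the published MEWS/ASSIST charts.

def band_score(value, bands):
    """Return the points of the first (low, high, points) band containing value.
    Bands here assume integer observations, so adjacent bands abut at integers."""
    for low, high, points in bands:
        if low <= value <= high:
            return points
    raise ValueError(f"value {value} outside all bands")

# Hypothetical respiratory-rate bands: (low, high, points)
RESP_BANDS = [(0, 8, 2), (9, 14, 0), (15, 20, 1), (21, 29, 2), (30, 200, 3)]

def aggregate_score(observations, charts):
    """Sum the per-parameter points, as MEWS- and ASSIST-style systems do."""
    return sum(band_score(observations[p], bands) for p, bands in charts.items())

def met_trigger(criteria_met):
    """MET is scored one if any single criterion is fulfilled, otherwise zero."""
    return 1 if any(criteria_met) else 0
```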
Data were entered into a spreadsheet by a data-entry
clerk not involved in data collection. Logic, range and consistency checks were applied to all variables. Outliers and
missing data were checked against original data collection
sheets.
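The range checks might look like the following sketch; the plausibility limits shown are illustrative assumptions, not those used in the study:

```python
# Sketch of range checks applied at data entry; limits are illustrative only.
RANGES = {
    "systolic_bp": (40, 300),     # mmHg
    "pulse": (20, 250),           # beats/min
    "resp_rate": (4, 60),         # breaths/min
    "temperature": (30.0, 43.0),  # degrees Celsius
}

def check_record(record):
    """Return the fields whose values are missing or outside plausible ranges,
    flagging them for comparison against the original collection sheets."""
    problems = []
    for field, (lo, hi) in RANGES.items():
        value = record.get(field)
        if value is None or not (lo <= value <= hi):
            problems.append(field)
    return problems
```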
Statistical analysis
Statistical analysis was performed using intra-class correlation coefficients for continuous variables (systolic
blood pressure, heart rate, respiratory rate, temperature
and aggregate scores), and kappa statistics for categorical
variables (conscious level, trigger events and aggregate
scores). Two-way and one-way analysis of variance was
used in calculating the intra-class correlation coefficient
for inter-rater and intra-rater studies, respectively [9].
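For the one-way model used in the intra-rater analysis, the intra-class correlation coefficient follows from the between- and within-subject mean squares; a simplified sketch, together with a percentile bootstrap over subjects (the study itself used bias-corrected intervals and a two-way model for the inter-rater analysis):

```python
# Minimal sketch: one-way ANOVA intra-class correlation, ICC(1), plus a
# percentile bootstrap over subjects. Illustrative only.
import random

def icc_oneway(data):
    """ICC(1) from n subjects, each with a list of k repeated measurements."""
    n, k = len(data), len(data[0])
    grand = sum(x for row in data for x in row) / (n * k)
    means = [sum(row) / k for row in data]
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)      # between-subject mean square
    msw = sum((x - m) ** 2 for row, m in zip(data, means)
              for x in row) / (n * (k - 1))                       # within-subject mean square
    return (msb - msw) / (msb + (k - 1) * msw)

def bootstrap_ci(data, stat, reps=2000, alpha=0.05, seed=1):
    """Percentile bootstrap interval, resampling subjects with replacement."""
    rng = random.Random(seed)
    stats = sorted(stat([rng.choice(data) for _ in data]) for _ in range(reps))
    return stats[int(reps * alpha / 2)], stats[int(reps * (1 - alpha / 2))]
```

Perfectly reproduced measurements give a coefficient of 1; for k repetitions the lower bound is −1/(k − 1), so with two measurements per subject the coefficient ranges from −1 to 1.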
Bootstrap methods were used to provide bias-corrected
confidence intervals. For the inter-rater study, we also
calculated kappa and phi statistics [10] for each of the
six possible pairings among the raters.
Results
Inter-rater reliability
In the inter-rater study, 114 patients were examined. The
four raters were not able to perform four sets of measurements on all 114 patients, as some patients were called for
clinical investigations or were otherwise unavailable. In total, 433 sets of measurements were obtained.
Nine sets of observations from three patients were excluded as their normal blood pressures were missing, leaving 424 sets of observations included in the study. One
hundred nine, 102, 107 and 106 patients were examined, respectively, by the senior doctor, junior doctor, registered nurse and student nurse (Table 1).
Table 1 Number of observations and correctly calculated scores by each rater in the inter-rater study

                 Student     Registered  Junior      Senior
                 nurse       nurse       doctor      doctor      Total       p-value
Observations, n  106         107         102         109         424
MET, n (%)       98 (92.5)   106 (99.1)  101 (99.0)  109 (100)   414 (97.6)  0.001
MEWS, n (%)      81 (76.4)   81 (75.7)   92 (90.2)   94 (86.2)   348 (82.1)  0.01
ASSIST, n (%)    78 (73.6)   90 (84.1)   86 (84.3)   90 (82.6)   344 (81.1)  0.15
The p-value indicates statistical significance of difference in correctly calculated scores among raters
Table 2 Level of agreement of aggregate scores and triggers among the four raters for the inter-rater study

                         Triggered, n (%)/score,                All agreed,   Three agreed,   Intra-class correlation coefficient
                         median (interquartile range) [range]   n (%)         n (%)           (95% confidence interval)
Calculated by raters
  MET trigger            11 (2.6)                               86 (77.5)     106 (95.5)      −0.03 (−0.05, 0.00)
  MEWS score             1 (1, 2) [0, 8]                        17 (15.3)     53 (47.8)       0.20 (0.13, 0.27)
  MEWS trigger           60 (14.2)                              62 (55.9)     94 (84.7)       0.18 (0.09, 0.27)
  ASSIST score           1 (0, 1) [0, 8]                        41 (36.9)     80 (72.1)       0.46 (0.38, 0.55)
  ASSIST trigger         19 (4.5)                               84 (75.7)     104 (93.7)      0.20 (0.04, 0.38)
Corrected calculations
  MET trigger            7 (1.7)                                90 (81.1)     106 (95.5)      −0.02 (−0.04, 0.05)
  MEWS score             1 (1, 2) [0, 8]                        18 (16.2)     55 (49.6)       0.22 (0.15, 0.30)
  MEWS trigger           69 (16.3)                              64 (57.7)     101 (91.0)      0.37 (0.25, 0.51)
  ASSIST score           1 (0, 2) [0, 8]                        43 (38.7)     83 (74.8)       0.50 (0.42, 0.58)
Intra-rater reliability
There were 180 sets of observations from 45 patients in the intra-rater study. All observations were used in the analyses. Urine output was missing in 170 (94.4%) of the observation sets and was therefore excluded. All other parameters were 100% complete. There were copying errors for temperature in 0.6% of observations and for blood pressure in 1.1%.
There was 100% agreement on conscious level, with
all patients scored as Alert. Intra-rater agreements on
respiratory rate, heart rate and systolic blood pressure
were similar to those of the inter-rater study. Agreement
on temperature (intra-class correlation coefficient 0.98, 95% confidence interval 0.94–1.00) was better in the intra-rater study than in the inter-rater study.
The proportions of scores calculated correctly were
similar to those from the inter-rater study (Table 3). In
MET, patients were 100% correctly scored by all raters.
In MEWS, 17 (9.4%) patients were scored higher and
14 (7.8%) lower than correct scores, and in ASSIST 11
(6.1%) patients were scored higher and 22 (12.2%) lower
than correct scores.
The agreement indices (Table 4) suggest intra-rater
agreement on score was similar for MEWS and ASSIST.
There was good agreement on triggers for MEWS and
ASSIST, although the confidence intervals for ASSIST
were very wide due to the low number of events. Only
1 patient triggered the MET calling criteria on a single
observation.
Table 3 Number of observations and correctly calculated scores by each rater in the intra-rater study

                 Student     Registered  Junior      Senior
                 nurse       nurse       doctor      doctor      Total       p-value
Observations, n  48          24          84          24          180
MET, n (%)       48 (100)    24 (100)    84 (100)    24 (100)    180 (100)   1
MEWS, n (%)      40 (83.3)   24 (100)    66 (78.6)   19 (79.2)   149 (82.8)  0.05
ASSIST, n (%)    33 (68.8)   24 (100)    72 (85.7)   18 (75.0)   147 (81.7)  0.003
The p-value indicates statistical significance of difference in correctly calculated scores among raters
Table 4 Level of agreement of total scores and triggers among the four raters for the intra-rater study

                         Triggered, n (%)/score,                All agreed,   Three agreed,   Intra-class correlation coefficient
                         median (interquartile range) [range]   n (%)         n (%)           (95% confidence interval)
Calculated by raters
  MET trigger            1 (0.6)                                44 (97.8)     45 (100)        −0.01 (−0.02, 0.01)
  MEWS score             1 (1, 2) [0, 6]                        24 (53.3)     37 (82.2)       0.53 (0.39, 0.68)
  MEWS trigger           26 (14.4)                              37 (82.2)     45 (100)        0.64 (0.46, 0.84)
  ASSIST score           1 (1, 1) [0, 5]                        27 (60.0)     40 (88.9)       0.59 (0.46, 0.74)
  ASSIST trigger         6 (3.3)                                43 (95.6)     45 (100)        0.66 (0.02, 1.00)
Corrected calculations
  MET trigger            1 (0.6)                                44 (97.8)     45 (100)        −0.01 (−0.02, 0.01)
  MEWS score             1 (1, 2) [0, 5]                        23 (51.1)     37 (82.2)       0.56 (0.42, 0.68)
  MEWS trigger           23 (12.8)                              37 (82.2)     44 (97.8)       0.58 (0.31, 0.81)
  ASSIST score           1 (1, 1) [0, 5]                        25 (55.6)     35 (77.8)       0.54 (0.42, 0.68)
  ASSIST trigger         8 (4.4)                                41 (91.1)     45 (100)        0.48 (0.03, 1.00)
Discussion
Scoring systems such as the ones used in this study have
become an important tool of clinical risk management for
critically ill patients on general wards. Thus far, it is not
known whether these assessments are reproducible and
how large the likely errors are if different members of staff
perform what is meant to be an identical assessment. In the
present study we have provided some data on how three
systems used in the U.K. perform. There was only fair to
moderate agreement on measurements of the parameters
used to generate the scores, and only fair agreement on
the scores. Reassuringly, there was better percentage
agreement on the decision whether a patient had triggered
or not.
As one would expect, reproducibility was partially
a function of simplicity: MET achieved higher percentage
agreement than ASSIST, and ASSIST higher than MEWS.
Intra-rater reliability was better than inter-rater reliability.
Using corrected calculations improved the level of inter-rater agreement but not intra-rater agreement, suggesting
that if scoring systems were misapplied, each rater was
doing so in a consistent manner.
The systems were selected because they represent three
levels of complexity. MET is very simple but does not
allow a patient's progress to be tracked. MEWS is a complete assessment that takes into account urine output and
relative changes in blood pressure as compared with previous measurements. ASSIST is a simplified version with
only four parameters and an age constant. Both ASSIST
and MEWS allow monitoring of clinical progress. The
chosen systems are representative of the wide range of
scoring systems currently in use, but any system should be
assessed in the setting where it is used.
There were a number of potential weaknesses in this
study. Firstly, repeated measurements were taken within
an hour, but it is possible that patients could have deteriorated or improved during this time. We did not assess
whether there was systematic drift in the measured values between repetitions.
A small number of patients were not able or willing to
give consent. In particular, patients with reduced neurological function (approximately 5% of all patients) could not
be included, and were likely to be generally sicker patients.
Inclusion might have led to different results with regard
to reliability of the trigger mechanism; however, abnormal
neurological scores have been found to be rare in previous
studies [3, 12].
It was our aim to assess the reliability of the scoring process in clinical practice. The reliability depends
partially on the reliability of the electronic measurement devices used for blood pressure and temperature.
This could not be assessed directly as repeated measurement was unacceptable to the patients. Our results
therefore represent the human element of reliability
only.
Conclusion
There was significant variation in the reproducibility of
physiological track-and-trigger warning systems used by
different health care professionals. All three systems examined showed better agreement on triggers than aggregate scores. Simpler systems had better reliability. Further
research should examine how reliability can be improved.
Acknowledgements. This study was funded by the UK National
Health Service Research and Development Service Delivery and
Organisation Programme (SDO/74/2004). The authors thank
S. Ameeth, S. Collins, K. Ghosh, C. Rincon and J. Tobler for
their help in preparing the study, obtaining consent from patients
and collecting the data. We thank A. Pawley for entering data into
electronic format and L. Gemmell for advising on the format and
facilitating the setup of the study.
References
1. Department of Health and NHS Modernisation Agency (2003) The National Outreach Report. Department of Health, London
2. Lee A, Bishop G, Hillman K, Daffurn K (1995) The medical emergency team. Anaesth Intensive Care 23:183–186
3. Subbe CP, Kruger M, Rutherford P, Gemmell L (2001) Patients at risk: validation of a modified Early Warning Score in medical admissions. Q J Med 94:521–526
4. Buist MD, Moore GE, Bernard SA, Waxman BP, Anderson JN, Nguyen TV (2002) Effects of a medical emergency team on reduction of incidence of and mortality from unexpected cardiac arrests in hospital: preliminary study. Br Med J 324:387–390
5. Pittard AJ (2003) Out of our reach? Assessing the impact of introducing a critical care outreach service. Anaesthesia 58:882–885
6. Stenhouse C, Coates S, Tivey M, Allsop P, Parker T (2000) Prospective evaluation of a Modified Early Warning Score to aid earlier detection of patients developing critical illness on a general surgical ward. Br J Anaesth 84:663P
7. Subbe CP, Hibbs R, Williams E, Rutherford P, Gemmel L (2002) ASSIST: a screening tool for critically ill patients on general medical wards. Intensive Care Med 28:S21