
Dissections DIAGNOSIS

23 May 2009
Evidence-based Medicine for Surgeons

Diagnosing ruptured appendicitis preoperatively in pediatric patients


Authors: Williams RF, Blakely ML, Fischer PE, et al
Journal: J American College of Surgeons 2009; 208:819–828
Centre: Division of Pediatric Surgery, University of Tennessee Health Science Center, Memphis, TN, USA
BACKGROUND

The rate of ruptured appendicitis (RA) is higher in children than in adults (30% to 74%). While treatment for acute appendicitis (AA) consists predominantly of urgent appendectomy, treatment for RA has much greater variation. Whether urgent appendectomy or initial antibiotics with interval appendectomy should be the preferred treatment for children with RA remains controversial. Distinguishing ruptured from acute appendicitis is very important if the treatment differs for the two conditions. The pediatric surgeon’s ability to distinguish these two conditions preoperatively has not been prospectively studied.
RESEARCH QUESTION

Population: All patients younger than 18 years referred for abdominal pain at a regional children’s hospital.

Indicator variable: Patients diagnosed as having acute appendicitis (AA), ruptured appendicitis (RA) or no appendicitis (NA).

Outcome variable: Sensitivity, specificity, positive and negative likelihood ratios for the diagnosis of ruptured appendicitis.

Comparison: Diagnostic accuracy of a derived scoring system applied to the data from the same cohort.

Authors' claim(s): “...Pediatric surgeons differentiate AA from RA and not appendicitis preoperatively with high accuracy and sensitivity [ability to rule in], but the specificity [ability to rule out] for diagnosing ruptured appendicitis is lower. The scoring system improved the specificity of the preoperative diagnosis.”

IN SUMMARY

Diagnostic performance of pediatric surgeons

                               Acute               Ruptured            Not
                               appendicitis (AA)   appendicitis (RA)   appendicitis (NA)
Number                         98                  53                  96

As seen in the study
  Sensitivity                  92.60%              96.40%              98.70%
  Specificity                  94.90%              83.00%              93.80%
  Negative likelihood ratio *  0.05                0.21                0.06

Applying scoring system (details on following page)
  Sensitivity                  47.00%
  Specificity                  98.00%

* = Measure of true negatives (ruling out)
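The likelihood ratios named as outcome measures follow from sensitivity and specificity by the standard textbook definitions. The sketch below is illustrative only: the function names are assumptions, and the inputs are the 47% sensitivity and 98% specificity reported for the scoring system. It does not attempt to reproduce the likelihood ratios tabulated above, whose derivation the source does not show.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity). Undefined when specificity is 1."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity. Undefined when specificity is 0."""
    return (1.0 - sensitivity) / specificity


# Scoring-system figures as reported in the summary (expressed as proportions).
score_sensitivity = 0.47
score_specificity = 0.98

lr_pos = positive_likelihood_ratio(score_sensitivity, score_specificity)
lr_neg = negative_likelihood_ratio(score_sensitivity, score_specificity)

print(f"LR+ = {lr_pos:.2f}")  # approximately 23.5
print(f"LR- = {lr_neg:.2f}")  # approximately 0.54
```

Read this way, a positive score would shift the odds of RA considerably, while a negative score would do little to exclude it.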

THE TISSUE REPORT


The literature is rife with reports of scoring systems that are extracted from existing data, put through the magic black box of
a linear regression analysis, and retrospectively revalidated on the cohort from which they were extracted. Bad science. In
addition, the authors of this paper do a lot of mumbling and beating around the bush when discussing the results of
application of the scoring system. From the data published, contradictory ideas emerge regarding the overall accuracy (see
the table above). A lot of fuzzy pie-splitting is indulged in. The only way to demonstrate the validity of this scoring system
would have been to use two groups of identically competent pediatric surgeons and in a randomized fashion, allocate
patients to one group that relied only on their conventional diagnostic skills and the other that used only the score and show
that the latter were better. Moreover, why do we need an elaborate study to be told that generalized abdominal tenderness
and an abscess seen on CT scan are the strongest predictors of a ruptured appendix?!

EBM-O-METER

Evidence level: Double blind RCT | Randomized controlled trial (RCT) | Prospective cohort study - not randomized | Case controlled study | Case series - retrospective
Overall rating: Trash (Life's too short for this) | Swiss cheese (Full of holes) | Safe (Holds water) | Newsworthy (“Just do it”)
Bias levels: Sampling | Comparison | Measurement
Interesting | Novel | Feasible | Ethical | Resource saving

The devil is in the details (more on the paper) ... 

© Dr Arjun Rajagopalan
SAMPLING

Sample type: Simple random | Stratified random | Cluster | Consecutive | Convenience | Judgmental
Inclusion criteria: All patients < 18 yrs of age referred for evaluation of abdominal pain
Exclusion criteria: Not stated

Final score card
              AA    RA    NA
Target        ?     ?     ?
Accessible    ?     ?     ?
Intended      ?     ?     ?
Drop outs     ?     ?     ?
Study         98    53    96

Key: Reasonable | ? = Arguable | Questionable


Duration of the study: February 2007 to October 2007

Sampling bias: The authors have not provided any details of the sampling process. The study was done in a
referral pediatric hospital.

COMPARISON

Comparison type: Randomized | Case-control | Non-random | Historical | None

Controls - details: -
Allocation details: The pediatric surgical team recorded an agreed initial (preoperative) diagnosis using all data available. The use of advanced imaging (CT or ultrasonography) was decided by emergency department physicians, referral physicians, or pediatric surgeons. Using the predictors identified with multivariable analysis, a scoring system was constructed to evaluate whether an objective score based on available data might improve the ability to accurately diagnose RA.
Comparability: -
Disparity: -

Comparison bias: The scoring system was derived from existing data, put through a linear regression analysis and retrospectively revalidated on the cohort from which it was extracted. This is not a valid comparison.

MEASUREMENT

Measurement error (device error and observer error). Ratings: Y, ?, N

Device used                                  Gold std.  Device suited to task  Training  Scoring  Blinding  Repetition  Protocols
1. Final diagnosis - clinical team           ?          N                      ?         N        N         N           N
2. Scoring system for RA (see table below)   ?          N                      N         N        Y         Y           N

Final diagnosis was determined using operative findings, pathology reports, or discharge diagnosis in those not undergoing operation. In patients who did not undergo an operation, the final diagnosis was confirmed with follow-up telephone contact and a follow-up review of the electronic medical record aimed at identifying care received after the initial discharge.

Univariable analysis was performed on all preoperative variables comparing patients with a discharge diagnosis of RA to those with AA. Using the predictors identified with multivariable analysis, a scoring system was constructed as shown below. The patient’s score was calculated by adding the appropriate points based on the number of significant preoperative variables present. The results of this scoring system, as reported in the study, are difficult to interpret. No detailed explanation is given for why the authors believe that specificity is improved after applying the score.

Variable                   Points
Generalized tenderness     4
Abscess on CT              3
Duration > 48 hrs          3
WBC count > 19,400/µL      2
Fecalith on CT             1
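The additive scoring described here can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the dictionary keys and function name are hypothetical, the point values are those listed in the table, and no diagnostic cutoff is applied because this summary does not state one.

```python
# Point values as listed in the scoring table; the keys are illustrative
# names chosen for this sketch, not identifiers from the paper.
RA_SCORE_POINTS = {
    "generalized_tenderness": 4,
    "abscess_on_ct": 3,
    "duration_over_48_hrs": 3,
    "wbc_over_19400": 2,  # WBC count above the 19,400 cutoff in the table
    "fecalith_on_ct": 1,
}


def ra_score(findings):
    """Add the points for every listed variable that is present (truthy)."""
    unknown = set(findings) - set(RA_SCORE_POINTS)
    if unknown:
        raise ValueError(f"Unrecognized variables: {sorted(unknown)}")
    return sum(
        points
        for name, points in RA_SCORE_POINTS.items()
        if findings.get(name)
    )


# Hypothetical patient: generalized tenderness and a fecalith on CT.
example = {"generalized_tenderness": True, "fecalith_on_ct": True}
print(ra_score(example))  # 4 + 1 = 5
```

With all five variables present, the total is 13, the maximum the table allows.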

Measurement bias: Only 79% of patients in the study had a CT scan. No attempt was made to measure observer
variability.
