Critical Appraisal Process: Step-by-Step

Article in Southern medical journal · March 2012

DOI: 10.1097/SMJ.0b013e31824a711f


Review Article

Donna F. Timm, MLS, Daniel E. Banks, MD, MS, and Jerry McLarty, PhD

Abstract: We present information describing how to search to identify those reports that provide insight into the answer to the query. We have presented a reasonable approach to searching, with our endpoint being the identification of published articles which appear to answer our queries. The decision as to whether these articles are applicable to the patient under discussion is determined by our clinical knowledge and the specifics of the patient’s medical concerns. This process is recognized as critical analysis. Our structure for optimal searching includes use of the PICO model, formulating a focused clinical question, and defining key search terms. Using these principles, we have addressed an example important controversy in the practice of clinical medicine; in other words, the effectiveness of screening for prostate cancer and whether it alters the natural history of this illness.

Key Words: computerized searching, evidence-based medicine

One of the six competencies that has been accepted as a part of postgraduate medical education is practice-based learning and improvement. Practically speaking, our students are expected to learn the skills necessary to allow them to keep up with the changes in medicine and to practice state-of-the-art medicine as their careers progress. In this article, a step-by-step overview of the critical appraisal process is examined, including an explanation of a multidisciplinary approach and demonstration of practical steps in obtaining and appraising clinical studies. We then review an example of a controversial clinical question investigated using this approach. The factors to be considered in critical appraisal are as follows:

1. Study design:
   What is the research question?
   How were patients identified and recruited?
   Who was included? Who was excluded?
   Randomization methodology: How was equality of groups ensured?
   Was the sample size adequate to answer the research question?
2. Analysis methodology:
   How was the analysis done? Was it appropriate?
   Was subgroup analysis done?
   Was secondary analysis done?
   Did the analysis adjust for multiple testing?
3. Interpretation of results:
   Did the study answer the research question?
   Are the conclusions supported by the data? Is the argument
   What questions did the study not address?
   What new questions does the study raise?
   Is there possible investigator/author bias?
   Who financed the study, are there potential conflicts of interest?
   How did the authors interact with the study financiers?

The same principles apply whether a study under review is a small clinical trial, a meta-analysis of a series of studies attempting to answer a singular question, or a large-scale population-based study.

We recognize that the optimal education of our students in evidence-based medicine (EBM) ideally requires a team. It involves the physician who is competent at critical appraisal and can focus on developing the correct question, and who possesses the clinical judgment to determine whether the conclusion reached can be reasonably applied to the patient; a librarian who can aid the physician in searching for references that answer the query; and a statistician who has the skills to explain the specifics of the tests.

Key Points
• Although the clinician may be competent and able to search individually, the librarian with skills in structuring the search and the statistician with a fuller understanding of the tests used to attain the results can add immeasurably to the process.
• The critical appraisal process is complex. The cornerstones of evaluating the content of an article begin with questions
Roles of the Physician, Librarian, and Statistician

Physicians have long been aware of the need for developing skills in searching the literature. Older doctors recall the monthly issues of the Index Medicus supplements that updated the list of recent medical publications and the bulky bound volumes of the supplements that were assembled at the end of each year. Searching for publications addressing a specific clinical question was often laborious, requiring a review of a series of monthly and yearly volumes of the Index Medicus. Computerization of this process by the National Library of Medicine brought new challenges. Although many physicians knew the question they wanted answered, they were now required to become computer literate and to learn the techniques of computer searching.

Enter the medical librarian: initially a support for physicians, but increasingly a partner in the EBM process.1 The role of the medical librarian has changed dramatically during the last decade. Once, they managed the written collection, but now they are required to demonstrate competence at computerized searching of the literature and in teaching care providers this evidence-based approach to searching and critical appraisal. Clinicians have become competent in the clinical process, and their goal is to make a difference in patient care.2 Librarians work with physicians to help formulate a clearly structured clinical question. The more precise the question, the easier it is to identify key terms to be included in a search strategy to retrieve relevant results. Questions are developed based on the PICO model: patient, intervention, comparison(s), and outcome(s).3 Once the clinical question is formulated, key search terms are identified and a search strategy developed. The clinician can select from a broad number of databases or resources to search, depending upon the type of information needed. For example, for more general background questions, students may search for information in online textbooks or in databases such as UpToDate or DynaMed. For more specific, or foreground, questions, searching MEDLINE or using the PubMed or Ovid interfaces may be the best choice. The “best” available evidence may be in the form of a meta-analysis, systematic review, practice guideline, randomized controlled trial, or observational research, such as cohort studies, case-control studies, and case reports.4

The next step is to evaluate each study for validity and applicability to patient care. The librarian’s role in the critical appraisal process is to point students toward tools that help them evaluate each study type. When the critical appraisal process is complete, the clinician must decide whether there is sufficient convincing evidence to support decisions that change patient care. Statisticians can aid the physician in understanding the meaning of statistical tests, often providing insight into the reasonableness of the information as it may apply to the specific clinical problem and even to his or her patient and his or her set of unique comorbidities.

The goal, of course, is to make EBM “doable in everyday practice.”5 The very definition of EBM includes clinical knowledge. Physicians base their decisions not only on expert opinion but also on the evidence that matters most to patients: that which has the possibility to offer additional treatment options, the possibility of a better outcome, and a chance to improve their quality of life. In the EBM process, the medical literature is no substitute for clinical judgment; it is supplemental to or supportive of the physician’s clinical judgment.

Evaluating the Quality of Studies

There are numerous critical appraisal worksheets that can be used to evaluate the quality of various study types. The worksheets are freely available from various sources. Worksheets that we have found particularly helpful originated from the Duke University Medical Center Library.6 The process can become somewhat complex as a series of worksheets are available to assess critical appraisal for therapy, diagnosis, prognosis, harm, systematic reviews, practice guidelines, differential diagnoses, economic analysis, and qualitative study; however, each worksheet can assist the user in evaluating the validity of a study based upon criteria unique to each study type.

In addition to the critical appraisal worksheets, there are scales for rating the quality of primary studies before conducting a meta-analysis. The Jadad scale7,8 assesses the methods used for randomization, double blinding, and providing follow-up of dropouts and those who withdraw from the intervention group. Each study is scored on the basis of a point system from 1 to 5, with a score of ≤2 indicating a study of low quality and ≥3 indicating a study of high quality. The Newcastle-Ottawa scale9 rates various criteria for observational studies. This scale uses a “star” system, in which each study is evaluated on the criteria of study group selection, group comparability, and ascertainment of the exposure or outcome of case-control or cohort studies. Also taken into consideration are the overall study design, the content, and how easily the study can be incorporated into a meta-analysis. Stars are assigned to rate each criterion within each category. The drawback of this scale is that a threshold score has not been defined to distinguish between higher- and lower-quality studies. In most instances, we use the critical appraisal worksheets when we are interested in “formally” evaluating the studies. The Jadad and Newcastle-Ottawa scales typically are more useful to researchers who are writing meta-analyses or undertaking systematic reviews.

Example: The Role of Screening in the Detection of Prostatic Cancer

Screening for early detection of prostate cancer may be one of the most contentious topics in primary care medicine. The crux of the disagreement is the cost-benefit analysis: prostate cancer often can be detected at a curable stage with the measurement of prostate-specific antigen (PSA) and digital rectal examination (DRE). Some studies have found a screening-related decreased mortality from prostate cancer, but screening had no apparent effect on all-cause mortality. Large-scale randomized trials have studied the long-term risks and benefits of prostate cancer screening. A clinician may formulate a search
strategy by completing the PICO model (Table) with a focused clinical question, key search terms, and MEDLINE searches.

Focused Clinical Question

On the basis of the information shown in the Table, the searcher may pose the following clinical question: In men 50 years or older, is prostate cancer screening effective for the early detection of prostate cancer, resulting in a decline in prostate cancer mortality?

Key Search Terms

The key terms on which the searcher should focus are prostate cancer, screening, diagnosis, and prevention. The student should then try those terms in the medical subject headings (MeSH) database to determine to which medical subject headings those terms will map.

Sample MEDLINE Searches

Based on this clearly focused question, the following are two possible search strategies, using MeSH and searching MEDLINE through the PubMed interface: MeSH example 1: “Prostatic Neoplasms” [Majr] Limits: Humans, Practice Guideline, and MeSH example 2: “Prostatic Neoplasms/Prevention and Control” [MeSH Terms] Limits: Humans, Meta-Analysis. It is important for the clinician to remember to apply limits to the search for more targeted results. Because this particular search involves a question of etiology/harm, the ideal study design is the randomized controlled trial. The meta-analysis limit is a good one for this search because the meta-analysis combines the results of a number of randomized controlled trials on the topic, analyzing them as if one large study had been conducted. Another good limit is “practice guideline” because that is a statement by a body of experts on how a condition should be approached or managed. In this instance, the student should conduct searches using both of these limits. When taking an evidence-based approach to searching the literature, the clinician must explore a variety of options. This could be viewed as a question of diagnosis/screening, in which case, the search may look like MeSH example 3: “Prostatic Neoplasms/Diagnosis” [Majr] Limits: Humans, Practice Guideline.

Table. PICO model for a 50-year-old man interested in screening for prostate cancer

PICO                                                                                                   MeSH terms
Population/problem                 50-year-old man who would like to be screened for prostate cancer   Prostatic neoplasms
Intervention                       Screening                                                           Prevention and control as a subheading
Comparison intervention (if any)   None                                                                No search terms needed
Outcome(s)                         Early detection; decline in mortality                               No search terms needed

MeSH indicates medical subject headings; PICO, patient, intervention, comparison(s), and outcome(s).

Critical Appraisal of the Retrieved Articles

Andriole GL, Crawford ED, Grubb RL 3rd, et al. Mortality results from a randomized prostate-cancer screening trial. N Engl J Med 2009;360:1310-1319

This study, a part of the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial sponsored by the National Cancer Institute, recruited 76,693 men between 55 and 74 years old (currently the American Cancer Society recommends that men between 50 and 74 years old be offered a prostate screening examination). Screening in the PLCO study differed in two significant ways from screening in the European Randomized Study of Screening for Prostate Cancer (ERSPC): it included both PSA and DRE, and it was performed annually. Indication for biopsy was a PSA level ≥4.0 ng/mL or a DRE suspicious for cancer.

Schröder FH, Hugosson J, Roobol MJ, et al. Screening and prostate-cancer mortality in a randomized European study. N Engl J Med 2009;360:1320-1328

The primary question to be answered in the ERSPC trial was simple: Does PSA-based screening reduce mortality from prostate cancer? A secondary question asked was whether prostate screening affects overall mortality. The ERSPC is actually a collection of trials from seven countries in Europe with different eligibility criteria, randomization schemes, and plans for screening and follow-up. This was a randomized clinical trial: 182,000 men between 55 and 69 years old were recruited. Screening included PSA testing at 4-year intervals without a DRE. Indications for biopsy differed considerably between countries, ranging from a PSA >3.0 ng/mL for most countries to PSA >10.0 ng/mL in Belgium.

Randomization is a key issue in clinical trials. The purpose is to eliminate selection bias, in other words, to make the treated and comparison groups as similar as possible except for treatment. It is common to sort patients by important factors before randomization. This is called stratification. In addition, within strata, a technique called blocking may be used to ensure an equal balance of treatment assignments throughout the recruitment phase of the study. The ERSPC study stratified on country before randomization but otherwise did not use stratification or blocking. The PLCO study used stratification by center and age and randomized by blocks. An interesting twist on randomization was used in three of the ERSPC countries. In these three countries, randomization was done before obtaining informed consent, counter to what is usually done. This and other factors resulted in a slightly uneven balance of screened and control patients in Finland. Also, it was found that compliance was better in those men randomized after consent. Both studies were designed with adequate statistical power to detect meaningful differences in prostate cancer death rates, with adjustment for anticipated compliance rates.
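
As an illustration of the stratification and blocking described above, the following Python sketch assigns participants to a screening or control arm in permuted blocks within each stratum. It is a minimal sketch under stated assumptions, not the procedure used by either trial: the function name `stratified_block_randomize`, the block size of 4, the 1:1 allocation, and the example enrollment list are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_block_randomize(participants, stratum_of, block_size=4, seed=0):
    """Assign 'screening' or 'control' within each stratum using permuted blocks.

    participants: list of dicts, each with a unique "id" (hypothetical format).
    stratum_of:   function mapping a participant to a hashable stratum key.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for 1:1 allocation")
    rng = random.Random(seed)

    # Stratification: group participants by stratum, keeping enrollment order.
    strata = defaultdict(list)
    for p in participants:
        strata[stratum_of(p)].append(p)

    assignment = {}
    for members in strata.values():
        block = []
        for p in members:
            if not block:
                # Blocking: each new block holds equal numbers of both arms,
                # shuffled so that the order within the block is random.
                half = block_size // 2
                block = ["screening"] * half + ["control"] * half
                rng.shuffle(block)
            assignment[p["id"]] = block.pop()
    return assignment


# Hypothetical enrollment list: 2 centers x 2 age groups, 8 men per stratum.
participants = [
    {"id": f"{center}-{age}-{i}", "center": center, "age_group": age}
    for center in ("A", "B")
    for age in ("55-64", "65-74")
    for i in range(8)
]
arms = stratified_block_randomize(
    participants, stratum_of=lambda p: (p["center"], p["age_group"])
)
# Tally arm sizes per stratum to check the balance.
counts = Counter(
    (p["center"], p["age_group"], arms[p["id"]]) for p in participants
)
```

Because every block contains equal numbers of both arms, the two arms within a stratum can never differ by more than half a block at any point during recruitment, which is the balance property the text attributes to blocking.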

Analysis in both studies was based on “intent to screen”: the analysis was performed as if all of the patients complied with the protocol to which they were randomly assigned. This is analogous to the “intent to treat” principle for treatment trials. This valid EBM principle may be important to the differences between the two studies. Analytical techniques were typical for time-to-event data and both studies used similar methods. Both studies had multiple interim analyses, but only the ERSPC manuscript reported results with and without adjustment for multiple testing.

In the PLCO study, the data and safety monitoring board (DSMB) stopped the study after only 7 years of complete follow-up (some patients had as long as 10 years). No statistical futility measures had been reached, but the DSMB reported a continuing lack of a significant difference in the death rates between screened and unscreened groups and there was information suggesting harm from screening. The manuscript did not present any data about harms attributable to treatment, overdiagnosis, or overtreatment.

Apparently the DSMB concerns about harm from screening did not come from the PLCO study but from other sources. None of the referenced studies of harm from screening were associated with the PLCO study. The ERSPC study was not stopped early and reached a median follow-up time of 8.8 years in the screening group and 9 years in the control group. PSA-based screening reduced the rate of death from prostate cancer by 20%. More than 17,000 prostate biopsies were performed in the 73,000 men who participated. No information was provided for the number of such procedures in the control population, but it is likely that many more biopsies were done in the experimental group. Although more diagnoses were made, there is a risk associated with biopsies and overdiagnosis from screening.10

Attempting to answer the clinical question posed earlier in the article is not simple. It is striking that the two large randomized clinical trials came to different conclusions concerning the primary research question. The PLCO study concluded that prostate cancer mortality was not reduced by screening. The ERSPC study concluded that there was a significant reduction in prostate cancer mortality from screening. Reasons for the disparate conclusions are complex, but it is likely that issues of patient selection and protocol adherence were among the most influential.

In the end, the prostate cancer mortality rate has been declining in the United States since the mid-1990s (rates have fallen approximately 4% per year since 1992). This date coincides with the initiation of widescale PSA screening, implying that the use of this test has played a role in this decline.

The American Cancer Society guidelines now recommend that men 50 or older with at least a 10-year life expectancy should have an opportunity to make an informed decision about prostate screening after receiving information about the risks, benefits, and uncertainties.11 Even so, with these two reports, it remains uncertain whether the decline in mortality is the result of increased screening or other factors.12

Summary

It is our perspective that team teaching of computerized searching of the medical literature may be the most successful approach, yet we fully realize that learners must become able to independently find and critically appraise studies because they will not always have access to librarians and statisticians for assistance in the practice of medicine.13-16 When teaching EBM concepts, it is important to begin with the necessary critical appraisal tools that will enable students to evaluate any study type without the guidance of a professor, librarian, or statistician. The tools are as follows:

Assess the patient
• Take the patient’s history, perform the physical examination, and so forth. Check your assessment with the attending physician before proceeding to the next step.
Ask a clearly formulated clinical question
• Based on your assessment of the patient’s condition, use the PICO model (problem, intervention, comparison intervention, outcome) to write a clearly focused question. Use a tool such as PubMed for Handhelds to guide you through the PICO search.
Acquire the best available evidence
• Determine which databases include the study types that will help you answer the clinical question: a point-of-care tool such as UpToDate or DynaMed or the medical literature (Cochrane Database of Systematic Reviews, PubMed) may be sufficient; however, further research into the patient’s health issues may be needed.
Appraise the evidence
• Choose the critical appraisal form appropriate for the study type, and use it to guide you through the critical appraisal process. Forms are freely available through the Duke University Medical Center Library’s Web site (searching/ebmresources.pdf).
Apply the evidence
• Use your clinical judgment to decide whether the best available evidence can be applied to the patient or population in your clinical setting.

This review has focused on the process. Not every database is applicable to all situations, although the process is the same across databases and disciplines. For example, if a database has a controlled vocabulary, then it is generally best to use it for more targeted results. In this context, whether one is searching for information on a clinical medical question or a legal question, the process is the same. In the legal world, the client’s situation would be assessed, a clearly focused legal question would be asked, the best case law would be acquired using a computerized search in Westlaw or Lexis, the precedents set by the case law found would be evaluated, and those precedents would be applied to the client’s situation. Parallels are easily visible in the practice of clinical medicine. Obviously, one must choose the appropriate database for the question asked.

It takes considerable skill in searching to identify the most relevant articles. It then takes an accomplished physician to use his or her clinical, epidemiologic, and statistical skills to fully understand the available data. We recognize that the clinician often can answer many questions regarding his or her patient using a computerized search of the literature, but this process


