
Transforming Multiple Choice Questions to Effectively Assess Application of Knowledge

STReME Series, August 11, 2011


Brenda Roman, MD, Professor of Psychiatry, BSOM
Paul Koles, MD, Associate Professor of Pathology and Surgery, BSOM
Journey through Lunch

• Power and Purposes of Assessment


• Learning Approaches and Assessment
• Assessment Using Multiple-Choice Questions (MCQs)
• Evaluation of MCQ Quality
• Identification of Flaws in MCQs
• Practice: Find the Flaws
• Practice: Choose the Highest-Quality MCQ
Q1: Of the criteria listed below, which one do you believe is most
important for judging the quality of a multiple choice question (MCQ)?

a. The MCQ assesses knowledge that is considered important by
the writer of the question.
b. The MCQ is directly related to one or more of the course’s
learning objectives.
c. The MCQ asks the student to make a decision that is based on
critical interpretation of data.
d. The MCQ requires the student to appropriately apply
knowledge, not just to recall facts.
Flaws in the previous MCQ

• Options
– Non-homogeneous options: (a) and (b) are about content; (c) and (d) are about format and purpose
– Unnecessarily long
– Only (d) has a contrasting clause
• Stem
– Question can’t be answered if the answer options are covered up
– “judging the quality”: which aspect of quality?
– “do you believe” implies that the best answer is a matter of personal opinion (there is no single best answer)
Power of Assessment
• “Assessment drives student learning. Student
assessment can be designed to foster the
development of elaborated knowledge
structure by making relationships and
understanding—rather than isolated facts—
the objects of assessment.”

Bordage G: Elaborated Knowledge: A Key to Successful
Diagnostic Thinking. Acad Med 69:883-885, 1994
Purposes of Assessment (using written questions)

• Assumption: performance on a sample of questions
allows inferences about the skills of examinees in a
broader domain
• Communicate what instructor views as important
• Motivate students to learn
• Allow objective comparisons among students who
often experience variations in curriculum
• Compensate for instructional gaps by encouraging
students to read broadly and utilize a variety of
educational tools

Case SM, Swanson DB: Constructing Written Test Questions
for the Basic and Clinical Sciences, 3rd edition, NBME 2002
Assumption Refuted:

Physicians who pass licensure
exams may lack some essential
skills for practicing medicine
Learning Behavior
• Learning behavior: “. . .the set of cognitive and metacognitive
processes that learners draw on to acquire knowledge, skills,
and understanding” (Mitchell R; Acad Med 84:918-926, 2009)
• 424 residents from 7 IM residencies completed a cognitive
behavior survey (140 items, 7-point Likert scale)
• Seven learning behavior scales developed from survey data:
memorization, conceptualization, reflection, independent
learning, critical thinking, meaningful learning experience,
attitude toward educational experience
• RESULTS
– Memorization not correlated positively with other 6 scales
– Memorization correlated negatively with critical thinking
– Residents in top 20% on reflection scale also conceptualized, learned
independently, and thought critically more than the bottom 20%
Competent Physicians
• Integrate: “to bring together parts into a whole” (Webster’s)

Components of the competent physician (diagram):
– Knowledge of biomedical and clinical science
– Cultural and ethical competence
– Skills of analytical and critical thinking
– Ability to communicate
Assessment in Medical Education
• Primary purpose: measure student’s competence in
course, clerkship, or residency
• Secondary purpose: develop competent physicians
– Motivate student to integrate new knowledge with
previously mastered knowledge (longitudinal learning)
– Foster critical thinking skills (clinical decision-making)
– Impart direction for future learning (subliminal
messages embedded in assessments)
Learning Approaches and Assessment
• Students adapt learning approaches to context in which
learning occurs
• Three basic approaches identified
– Surface (memorization)
– Deep (comprehension and application)
– Strategic (adapted to meet perceived expectation of faculty)
• Teaching methods influence students’ approach to learning
• Some teaching methods hinder development of deep
learning approach
• Education of competent physicians requires “substantial
changes in teaching, curriculum and, particularly,
assessment . . .”

Newble DI, Entwistle NJ: Learning Styles and Approaches: Implications for Medical
Education. Medical Education 1986; 20:162-175
Can MCQs assess learner’s ability to apply
knowledge by critical thinking and
problem solving?
Authors, method, and results and conclusions of three studies:

• Coderre et al., BMC Medical Education 2004, 4:23
– Method: think-aloud protocols to determine problem-solving strategy used by gastroenterologists and MS4s in answering 8 questions about dysphagia, nausea/vomiting, diarrhea, and elevated liver enzymes
– Result 1: similar clinical reasoning skills used to answer 5-option and extended matching MCQs
– Result 2: stem more important than options for testing clinical reasoning

• Beullens et al., Medical Education 2005, 39:410-417
– Method: 20 final-year med students and 20 final-year IM residents solved extended matching questions (EMQs) aloud
– Result 1: residents and upper 50% in both groups used more “forward” than “backward” reasoning
– Result 2: processes of clinical reasoning can be assessed using EMQs

• Cuddy et al., Acad Med 2004, 79:S43-45
– Method: 27 experts completed a survey about clinical relevance of 150 NBME Step 2 MCQs
– Results: 92% of questions clinically relevant; 85% of content used in clinical practice
* Bloom’s taxonomy of cognitive learning collapsed into 3 levels:
(1) knowledge; (2) comprehension and application; (3) problem solving
MCQs using clinical vignettes in the stem
• “Questions with rich descriptions of clinical
context invite the more complex cognitive
processes that are characteristic of clinical
practice.”
• “Conversely, context-poor questions can test
basic factual knowledge but not its
transferability to real clinical problems.”

Epstein RM: Assessment in Medical Education,
New England Journal of Medicine 2007; 356:387-396.
“There is nothing new under the sun” (Ecclesiastes 1:9)
• “No teaching should be done without a patient for a text.”
(Osler William: On the Need of A Radical Reform in our Methods of Teaching Medical Students; Medical News 82:49-53, 1904.)
• NBME announcement 2010-2011: decision to use only clinical or experimental vignette formats on USMLE Step 1.
Format of Clinical Vignette
• Outline (not all parts necessary)
– Age and gender (“42-year-old woman”)
– Site of care (“comes to the emergency department”)
– Presenting complaint (“because of headache”)
– Duration (“has persisted for 2 days”)
– Past history (may not be relevant)
– Physical findings (“pulsating artery anterior to ear”)
– +/- diagnostic studies; +/- treatments
• Example
– “What area is supplied with blood by the posterior inferior cerebellar
artery?”
– “A 62-year-old man develops left-sided limb ataxia, Horner’s syndrome,
nystagmus, and loss of appreciation of facial pain and temperature
sensations. Which of the following arteries is most likely to be occluded?”
How good is this MCQ?
• Subjective methods to evaluate quality
– Opinion of question author
– Opinions of other content experts
– Opinions of experienced MCQ writers
– Opinions of students (pre-test, post-test)
• Systematic identification of flaws by question
author and trusted consultants
(YOU ARE THE CONSULTANTS!)
• Gold standard: performance of MCQ in an exam,
as demonstrated by difficulty index and
discrimination factor
Gold Standard: Performance of MCQ on an examination

Columns of the item performance report:
Year | N | diff. index | top 25% | bottom 25% | disc. factor | answer | A | B | C | D | E

• Difficulty index: percentage of examinees who answered the question correctly
• Discrimination factor (D.F.): how well the item discriminates between students who
performed highest on the exam (top 25%) and students who performed lowest on the
exam (bottom 25%).
• Higher D.F. suggests item is a more reliable measure of competence
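
As a rough illustration of these two statistics, the sketch below computes a difficulty index and a simple upper-lower discrimination index from per-student results. The function names, the data layout, the sample data, and the "top 25% minus bottom 25%" formula are all assumptions made for illustration. The slides do not state how the discrimination factor in the report is calculated, and the D.F. values reported later in the deck are not simply the difference between the top and bottom percentages, so the report may use a different statistic (for example a point-biserial correlation).

```python
from typing import List, Tuple


def difficulty_index(item_correct: List[bool]) -> float:
    """Percentage of examinees who answered the item correctly."""
    return 100.0 * sum(item_correct) / len(item_correct)


def upper_lower_discrimination(
    results: List[Tuple[float, bool]], fraction: float = 0.25
) -> float:
    """Proportion correct in the top group minus proportion correct in the bottom group.

    results: one (total exam score, answered this item correctly) pair per examinee.
    This upper-lower index is an assumed stand-in, not the formula from the slides.
    """
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))  # size of each comparison group
    p_top = sum(correct for _, correct in ranked[:k]) / k
    p_bottom = sum(correct for _, correct in ranked[-k:]) / k
    return p_top - p_bottom


# Hypothetical data: 8 examinees, (exam score, correct on this item)
results = [
    (95, True), (90, True), (85, True), (80, True),
    (70, True), (65, False), (60, True), (50, False),
]
di = difficulty_index([correct for _, correct in results])
disc = upper_lower_discrimination(results)
print(di, disc)
```

With these eight made-up examinees, `di` is 75.0 and `disc` is 0.5: both students in the top quarter answered correctly, and one of the two in the bottom quarter did.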
Systematic Identification of Flaws in MCQs

A) 5 common flaws in stems
B) 8 common flaws in answer options
Systematic Identification of Flaws Pre-Exam: MCQ Stems
A1. Stem does not end with a question (lead-in) that can be
answered by covering up answer options.

A 39-year-old female is seen for an annual exam. She had
been on oral contraceptive pills as a teenager but
discontinued that form of contraception over 15 years ago.
Because of her contraceptive practice she has . . .

According to the best scientific evidence
available to date, HIV-1 came from . . .

Prostate cancer is best treated . . .

Corticosteroid therapy . . .
Systematic Identification of Flaws Pre-Exam: MCQ Stems
A2. Stem is unnecessarily complicated—too long, lots of irrelevant
information.
A 48-year-old woman presents to the physician with lower back
pain. She states that she has had the pain for about 2 weeks and
that it has become steadily more severe. An x-ray film shows a lytic
bone lesion in her lumbar spine. Review of systems reveals the
recent onset of mild headaches, nausea, and weakness. Her CBC
shows a normocytic anemia, and her erythrocyte sedimentation rate
is elevated. Urinalysis shows heavy proteinuria, and a serum protein
electrophoresis shows a monoclonal peak of IgG. Which of the
following is responsible for this patient’s spinal lesions?
a. Bence-Jones protein
b. lymphoplasmacytoid proliferation
c. osteoblast activating factor
d. osteoclast activating factor
e. primary amyloidosis
Systematic Identification of Flaws Pre-Exam: MCQ Stems

A3. Stem contains vague terms that invite
a wide range of interpretations.

A B-cell-deficient toddler recovers as well as a normal child does
to infection with the chickenpox virus. This child's immune
system is capable of developing . . .
Systematic Identification of Flaws Pre-Exam: MCQ Stems

A4. Stem contains abbreviations that are not clearly understood
by all examinees.

A 32yo WF in her 1st trimester of pregnancy experiences GERD
3-4x/week and c/o heartburn. She has not responded to MOM.
Which medication will be best to treat this patient?
Systematic Identification of Flaws Pre-Exam: MCQ Stems

A5. Stem contains words about quantity that are difficult or
impossible to quantify: probably, usually, infrequently,
sometimes, in most cases, in few cases, etc.

In most cases, men who develop prostate cancer usually
have limited dietary intake of which of the following food
groups?
Perception is unpredictable
Systematic Identification of Flaws
Pre-Exam: MCQ Answer Options
B1. One or more options do not follow grammatically from the stem.

Which of the following behaviors is most frequently
observed in adolescents who smoke cigarettes?

a. intelligence quotient below 80
b. overeating
c. body mass index < 25
d. disrespect for authority
e. alcohol abuse
Systematic Identification of Flaws
Pre-Exam: MCQ Answer Options
B2. Options are heterogeneous in language or domains.

Which is necessary for the development of Burkitt lymphoma?

a. creation, by translocation, of a bcr/abl fusion gene in B-lymphocytes
b. deletion of p53 tumor suppressor gene in B-lymphocytes
c. infection of B-lymphocytes by Epstein-Barr virus
d. over-expression of the c-myc oncogene in B-lymphocytes
e. trisomy of chromosome 8
Systematic Identification of Flaws
Pre-Exam: MCQ Answer Options
B3. Option includes absolute terms that make it unlikely to
be correct: “always”, “never”

In patients with advanced dementia due to Alzheimer disease, the
memory defect

a. can be treated adequately with phosphatidylcholine (lecithin).
b. could be a sequela of early parkinsonism.
c. is never seen in patients with neurofibrillary tangles in the
cerebral cortex.
d. is never severe.
e. possibly involves the cholinergic system.
Systematic Identification of Flaws
Pre-Exam: MCQ Answer Options
B4. Correct option is longer, more specific, or more complete than
other options (“sore thumb”).

Secondary gain is

a. synonymous with malingering.
b. a frequent problem in obsessive-compulsive disorder.
c. a complication of a variety of illnesses and tends to prolong
many of them.
d. never seen in organic brain damage.
Systematic Identification of Flaws
Pre-Exam: MCQ Answer Options
B5. Correct option contains the most elements in common with
other options (“convergence”).

Intramedullary destruction of red blood cells in beta-thalassemia
is best explained by which mechanism?

a. beta-4 tetramer oxidation and precipitation
b. excessive iron accumulation in macrophages
c. increased formation of alpha chain aggregates
d. increased formation of Hb H (beta 4)
e. increased formation of Hb F (alpha 2 gamma 2)
Systematic Identification of Flaws Pre-Exam:
MCQ Answer Options
B6. Options are long, complicated, or composed of 2-3 parts,
imposing irrelevant difficulty.

The figure below shows the dose-response curves for four different
derivatives of a muscarinic receptor agonist. Each derivative acts by
binding to the same site on the muscarinic receptor. The Heptyl
derivative

a. has a lower binding affinity for the receptor than does the Hexyl
derivative.
b. has a lower intrinsic activity than does the Hexyl derivative
because it has a lower receptor affinity.
c. is a full agonist when compared with the Octyl derivative.
d. is more potent than the Hexyl derivative.
e. may act as a mixed agonist-antagonist if it has a higher receptor
affinity than the Hexyl derivative.
Systematic Identification of Flaws Pre-Exam: MCQ
Answer Options
B7. Options contain words about quantity that are difficult or
impossible to quantify: probably, usually, infrequently, sometimes,
in most cases, in few cases, etc.

Severe obesity in early adolescence

a. usually responds dramatically to dietary regimens.
b. often is related to endocrine disorders.
c. has a 75% chance of resolving spontaneously.
d. shows a poor prognosis.
e. usually responds to pharmacotherapy and intensive psychotherapy.
Systematic Identification of Flaws
Pre-Exam: MCQ Answer Options
B8. “none of the above” or “all of the above” is used as an option.

Which of the following cities is closest to New York City?

a. Boston
b. Chicago
c. Dallas
d. Los Angeles
e. None of the above
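
A few of the wording flaws listed above (B3, B7, B8) can be pre-screened mechanically before human review. The sketch below is a minimal illustration under stated assumptions: the helper name `screen_options`, the warning format, and the whole-word matching rule are hypothetical, and the word lists contain only the terms named on the slides. It does not detect flaws that need judgment, such as heterogeneous options or convergence.

```python
import re
from typing import List

# Word lists taken from flaws B3, B7 (same list as A5), and B8 on the slides.
ABSOLUTE_TERMS = ("always", "never")
VAGUE_QUANTITY = ("probably", "usually", "infrequently", "sometimes",
                  "in most cases", "in few cases")
CATCH_ALL = ("none of the above", "all of the above")


def _contains(text: str, phrase: str) -> bool:
    """Case-insensitive whole-word (or whole-phrase) match."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text.lower()) is not None


def screen_options(options: List[str]) -> List[str]:
    """Return one warning per flagged term in each option (hypothetical helper)."""
    warnings = []
    for label, text in zip("abcdefgh", options):
        for term in ABSOLUTE_TERMS:
            if _contains(text, term):
                warnings.append(f"{label}: absolute term '{term}' (B3)")
        for term in VAGUE_QUANTITY:
            if _contains(text, term):
                warnings.append(f"{label}: vague quantity '{term}' (B7)")
        for term in CATCH_ALL:
            if _contains(text, term):
                warnings.append(f"{label}: '{term}' used as an option (B8)")
    return warnings


# Options from the B8 example on the slide
b8_example = ["Boston", "Chicago", "Dallas", "Los Angeles", "None of the above"]
b8_warnings = screen_options(b8_example)
print(b8_warnings)
```

A screen like this only narrows the search; the flagged items still need review by the question author and consultants.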
Identify those flaws: Practice MCQ 1
P1) Which of the following applies to pseudogout?

a. It occurs frequently in women.
b. It is seldom associated with acute pain in a joint.
c. It may be associated with a finding of chondrocalcinosis.
d. It is clearly hereditary in most cases.
e. It responds well to treatment with allopurinol.

P1) Of 13 flaws listed in your worksheet, how many flaws are present
in this MCQ?

a. 1
b. 2
c. 3
d. 4
e. 5
Identify those flaws: Practice MCQ 2
P2) A 17-year-old male presents with a two-year history of "severe" acne. He has
previously been treated with numerous topical treatments and several different
oral antibiotics. Multiple nodules and cysts are present diffusely on the face,
shoulders, back, and upper chest. He has multiple depressed scars on the cheeks.
He is administered an oral agent which leads to significant improvement in his
condition. This agent works by

a. disruption of bacterial cell membranes.
b. exfoliation.
c. increased sebum production.
d. reduction of androgen levels.
e. suppression of sebum production.

P2) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ?

a. 1
b. 2
c. 3
d. 4
e. 5
Identify those flaws: Practice MCQ 3
P3) A 25-year-old woman consults her physician because she has decided to use
oral contraceptives. After the physician asks about history of thrombophlebitis,
pulmonary embolus, and smoking (all negative), he proceeds to physical exam:
Vital signs: within normal limits Height 4'0" Weight 85 lbs. HEENT: large head
with prominent, rounded forehead Heart, Lungs, Abdomen: within normal limits
Extremities: short arms and legs (compared to trunk length). He writes a
prescription for oral contraceptives, but also records her most likely physical
diagnosis in the chart. Which molecular abnormality best explains her diagnosis?
a. constitutive activation of fibroblast growth receptor 2
b. constitutive activation of fibroblast growth receptor 3
c. expansion mutation in HOXD13 with altered length of transcription factor
d. mutation in COL1A1 with deficient synthesis of type 1 collagen
e. mutation in COL2A1 with deficient synthesis of type 2 collagen

P3) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ?
a. 1
b. 2
c. 3
d. 4
e. 5
High-Quality MCQ: in principle
“A high-quality multiple-choice question is one that
assesses content considered to be important, is free of
flaws in both stem and options, and effectively identifies
those who can use their knowledge to skillfully assess data
and make decisions.”

(modified from Case SM, Swanson DB: Constructing Written Test Questions for the
Basic and Clinical Sciences, National Board of Medical Examiners, 2002)
Statistical Definition of High-Quality MCQs:
ones that perform well on an exam, as judged
by difficulty index and discrimination factor

Columns of the item performance report:
Year | N | diff. index | top 25% | bottom 25% | disc. factor | answer | A | B | C | D | E

• Difficulty index: percentage of examinees who answered the question correctly
• Discrimination factor (D.F.): how well the item discriminates between students who
performed highest on the exam (top 25%) and students who performed lowest on the
exam (bottom 25%).
• Higher D.F. suggests item is a more reliable measure of competence.
Mastery MCQs
The data below show performance of 3 MCQs used in a final course
exam for BSOM year 2 students.
• All three assessed the same content domain.
• All three were classified as “mastery” questions
(answered correctly by ≥ 90% of students)

QM1) Based on the performance data shown below, which one is the
highest-quality MCQ?

Option   n     D.I.   top 25%   bottom 25%   D.F.    A    B    C    D    E
A)       101   90     96        86           0.27    0    10   91   0    0
B)       105   90     96        67           0.41    1    95   3    4    2
C)       105   90     93        81           0.22    2    7    94   1    1
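
The D.I. column can be checked against the A to E response counts in the table above. The sketch below is an illustration only: the function name and data layout are hypothetical, and because the table does not show the answer key, it assumes the most frequently chosen option is the keyed answer.

```python
from typing import Dict


def difficulty_from_counts(counts: Dict[str, int]) -> int:
    """Percent of examinees choosing the most-selected option, rounded.

    Assumption: the most-selected option is the keyed (correct) answer.
    """
    total = sum(counts.values())
    return round(100 * max(counts.values()) / total)


# Response counts copied from the QM1 table (rows A, B, C)
qm1 = {
    "A": {"A": 0, "B": 10, "C": 91, "D": 0, "E": 0},
    "B": {"A": 1, "B": 95, "C": 3, "D": 4, "E": 2},
    "C": {"A": 2, "B": 7, "C": 94, "D": 1, "E": 1},
}

for name, counts in qm1.items():
    # Prints the row label, the number of examinees, and the recomputed D.I.
    print(name, sum(counts.values()), difficulty_from_counts(counts))
```

Under that assumption, each row's counts sum to the reported n (101, 105, 105) and each recomputed difficulty index rounds to the reported 90.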
Intermediate Difficulty MCQs
The data below show performance of 4 MCQs used in a final
course exam for BSOM year 2 students.
All four assessed the same content domain.
All four were classified as “intermediate difficulty” questions.
(answered correctly by 70.0 – 89.9% of students)

QM2) Based on the performance data shown below, which one is
the highest-quality MCQ?

Option   n     D.I.   top 25%   bottom 25%   D.F.    A    B    C    D    E
A)       105   81     100       59           0.43    0    19   85   1    0
B)       93    81     96        63           0.31    14   4    75   0    0
C)       93    70     96        54           0.40    13   65   10   1    4
D)       93    75     92        67           0.21    1    1    19   2    70
Challenging MCQs
The data below show performance of 3 MCQs used in a final
course exam for BSOM year 2 students.
• All 3 assessed the same content domain.
• All 3 were classified as “challenging” questions.
(answered correctly by < 70% of students)

QM3) Based on the performance data shown below, which one is
the highest-quality MCQ?

Option   n     D.I.   top 25%   bottom 25%   D.F.    A    B    C    D    E
A)       104   64     74        41           0.32    12   67   6    12   7
B)       93    57     84        42           0.26    0    22   17   1    53
C)       101   69     81        55           0.33    2    0    3    70   26
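
The three difficulty bands used in these exercises (mastery at 90% or more correct, intermediate difficulty at 70.0 to 89.9%, challenging below 70%) can be written as a small classifier. This is a minimal sketch; the function name is hypothetical and the cut-offs are taken directly from the slide text.

```python
def difficulty_band(percent_correct: float) -> str:
    """Map a difficulty index (percent correct) to the band names used on the slides."""
    if percent_correct >= 90:
        return "mastery"
    if percent_correct >= 70:
        return "intermediate difficulty"
    return "challenging"


# D.I. values taken from the first row of each of the three tables
for di in (90, 81, 64):
    print(di, difficulty_band(di))
```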
