
Constructing Test Items

Lecture delivered by Mr Stephenson Grayson

At UWI, St Augustine
August 19, 2014

Objective
1. To help teachers develop competence
in the basic principles and practices of
good item writing

The Language of Assessment


Testing makes use of a set of test items, usually put together according to
specifications. Each test item consists of a well-defined task that requires a
response from which it may be inferred that a certain skill (or knowledge or
ability or attitude) does or does not exist within the testee.
Assessment is a systematic process of collecting information by:
- teachers about their students
- teachers about their teaching
- students about themselves
Evaluation is the process of making value judgments based on information
collected systematically through the assessment process and also estimating
the quality of an object.

PLANNING ASSESSMENTS
1. Determine the purpose of the test(s):
placement/formative/diagnostic/summative
2. Develop the test specifications:
instructional objectives vs content area
3. Select the appropriate item types:
objective items and/or essay items
4. Prepare the relevant test items:
both types have limitations and advantages

Item Types
A variety of item types can and should be
used for accurate, valid, effective written
assessment.
This lecture focuses on the three item types
which are most widely used by teachers:
multiple choice, restricted-response and
extended essay questions.
You need to have your items reviewed by
trained colleagues in order to improve the
quality of the items.

Multiple-Choice Questions
A multiple-choice question consists of the
STEM (statement/question/stimulus)
OPTIONS (e.g. four possible responses)
KEY (one correct response)
DISTRACTORS (three incorrect responses)

Types of Multiple-Choice Questions

These include:
Direct questions
- Correct answer
- Best answer
Incomplete statements
Multiple response variety
Combined response variety
Matching variety

Direct Questions: Correct Answer Variety
Which of the following countries is an
island?
A. Belize
B. Guyana
C. St Lucia*
D. Suriname

Direct Questions: Correct Answer Variety
What is the answer to 12 x 12?
A. 121
B. 144*
C. 212
D. 1212

Direct Questions: Best Answer Variety
What is the basic purpose of the CSME?
A. To expand the economic space in
the Caribbean*
B. To facilitate movement of artists and
entertainers
C. To expedite a Caribbean political union
D. To introduce a common currency

Incomplete Statement
Cheese is a normal good because
A. it is bought by many persons
B. less of it is bought when its price
rises
C. its price elasticity is negative
D. demand for it increases when
income increases*

Incomplete Statement
An essential feature of a capital good
is that it
A. is consumed directly by the
consumer
B. is used to produce other goods*
C. must be purchased through a loan
D. must be replaced frequently

Incomplete Statement
If x/4 + x/16 = 10, then x equals
A. 16
B. 24
C. 32*
D. 48

Multiple Response Variety


Who among the following persons are considered to be
unemployed?
I. Steve, who is not looking for a job
II. Mary, a part-time student, who is looking for a job
III. Harry, who is out of work until the next crop
season begins
A. I and II only
B. I and III only
C. II and III only*
D. I, II and III

Combined Response Variety


In what order should these sentences be written in
order to form a coherent paragraph?
I Suddenly he realized that he was wearing no trousers.
II It was early one bright, sunny Sunday morning.
III Before he had reached very far he saw that everyone
was pointing at him and laughing.
IV The day was so beautiful that Simon decided to jog to
the centre of town before breakfast.
(A) I, II, III, IV
(B) I, III, IV, II
(C)* II, IV, III, I
(D) IV, II, III, I

Matching Variety
Items 1-3 refer to the following, which are all elements of an
electrical circuit.
(A) Resistor
(B) Transformer
(C) Condenser (capacitor)
(D) Storage battery
In answering Items 1-3, match each item with one of the
options above. You may choose any of the options more than
once, once, or not at all.
1. Requires an alternating magnetic field for its operation
2. Turns chemical potential energy into electrical energy
3. Changes electrical power from high to low voltage

Advantages of Multiple-Choice
Questions
Enable sampling of a wide content area
within a given testing time
Flexible: several questions may be based
on the same stimulus
Versatile: objectives can be measured
across the taxonomic range
Pre-testing of items allows for prediction
and control of test difficulty and reliability

Advantages of Multiple-Choice
Questions (cont'd)
Writing time is eliminated and time used
for thinking instead
Scoring is easy and reliable as keys are
determined prior to scoring
Item analysis provides the opportunity to
examine candidates' responses to each
item and so identify areas of weakness
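
The item analysis mentioned above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the function name `item_analysis`, the sample responses, and the choice to compare the top and bottom halves of the group are all assumptions made for the example (many texts compare smaller upper and lower groups instead).

```python
from typing import Dict, List


def item_analysis(responses: List[List[str]], answer_key: List[str]) -> List[Dict[str, float]]:
    """Compute a difficulty and a discrimination index for each item.

    responses[c][i] is the option chosen by candidate c on item i.
    Difficulty is the proportion of all candidates answering correctly.
    Discrimination is the proportion correct in the upper-scoring half
    minus the proportion correct in the lower-scoring half.
    """
    # Total score per candidate
    totals = [sum(r == k for r, k in zip(row, answer_key)) for row in responses]
    # Rank candidates from highest to lowest total score
    order = sorted(range(len(responses)), key=lambda c: totals[c], reverse=True)
    half = len(order) // 2
    upper, lower = order[:half], order[-half:]

    stats = []
    for i, k in enumerate(answer_key):
        p_all = sum(row[i] == k for row in responses) / len(responses)
        p_upper = sum(responses[c][i] == k for c in upper) / half
        p_lower = sum(responses[c][i] == k for c in lower) / half
        stats.append({"difficulty": p_all, "discrimination": p_upper - p_lower})
    return stats


# Hypothetical data: four candidates, three items
answer_key = ["A", "B", "C"]
responses = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "C", "D"],
    ["B", "C", "C"],
]
stats = item_analysis(responses, answer_key)
for number, s in enumerate(stats, start=1):
    print(number, s)
```

An item with a discrimination index near zero (item 3 in this made-up data) is answered equally well by strong and weak candidates and is a candidate for review.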

Item-Writing Techniques - Guidelines for Writing the STEM
Present a single, clearly formulated
question, statement or scenario
Use a simple and direct sentence structure
Avoid ambiguous words or phrases
Avoid excessive verbiage
State the stem in positive form wherever
possible and never use double negatives*
If negative words (not, least, etc.) are used
they should be capitalized*

Example
Poor: All of the following are NOT examples of direct
taxation EXCEPT
A. value added tax
B. income tax*
C. import duties
D. excise duties
Better: Which of the following is an example of direct
taxation?
A. Value added tax
B. Income tax*
C. Import duties
D. Excise duties

Guidelines for Writing the STEM (cont'd)
Avoid use of personal pronouns (you, we)
and especially gender-specific pronouns
(he, she)
Include any words in the stem that would
have to be repeated in each option*
Word the stem so that it will lead to only
one CORRECT or BEST answer

Example
POOR:
If a retailer returns goods to a supplier, how should he
record this in his books?
I. He should enter transaction in Returns Inwards Book.
II. He should enter transaction in Returns Outward Book.
III. He should post to debit side of Supplier's Account.
IV. He should post to credit side of Supplier's Account.
A. I and II only
B. I and IV only
C. II and III only
D. II and IV only

Example
BETTER:
If a retailer returns goods to a supplier, how should
this be recorded?
I. Enter transaction in Returns Inwards Book
II. Enter transaction in Returns Outward Book
III. Post to debit side of Supplier's Account
IV. Post to credit side of Supplier's Account
A. I and II only
B. I and IV only
C. II and III only
D. II and IV only

Guidelines for Writing the STEM (cont'd)
Provide as much of the information as
possible in the stem*
Avoid phrases like "what would you do?" or
"what do you think?"
Avoid situations/content that might be
outdated at the time of the exam
Ensure that the stem does not contain any
clues to the correct answer

Example
POOR
What action should be taken by a trader who sent
an invoice of $68 instead of $65?
A. The trader must send the company a
promissory note.
B. A letter must be sent to the company advising
of the error.
C. The trader must send a debit note to the
company.
D. The company must credit the trader with the
difference.

Example
BETTER:
A trader sent Mr Jones an invoice for $68
instead of $65. Which note should the
trader subsequently send to Mr Jones?
A. Promissory note
B. Advice note
C. Debit note
D. Credit note

Guidelines for Writing the OPTIONS
Options must follow logically and
grammatically from the stem
They must be arranged in a logical,
chronological or systematic order
They must be parallel in:
- length*
- grammatical structure
- content & terminology*

Example
POOR
Which of the following is the BEST
definition of money?
A. Coins minted by the country
B. Fiat money printed by the government
C. Any commodity widely used as a means
of payment for goods and services*
D. Current account deposits

Example
BETTER
Which of the following is the BEST
definition of money?
A. All coins that are minted by the country
B. Fiat money printed by the government
C. Any item widely used as a means of
exchange
D. Current account deposits in a bank

Guidelines for Writing the OPTIONS (cont'd)
Use professionally acceptable or well-known technical terms in options
(distractors as well as the key)
Avoid use of ALL OF THE ABOVE &
NONE OF THE ABOVE
Include ONLY ONE correct (or best)
option (i.e., the key)

Guidelines for Writing the DISTRACTORS
Distractors must be plausible
Should be parallel with the correct answer
in length, structure, vocabulary
Should provide no unintended clues to the
correct answer
Should not be too similar in meaning to
the correct answer
Should avoid unnecessarily technical
language*

Example
POOR
Which of the following is a statement of the
Yerkes-Dodson Law?
A. The rate of acquisition of habit strength is a
nonlinear decreasing function of delay of
reinforcement.
B. The relation between intensity of noxious
stimulation and acquisition of habit strength is
U-shaped.
C. The relation between stimulus intensity and
maximum mean physiologic response is
positive and linear.

Example
BETTER
Which of the following is a statement of the
Yerkes-Dodson Law?
A. Learning occurs more slowly when rewards
are delayed than when they are immediate.
B. Learning is greatest when pain or stress is
intermediate rather than absent or intense.
C. The more intense a stimulus, the greater the
bodily response.

Guidelines for Writing the DISTRACTORS (cont'd)
Distractors should not overlap with each
other
Should avoid humorous/ ridiculous
responses
Should help to discriminate between
examinees who have mastered the
knowledge and those who have not

Guidelines for Writing the DISTRACTORS (cont'd)
DO
Use common misconceptions and common
errors made by students in class
Use statements that are
- true but do not satisfy the requirements of the
problem
- too broad or too narrow for the requirements of
the problem
- carefully worded but incorrect, and may seem
plausible to the uninformed

Guidelines for Writing the DISTRACTORS (cont'd)
DON'T
Make the distractors vague or ambiguous
while the key is clear and concise*
Use absolutes (no, all, never, always) in
distractors but not in the key

Can you spot the flaw in this question?
POOR
Which of the following is LEAST likely to
result from growth in an organization?
A. Increased capital investment
B. More division of labour
C. Greater rapport
D. Increased communication

Example
BETTER:
Which of the following is LEAST likely to
result from growth in an organization?
A. Increased capital investment
B. More division of labour
C. Greater productivity
D. Increased communication*

Guidelines for Writing the KEY


Key should be either the ONLY correct answer
or clearly the BEST or MOST APPROPRIATE
answer to the informed, well-prepared candidate
Key should be correct and defensible on
professional and (if necessary) legal grounds
Avoid repeating key words already used in the
stem
Ensure that key is not obvious because of its
length, grammar or content

Guidelines for Writing the KEY contd


A GOOD KEY . . .
Is correct and defensible on professional and (if
necessary) legal grounds
Is either the ONLY correct answer or is clearly the
BEST or MOST APPROPRIATE answer to the
informed, well prepared candidate
Avoids repeating key words already used in the stem

Guidelines for Writing the KEY (cont'd)
Poor Example:
Twenty Thousand Leagues under the Sea is considered
to be
(A) an adventure story
(B) a science-fiction story
(C) a historical novel
(D) an autobiography
This item may have more than one correct answer.
When the book was written it was probably viewed as
science fiction, but it is also an adventure story.

Guidelines for Writing the KEY (cont'd)
Better Example:
Twenty Thousand Leagues under the Sea is considered to
be
(A) an adventure story
(B) a tragedy
(C) a historical novel
(D) an autobiography
A major task in writing multiple-choice items is
reducing the ability of uninformed candidates to guess
correctly by providing NO CLUES to the correct
answer.

Testing a Range of Cognitive Skills

Can MC questions really test a range of
cognitive skills?

Evaluation
Synthesis
Analysis
Application
Comprehension
Knowledge

Cognitive Skills
Evaluation: making judgements
Synthesis: putting elements together
Analysis: identifying parts and their
relationships with each other
Application: applying principles or
generalizations
Comprehension: interpreting information
Knowledge: recalling facts, concepts,
theories

Levels of Cognitive Skills Tested by MC Questions
Knowledge
- recalling facts, concepts; defining terms,
identifying characteristics
- Define, select, label, list, name, state
Comprehension
- describing principles; interpreting information
- Convert, describe, rewrite, estimate,
explain
Levels of Cognitive Skills Tested by MC Questions
Application
- applying learned principles to new
situations
- Show, apply, predict, use, calculate, solve
Analysis
- identifying parts and their relationships with
each other
- Differentiate, distinguish, infer, compare,
illustrate, contrast

Levels of Cognitive Skills Tested by MC Questions
Synthesis
- putting elements together to form a whole
- Compile, create, summarize, compose,
design, rearrange
Evaluation
- making judgements
- Appraise, compare, justify, contrast,
conclude, assess, support

Knowledge/Comprehension
Items
Which of the following is the correct
definition of depreciation?
The MAJOR factor which led to the 1837
riots was ...
Which of the following is a direct factory
expense?
Which of the following nutrients are added
to flour to supplement its nutritive value?
When a Trial Balance does NOT agree,
the difference is placed in a ...

Application/Analysis Items
Calculate the company's net profit from the
following information.
Which of the following BEST explains the
term ...
... The MOST significant economic
impact of this action would be ...
The manager of a company decides to ...
This action reflects the principle that ...
Which of the following is the reason for ...

App / Ana / Syn / Eval Items
Written or visual stimulus material
presenting a situation that the candidate
may encounter in the real world
followed by a series of questions
designed to tap the candidate's ability to
apply theories and principles, analyse
information, draw conclusions and make
judgements

Application/Analysis Items
<<Brief (2 or 3 sentences) description of
a scenario>>
In the situation above, the owner of a
small business / an accountant / a
consumer will MOST likely
A. <do X> because <Y>
B.
C.
D.

Writing Higher-Order Questions
Provide data in a table, chart, graph, or a
situation/scenario and require that
candidates analyse or evaluate
The MOST likely conclusion, based on
the data above, is that:
In the situation described above, the MOST
appropriate action for the employer to take is to:

THE EFFECTS OF GUESSING


The greater the ability of the uninformed
candidate to guess correctly, the less valid
and reliable the examination is.
The less valid and reliable the examination
is, the less value is placed on the
certification.
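
The size of the guessing effect can be estimated with a short calculation. The sketch below assumes purely blind, independent guessing with every option equally attractive; the function names and the test lengths used are illustrative assumptions, not figures from the lecture.

```python
from math import comb


def expected_chance_score(n_items: int, n_options: int) -> float:
    """Expected number of items answered correctly by blind guessing."""
    return n_items / n_options


def prob_at_least(n_items: int, n_options: int, cut_score: int) -> float:
    """Binomial probability of reaching cut_score or more by blind guessing."""
    p = 1 / n_options
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(cut_score, n_items + 1)
    )


# Hypothetical 60-item test with four options per item
print(expected_chance_score(60, 4))

# If an unintended clue lets candidates discard one distractor per item,
# each item effectively has three options and the chance score rises
print(expected_chance_score(60, 3))

# Chance of getting at least 1 of 2 four-option items right by guessing
print(prob_at_least(2, 4, 1))
```

Each implausible distractor or unintended clue lowers the effective number of options, which raises the score an uninformed candidate can expect.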

THE EFFECTS OF GUESSING contd


WHAT CLUES HELP CANDIDATES TO
GUESS CORRECTLY?
The type of format chosen for the question
The format of the question asked (STEM)
The way the correct answer (KEY) is written
The way the incorrect answers
(DISTRACTORS) are written

THE EFFECTS OF GUESSING contd


UNINTENDED CLUES
DISTRACTORS                        KEYS
Unqualified statements             Qualified statements
Negative statements                Positive statements
Specific instances                 General rule
Means to an end                    End product
Layman language                    Technical definition
A particular frame of reference    Another frame of reference

THE EFFECTS OF GUESSING contd


Natural selection, as described in Darwin's theory of
evolution, assumed
(A) a non-changing population of organisms
(B) changes from generation to generation based upon
mutation
(C) environmental stimuli that resulted in changes of
body structure in successive generations of offspring
(D)* differential survival value of random differences in
offspring
Note that the key (D) is stated in terms that are more
careful and qualified than the distractors.

THE EFFECTS OF GUESSING contd


Which of the following is NOT likely to result from
slash-and-burn agriculture on steep slopes?
(A) Erosion of the slopes
(B) Flooding of nearby lowlands
(C)* Deeper, more fertile soil on the slopes
(D) Destruction of natural vegetation
Option (C) stands out because it is the only one
that describes a positive event.

CONSTRUCTED-RESPONSE
QUESTIONS
Essay-type questions may be worded so that
they require very short answers (for example,
a single word or sentence: restricted
response [limited by the way the question is
phrased and by the scope of the area being
assessed] / short answer [predetermined
response]) or several pages of writing.

SHORT-ANSWER QUESTIONS
Require the student to supply a word, phrase, or sentence in
response to a direct question or incomplete statement
Are best suited for relatively simple cognitive tasks
Require students to produce rather than select the
correct answer
Facilitate greater content coverage than extended essays
Are more difficult to score than multiple choice
(difficulty increases with length of response)
Are usually objectively scored (right or wrong)

WRITING SHORT-ANSWER QUESTIONS
Use direct questions/commands rather than incomplete
statements
- This enhances clarity, and is more likely to yield one
correct answer
Structure the question so that the elicited response is
concise
- Short answers should be short.
Specify the units in which numerical answers are to
be expressed

SHORT-ANSWER QUESTIONS
Poor Example:
An animal that walks on two feet is ____________________.
Better Example:
An animal that walks on two feet may be technically classified
as ________________. (a biped)
Poor Example:
The country which shares a border with Haiti is ____________ .
Better Example:
Which country shares a border with Haiti? ________________ .

WRITING SHORT-ANSWER
QUESTIONS [contd]
Place blanks in the margins for direct questions or near the end
for incomplete statements
- Blanks placed too early tend to confuse.
Poor Example:
The ________ is the government body which, based on the
U.S. Constitution, must ratify all U.S. treaties with foreign
nations.
Better Example:
The government body which, based on the U.S. Constitution,
must ratify all U.S. treaties with foreign nations is the _____.

WRITING SHORT-ANSWER
QUESTIONS [contd]
For incomplete statements, use one or at most two blanks
- Avoid "Swiss-cheese" items, for example:
After a series of major conflicts with natural disasters, in the year ____,
the explorers ____ and ____, accompanied by their ____, discovered
____.
Make all blanks of equal length for all items
- A common flaw in bad items is that the length of spaces suggests
how long the expected response is.
- Ensure, however, that sufficient space is provided to accommodate
variations in handwriting size.

EXAMPLES OF RESTRICTED
RESPONSE QUESTIONS

Your school is mounting an exhibition to feature local foods. Develop a plan for
the exhibition by completing the following tasks:

a. Name THREE one-pot dishes that are popular in your country.
_____________________ _____________________ _____________________
(3 marks)
b. Select ONE of the dishes you named at (a) above, and name THREE nutrients
that are found in the food in that dish.
Name of major food: _____________________
Nutrients: _____________________, _____________________, _____________________
(3 marks)

EXAMPLES OF RESTRICTED
RESPONSE QUESTIONS
c. Give TWO health consequences of having a deficiency of any one of these identified
nutrients.
___________________________________________________________________
___________________________________________________________________
(3 marks)
d. List THREE preparation practices that can be used to ensure that maximum benefits are
derived from any one of the nutrients identified in (b) above.
____________________________________________________________________
____________________________________________________________________
____________________________________________________________________
(6 marks)

ESSAY QUESTIONS
ESSAY QUESTIONS ARE . . .
The most commonly used format in classroom testing
Best used to test ability to select, organize and present
information in a logical, coherent manner
Not as easy to write as is generally thought
Difficult to score reliably
Used to elicit a limited number of responses possible in the
time typically available

WRITING GOOD ESSAY QUESTIONS


Test important aspects of the learning target
Match questions to test-specifications (for example, in
terms of required performance, emphasis, and number
of points)
Require application of knowledge in novel situation(s)
Focus task to clearly indicate the format and scope
(length and depth) of the required response.

WRITING GOOD ESSAY QUESTIONS (cont'd)
Ensure that the level of complexity is appropriate for the
educational maturity of the students
Structure the question to elicit more than recall of facts,
ideas or opinions of others.
Word the tasks in a way that leads all students to
interpret the assignment in the intended way
Word tasks to make clear the required purpose of the
response, the time to be spent on it and the basis on which
it will be evaluated.

ESSAY QUESTIONS
Poor Example:
Analyse the defeat of the British by the Colonials by listing the FOUR factors discussed
in class that led to the defeat.
(20 marks)

Better Example:
(a) List FOUR factors that led to the Colonial victory over the British in the War
of Independence.
(4 marks)
(b) For EACH factor listed, write a short explanation of how that factor
helped the Colonists defeat the British.
(4 marks)

Better Example contd


(c) Choose ONE of these factors that, in your opinion,
the British could have changed or overcome.
Explain what actions the British could have taken to
change or overcome this factor.
(4 marks)
(d) What probably would have happened in the war if
the British had taken the actions you stated?
Suggest why this would have happened.
(8 marks)

SCORING ESSAYS
Analytic (point-score) vs holistic (global)
Analytic - outline the major points
- estimate the marks for each
based on complexity of problem,
emphasis of topic, time needed
to respond
- decide on how to score partials
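
As an illustration of analytic scoring, the sketch below totals the marks awarded against a point-by-point mark scheme. The function name `analytic_score`, the rule that partial marks are capped at each part's maximum, and the marks awarded are assumptions for the example; the part allocations (4, 4, 4, 8) are taken from the "Better Example" essay question earlier in this lecture.

```python
from typing import Dict


def analytic_score(mark_scheme: Dict[str, float], awarded: Dict[str, float]) -> float:
    """Total the marks for a response, one mark-scheme point at a time.

    Marks for each point are clamped to the range 0..maximum, so a
    partial answer earns partial credit and no point can exceed its cap.
    Points the scorer did not mark are treated as zero.
    """
    total = 0.0
    for point, max_marks in mark_scheme.items():
        marks = awarded.get(point, 0)
        total += min(max(marks, 0), max_marks)
    return total


# Allocation from the four-part War of Independence question (20 marks)
mark_scheme = {"a": 4, "b": 4, "c": 4, "d": 8}

# Hypothetical marks given by a scorer; part (d) is over-awarded on purpose
awarded = {"a": 4, "b": 3, "c": 2.5, "d": 9}

print(analytic_score(mark_scheme, awarded))
```

Deciding in advance how partial responses are credited, as the last point above recommends, is what keeps two scorers using the same scheme in agreement.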

Analytic Scoring
Most suitable to restricted response type
items and structured questions
High reliability of scores because of
precision of the mark scheme
Can be time consuming

Global (Holistic) Scoring


Most suitable for use when interest is in
overall (global) performance
List the important dimensions of the
responses which are of interest
Describe how these dimensions are to be
identified and how they may be described by
candidates
Develop concise descriptions for three or
more levels of global performance.

Global Scoring contd


Indicate performance expected for all the
important dimensions
Levels should be hierarchical. Each level
must contain some quality or quantity
more demanding than the level below
e.g. many errors>few errors>no errors

Example of global scheme


SUPERIORITY
[24-25]
- Demonstrates excellent structure
- Demonstrates excellent organization
- Demonstrates excellent language use
[21-23]
- Suggests very good structure
- Suggests very good organization
- Suggests very good language use but
with occasional lapses in accuracy

Example of global scheme


COMPETENCE
[14-20]
- Demonstrates good structure
- Demonstrates good organization
- Demonstrates effective language use
but with a few lapses
[10-13]
- Suggests inconsistency in structure
- Suggests ability to organize details
- Suggests ability at effective language
use but slight inconsistency
in accuracy

Example of global scheme


INCOMPETENCE
[5-9]
- Suggests an ability to manipulate
structure
- Suggests an ability to organize details
in a logical manner
- Suggests frequent, inaccurate language
use
[0-4]
- Demonstrates total inability to
manipulate structure
- Demonstrates total inability to organize
details
- Demonstrates inaccurate language use
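
The global scheme above can be expressed as a simple lookup, which also makes it easy to confirm that the bands are hierarchical and leave no gaps. This is a sketch only: the names `BANDS` and `band_for` are assumptions, while the score ranges and level labels are those shown in the example scheme.

```python
from typing import List, Tuple

# (lowest mark, highest mark, level) taken from the example global scheme
BANDS: List[Tuple[int, int, str]] = [
    (24, 25, "SUPERIORITY"),
    (21, 23, "SUPERIORITY"),
    (14, 20, "COMPETENCE"),
    (10, 13, "COMPETENCE"),
    (5, 9, "INCOMPETENCE"),
    (0, 4, "INCOMPETENCE"),
]


def band_for(score: int) -> Tuple[str, Tuple[int, int]]:
    """Return the level label and mark range that contain the score."""
    for low, high, level in BANDS:
        if low <= score <= high:
            return level, (low, high)
    raise ValueError(f"score {score} is outside the 0-25 scale")


# Every whole mark from 0 to 25 should fall in exactly one band
covered = [sum(low <= s <= high for low, high, _ in BANDS) for s in range(26)]

print(band_for(22))
print(band_for(9))
```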

