
Test Construction
Kimberly R. Reyes, RPm
Guidelines for Item Writing
1. Define clearly what you want to measure.
To do this, use substantive theory as a
guide and try to make items as specific
as possible.

2. Generate an item pool. Theoretically, all
items are randomly chosen from a
universe of item content. Avoid redundant
items.
3. Avoid exceptionally long items, which are
rarely good.

4. Keep the level of reading difficulty
appropriate for those who will complete the
scale.

5. Avoid double-barreled items, which convey
two or more ideas at the same time.
6. Consider mixing positively and negatively worded
items. Respondents sometimes develop an
acquiescence response set: a tendency to agree
with most items without fully considering
their content.

Example:
Adamson College Depression Scale
1. I feel sad most of the time. (-)
2. I feel hopeful about the future. (+)
Item Formats
Dichotomous Format
Offers two alternatives for each item. Usually a
point is given for the selection of one of the
alternatives.

Advantages: simple, easy to administer, can be
scored quickly, requires absolute judgment.

Disadvantages: in ability tests, it can encourage
memorization and reward lucky guessing and response
styles; the respondent is forced to choose.
Polytomous/Polychotomous Format
Offers more than two alternatives. Typically, a point is
given for the selection of one of the alternatives and no
point is given for selecting any other choice.

Advantages: easy to administer and score, uses
distractors (incorrect choices), requires absolute
judgment, reduces the chance of a correct guess.

Disadvantages: in ability tests, it can still reward
lucky guessing and response styles; the respondent is
forced to choose.
Likert Format
Requires the respondent to indicate the degree of
agreement with a particular attitudinal statement.

Instead of asking for a yes/no reply, five alternatives are
offered: strongly agree, agree, neutral, disagree, and
strongly disagree.

Positively worded items are scored normally, while
negatively worded items are reverse scored.
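
Reverse scoring can be sketched in a few lines of Python. The 1-to-5 point values and the names LIKERT_POINTS and score_item are assumptions made for illustration; the slides name only the five response labels.

```python
# Assumed point values for a 5-point Likert item (not given in the slides).
LIKERT_POINTS = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def score_item(response, negatively_worded=False):
    """Score one response; negatively worded items are reverse scored."""
    points = LIKERT_POINTS[response]
    # On a 1-5 scale, reverse scoring maps 1<->5 and 2<->4, and leaves 3 as is.
    return 6 - points if negatively_worded else points


print(score_item("agree"))                          # positively worded item
print(score_item("agree", negatively_worded=True))  # negatively worded item
```

With this convention, a higher total always points in the same direction regardless of how an item is worded.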
Category Format
Similar to the Likert format but uses a scale with a
greater number of choices.

Example: People can rate items on a 10-point scale.

Variation: visual analogue scale, in which the respondent is
given a 100-millimeter line and asked to place a mark
between two well-defined end points.
Checklist
The subject receives a long list of descriptive or adjectival
statements and indicates whether each one is characteristic of
him/herself or someone else.

Q-sorts
Combination of the checklist format and the category format;
the subject is given statements and asked to sort them into 9
piles.
Statements that are least descriptive of the person are placed
on Pile 1 while those that are most descriptive are placed on
Pile 9.
Variation of the Q-sort format
Spiral omnibus format
A format used in ability tests wherein the items are
arranged from easy to difficult.

Examples:
VOCABULARY TEST                 ARITHMETIC TEST
Define the following words      Evaluate the following expressions
1. shoes                        1. 12 + 7
2. jump                         2. 50 - 9
3. computer                     3. 12 x 15
4. rotate                       4. 473 / 11
5. parasol                      5. (261 + 90) - 46
6. mimic                        6. (837 - 41) x (63 + 37)
Item Analysis
It is a general term for a set of methods
used to evaluate test items in order to
come up with a cluster of valid and reliable
test items.

The basic methods involve assessment of
item difficulty and item discriminability.
Item-difficulty index (p)
It is defined by the proportion of people who get
a particular item correct.

The higher the proportion of people who get
the item correct, the easier the item.

Formula: p = Np / N
Np = number of test takers who got the item correct
N = total number of test takers
Interpretation of Item-difficulty index (p)
Item-difficulty index (p)    Interpretation
0.81 and above               Very easy
0.61 - 0.80                  Easy
0.41 - 0.60                  Optimum
0.21 - 0.40                  Difficult
0.20 and below               Very difficult
Examples of Item-difficulty index (p)
Item #   Np   N    p               Interpretation
1        36   50   36/50 = 0.72    Easy
2        14   50   14/50 = 0.28    Difficult
3        45   50   45/50 = 0.90    Very easy
4        23   50   23/50 = 0.46    Optimum
5        29   50   29/50 = 0.58    Optimum
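
The item-difficulty calculation and its interpretation bands can be sketched in Python as follows. The function names are illustrative, and rounding p to two decimals before applying the bands is an assumption made so that values such as 0.805 do not fall between bands.

```python
def item_difficulty(num_correct, num_takers):
    """Return p = Np / N, the proportion who answered the item correctly."""
    return num_correct / num_takers


def interpret_difficulty(p):
    """Map p onto the interpretation bands used in these slides."""
    p = round(p, 2)  # assumption: bands are applied to p rounded to 2 decimals
    if p >= 0.81:
        return "Very easy"
    if p >= 0.61:
        return "Easy"
    if p >= 0.41:
        return "Optimum"
    if p >= 0.21:
        return "Difficult"
    return "Very difficult"


# Np values for the five example items, each with N = 50 test takers.
for item, num_correct in enumerate([36, 14, 45, 23, 29], start=1):
    p = item_difficulty(num_correct, 50)
    print(item, f"{p:.2f}", interpret_difficulty(p))
```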


Item-discriminability index (d)
It determines whether the people who have done well
on particular test items have also done well on the
whole test.

An item can have a positive or negative discriminating
power. The higher the d value, the better the test item.

Formula: d = (Up - Lp) / U
Up, Lp = number of test takers in the upper and lower
groups who got the item correct
U = total number of test takers in the upper group
Item-discriminability index (d)
For this calculation, we divide the test takers into
three groups according to their scores on the test
as a whole:
an upper group (U) consisting of the 27% who make the
highest scores;
a lower group (L) consisting of the 27% who make the
lowest scores;
a middle group (M) consisting of the remaining 46%.
Interpretation of Item-discriminability index (d)

Item-discriminability index (d)    Interpretation
0.40 and above                     Very good item
0.30 - 0.39                        Good item
0.20 - 0.29                        Fair item
0.09 - 0.19                        Poor item
0.08 and below                     Very poor item
Example of item-discriminability index (d)
N = 50
U (27%) = 13.5, rounded to 14
L (27%) = 13.5, rounded to 14

Item #   Up   Lp   d                         Interpretation
1        11   6    (11 - 6) / 14 = 0.36      Good
2        9    8    (9 - 8) / 14 = 0.07       Very poor
3        14   5    (14 - 5) / 14 = 0.64      Very good
4        7    9    (7 - 9) / 14 = -0.14      Very poor
5        12   9    (12 - 9) / 14 = 0.21      Fair
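
The same calculation can be sketched in Python. The function names are illustrative; rounding the 27% group size half up (13.5 becomes 14) follows the example above, and rounding d to two decimals before applying the bands is an assumption.

```python
def group_size(num_takers):
    """27% of the test takers, rounded half up (50 -> 13.5 -> 14)."""
    return (27 * num_takers + 50) // 100


def item_discriminability(upper_correct, lower_correct, upper_size):
    """Return d = (Up - Lp) / U."""
    return (upper_correct - lower_correct) / upper_size


def interpret_discriminability(d):
    """Map d onto the interpretation bands used in these slides."""
    d = round(d, 2)  # assumption: bands are applied to d rounded to 2 decimals
    if d >= 0.40:
        return "Very good"
    if d >= 0.30:
        return "Good"
    if d >= 0.20:
        return "Fair"
    if d >= 0.09:
        return "Poor"
    return "Very poor"  # includes zero and negative values


upper = group_size(50)
# (Up, Lp) pairs for the five example items.
for item, (up, lp) in enumerate([(11, 6), (9, 8), (14, 5), (7, 9), (12, 9)], 1):
    d = item_discriminability(up, lp, upper)
    print(item, f"{d:.2f}", interpret_discriminability(d))
```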
Preparing for Item Analysis
Arrange test scores from highest to lowest.

Get 27% of the papers from the highest
scores and another 27% of the papers from
the lowest scores.

Record separately the number of times the
correct answer was chosen by the test takers
in each group.
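
These three steps can be sketched in Python. The data layout (one list of 0/1 item scores per test taker) and the ten-person sample are hypothetical, chosen only to make the example runnable; ties at the group boundary are not handled specially.

```python
def split_groups(responses):
    """Rank papers by total score and return the top and bottom 27%."""
    ranked = sorted(responses, key=sum, reverse=True)  # highest total first
    k = (27 * len(ranked) + 50) // 100                 # 27%, rounded half up
    if k == 0:
        raise ValueError("too few test takers to form 27% groups")
    return ranked[:k], ranked[-k:]


def correct_counts(group):
    """Count, for each item, how many papers in the group got it right."""
    return [sum(item_scores) for item_scores in zip(*group)]


# Hypothetical data: 10 test takers, 4 items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [1, 0, 1, 0], [0, 0, 0, 1],
    [1, 1, 1, 0], [0, 1, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 0, 0, 1],
]

upper, lower = split_groups(responses)
print("Up per item:", correct_counts(upper))
print("Lp per item:", correct_counts(lower))
```

The two lists of counts are the Up and Lp values that feed the item-discriminability formula.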
What To Do After Item Analysis
When should a test item be rejected? Retained?
Modified or revised?

A test item can be retained if its level of difficulty is easy,
optimum, or difficult and discriminating power is fair to very
good.

It has to be rejected if it is either very easy or very difficult and
its discriminating power is poor, negative, or zero.

An item can be modified if its difficulty level is optimum and its
discriminating power is negative.
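
A literal reading of these three rules can be sketched in Python. The function name decide is illustrative, and treating any d below 0.20 as "poor, negative, or zero" is an assumption. The rules as stated do not cover every combination (for example, an easy item with very poor discrimination), so the sketch returns "judgment call" for those cases rather than guessing.

```python
def decide(p, d):
    """Apply the retain / reject / modify rules to one item.

    p and d are assumed to be already rounded to two decimals.
    """
    extreme = p >= 0.81 or p <= 0.20       # very easy or very difficult
    if not extreme and d >= 0.20:          # easy..difficult, fair..very good
        return "retain"
    if extreme and d < 0.20:               # assumption: below "fair" counts as poor
        return "reject"
    if 0.41 <= p <= 0.60 and d < 0:        # optimum difficulty, negative d
        return "modify"
    return "judgment call"                 # not covered by the stated rules


print(decide(0.38, 0.29))   # difficult item with fair discrimination
print(decide(0.88, 0.14))   # very easy item with poor discrimination
print(decide(0.50, -0.10))  # optimum item with negative discrimination
print(decide(0.68, 0.07))   # easy item with very poor discrimination
```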
Sample Item Analysis
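N = 50
U (27%) = 13.5, rounded to 14
L (27%) = 13.5, rounded to 14

Item #   Np   N    p      Int.   Up   Lp   d      Int.   Decision
1        34   50   0.68   E      13   12   0.07   VP     Reject
2        19   50   0.38   D      9    5    0.29   F      Retain
3        10   50   0.20   VD     6    1    0.36   G      Reject / Revise
4        27   50   0.54   O      11   7    0.29   F      Retain
5        44   50   0.88   VE     14   12   0.14   P      Reject

Int. = interpretation. Difficulty: VE = very easy, E = easy, O = optimum,
D = difficult, VD = very difficult. Discriminability: G = good, F = fair,
P = poor, VP = very poor.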


Creating Norms
Once you have obtained the raw
scores of your sample, create a table
showing the equivalent Z score, percentile,
and stanine for each score. Include a
qualitative interpretation of the scores.
Raw Score   Z score   Percentile   Stanine   Qualitative Interpretation
73          2.2       98.61        9         Very High
61          1.5       93.32        8         High
54          1.1       86.43        7         Above Average
42          0.5       69.15        6         Average
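
The percentile and stanine columns can be derived from the Z score, as in the following Python sketch. It assumes normally distributed scores and the conventional stanine scale (mean 5, standard deviation 2, limited to 1 through 9); the function names are illustrative. The sample mean and standard deviation behind the table are not given in the slides, so z_score is shown without sample values.

```python
import math
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Standardize a raw score; mean and sd come from your own sample."""
    return (raw - mean) / sd


def percentile_from_z(z):
    """Percent of a normal distribution falling below z."""
    return round(NormalDist().cdf(z) * 100, 2)


def stanine_from_z(z):
    """Assumed convention: stanine = 2z + 5, rounded half up, kept in 1..9."""
    return max(1, min(9, math.floor(2 * z + 5.5)))


# Z scores taken from the table above.
for z in [2.2, 1.5, 1.1, 0.5]:
    print(z, percentile_from_z(z), stanine_from_z(z))
```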


Assignment
N = 70
U (27%) = 18.9, rounded to 19
L (27%) = 18.9, rounded to 19

Item #   Np   N    p      Int.   Up   Lp   d      Int.   Decision
1        45   70                 13   4
2        21   70                 11   2
3        65   70                 19   16
4        33   70                 7    8
5        52   70                 15   7
Any Questions?

In deep waters, you encounter only the wise
and the brave; in shallow waters, the
ignorant and the coward!

- Mehmet Murat Ildan

Thank You!
