My Test and Item Analysis Report

Test and Item Analysis Report
By
F Rufetu
Presented as a Major Assignment

In
Computer-based Assessment (CIA 722)
September 2007
Table of contents
List of tables
List of figures
1. Introduction
2. Purpose of report
3. Test analysis
3.1 Descriptive statistics
3.2 Frequency graphs
3.3 Test reliability
4. Item analysis
4.1 Difficulty index
4.2 Discrimination index
5. Conclusion
References
Appendix A
Appendix B
Appendix C
List of tables
Table Description
Table 1 Mode, median, mean and standard deviation
Table 2 Grouped frequency table
Table 3 Cumulative frequency table
Table 4 Determining reliability coefficient (KR20)
Table 5 Calculation of difficulty index
Table 6 Calculation of discrimination index
Table 7 Number of students in upper and lower group
List of figures
Figure Description
Figure 1 Cumulative frequency graph
Figure 2 Frequency histogram
Figure 3 Frequency polygon
1. Introduction
This report presents a test and item analysis of a given set of scores,
using descriptive statistics (measures of central tendency and
variability). Twenty-five students wrote a multiple-choice test containing
twenty questions with four answer options each (see Appendix A).
2. Purpose of report
The purpose of this report is to disseminate information pertaining to test
and item analysis for a given set of scores.
3. Test analysis
Test analysis examines how the items perform as a set. According to
Kubiszyn and Borich (2007), “no test you construct will be perfect”,
meaning that any test is likely to include some invalid or deficient
items. This necessitates analysis.
3.1 Descriptive statistics
The descriptive statistics are calculated from the test data (see
Appendix B). The mode is the score that occurs most frequently, the median
is the score that splits the distribution in half, the mean is the average
of the scores, and the standard deviation is an estimate of variability
given by the square root of the sum of the squared deviations, (x-Mean)2,
divided by the number of students.
The mode, median, mean and standard deviation are given in Table 1.
The mode, median and mean are almost the same, which is consistent with a
roughly symmetric, approximately normal distribution.
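The four statistics described above can be checked with a short sketch. It assumes the 25 percentage scores are transcribed correctly from Appendix B and uses the population form of the standard deviation (dividing by the number of students), which is what the text describes. The variable names are illustrative only.

```python
import statistics

# Percentage scores transcribed from Appendix B (25 students).
scores = [
    100.00, 100.00, 90.00, 90.00, 90.00, 89.47, 85.00, 85.00, 75.00,
    70.00, 70.00, 65.00, 65.00, 65.00, 65.00, 60.00, 55.00, 55.00,
    50.00, 50.00, 47.06, 45.00, 31.58, 31.58, 15.00,
]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # middle score of the sorted list
mean = statistics.mean(scores)      # arithmetic average
# Population standard deviation: sqrt(sum((x - mean)^2) / N). It divides by N
# rather than N - 1, matching the formula described in the text.
std_dev = statistics.pstdev(scores)

print(f"mode={mode:.0f} median={median:.0f} mean={mean:.2f} sd={std_dev:.2f}")
# prints: mode=65 median=65 mean=65.79 sd=21.90
```

Using `statistics.stdev` instead would divide by N - 1 and give a slightly larger value than the one reported in the report.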
Table 1: Mode, median, mean and standard deviation
Mode    Median    Mean     Standard deviation
65      65        65.79    21.90
3.2 Frequency graphs
The frequency graphs are constructed from a grouped frequency table,
which is set up using the values given in Table 2.
Table 2: Grouped frequency table
H (highest score)      100
L (lowest score)       15
Range                  85
Number of intervals    10
Size of interval       8.5
The cumulative frequency graph is plotted with the upper limits on the
x-axis and the cumulative frequency on the y-axis. The cumulative
frequency table is shown in Table 3.
Table 3: Cumulative frequency table
Lower limit  Upper limit  Middle value  Frequency  Cumulative frequency
15           24           19.5          1          1
25           34           29.5          2          3
35           44           39.5          0          3
45           54           49.5          4          7
55           64           59.5          3          10
65           74           69.5          6          16
75           84           79.5          1          17
85           94           89.5          6          23
95           104          99.5          2          25
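The frequency and cumulative frequency columns can be reproduced from the Appendix B scores with the sketch below. It assumes the intervals actually used in Table 3 (width 10, starting at 15) and treats a score as belonging to the interval whose lower limit it reaches, so that non-integer percentages such as 89.47 fall in 85-94. The function name `grouped_frequencies` is hypothetical.

```python
# Percentage scores transcribed from Appendix B (25 students).
scores = [
    100.00, 100.00, 90.00, 90.00, 90.00, 89.47, 85.00, 85.00, 75.00,
    70.00, 70.00, 65.00, 65.00, 65.00, 65.00, 60.00, 55.00, 55.00,
    50.00, 50.00, 47.06, 45.00, 31.58, 31.58, 15.00,
]

WIDTH = 10
LOWER_LIMITS = range(15, 105, WIDTH)  # 15, 25, ..., 95


def grouped_frequencies(values):
    """Return rows of (lower, upper, middle, frequency, cumulative)."""
    rows, running = [], 0
    for lower in LOWER_LIMITS:
        upper = lower + WIDTH - 1
        # Assumption: lower <= score < lower + WIDTH defines membership.
        freq = sum(1 for v in values if lower <= v < lower + WIDTH)
        running += freq  # running total gives the cumulative frequency
        rows.append((lower, upper, (lower + upper) / 2, freq, running))
    return rows


table = grouped_frequencies(scores)
for row in table:
    print(*row)
```

The middle values (third column) are the x-coordinates of the frequency polygon, and the upper limits paired with the last column give the points of the cumulative frequency graph.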
The cumulative frequency graph is given in Figure 1. An ‘ogive’ shape is
formed.
Figure 1: Cumulative frequency graph (x-axis: upper values, 24 to 104; y-axis: cumulative frequency, 0 to 30)
The frequency histogram is plotted with the intervals (lower values) on
the x-axis and the frequency on the y-axis. The frequency histogram is
given in Figure 2.
Figure 2: Frequency histogram (x-axis: intervals, 15-24 to 95-104; y-axis: frequency, 0 to 7)
The frequency polygon is plotted with the middle values on the x-axis and
the frequency on the y-axis. The frequency polygon is given in Figure 3.
Figure 3: Frequency polygon (x-axis: middle values, 19.5 to 99.5; y-axis: frequency, 0 to 7)
3.3 Test reliability
The reliability coefficient (KR20) is an appropriate index of test
reliability for multiple-choice tests. The coefficient is determined by
means of a formula which includes the number of test items (k), student
performance on every item (the sum of pq; for the pq values see
Appendix C) and the standard deviation squared (stddev2, the variance) of
the set of student test scores. The index ranges from 0.00 to 1.00. The
larger the number, the more reliable the student scores are. The KR20 is
determined by means of the values given in Table 4.
Table 4: Determining reliability coefficient (KR20)
k           20
k-1         19
Total pq    3.83
stddev      21.90
stddev2     479.57
KR20        1.04
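The calculation behind Table 4 can be sketched as follows, assuming the standard Kuder-Richardson formula 20, KR20 = (k / (k - 1)) * (1 - sum of pq / variance). The function name `kr20` is illustrative. The second call is an assumption that the report does not make: it divides the percentage-score variance by 25 to approximate the variance of raw scores out of 20, so that the variance and the pq values are on the same scale.

```python
def kr20(k, sum_pq, variance):
    """Kuder-Richardson formula 20: (k / (k - 1)) * (1 - sum_pq / variance)."""
    return (k / (k - 1)) * (1 - sum_pq / variance)


# Values from Table 4; the variance is that of the percentage scores.
reported = kr20(k=20, sum_pq=3.83, variance=479.57)

# Assumption: each item is worth 5 percentage points, so dividing the
# variance by 5**2 approximates the variance of raw scores out of 20.
rescaled = kr20(k=20, sum_pq=3.83, variance=479.57 / 25)

print(round(reported, 2), round(rescaled, 2))
# prints: 1.04 0.84
```

The rescaling is only approximate here, because some students did not attempt every item and their percentages were computed over the items they answered.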
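Reliability coefficient (KR20) = 1.04. Because the index cannot exceed
1.00, this value indicates that the variance used (479.57) is on the
percentage scale, whereas the pq values are on the item scale. With the
variance rescaled to raw scores out of 20 (479.57/25 = 19.18), the KR20 is
approximately 0.84, which is still large (close to 1.00). The student
scores are reliable.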
4. Item analysis
Item analysis can be used to identify items that are deficient in some way
so as to improve or even eliminate them.
Matlock-Hetzel (2007) states that item analysis “investigates the
performance of items considered individually in relation to the remaining
items in the test”.
4.1 Difficulty index
The difficulty index indicates the proportion of students who answered the
item correctly. The proportion (p) equals the number of students with the
correct answer divided by the number of students who attempted the item.
If p<0.25 the item is too difficult, and if p>0.75 the item is too easy;
in both cases the item is unacceptable.
The calculation and interpretation of the difficulty index for each
question are given in Table 5.
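Table 5: Calculation of difficulty index
Questions  #Correct  #Answered  p     Interpretation  Reason
q1         21        25         0.84  Unacceptable    Too easy
q2         22        25         0.88  Unacceptable    Too easy
q3         17        25         0.68  Acceptable      Fine
q4         12        25         0.48  Acceptable      Fine
q5         21        25         0.84  Unacceptable    Too easy
q6         17        25         0.68  Acceptable      Fine
q7         11        25         0.44  Acceptable      Fine
q8         12        23         0.52  Acceptable      Fine
q9         13        25         0.52  Acceptable      Fine
q10        8         24         0.33  Acceptable      Fine
q11        23        25         0.92  Unacceptable    Too easy
q12        19        25         0.76  Unacceptable    Too easy
q13        15        25         0.60  Acceptable      Fine
q14        21        25         0.84  Unacceptable    Too easy
q15        20        25         0.80  Unacceptable    Too easy
q16        22        24         0.92  Unacceptable    Too easy
q17        15        24         0.63  Acceptable      Fine
q18        8         24         0.33  Acceptable      Fine
q19        13        25         0.52  Acceptable      Fine
q20        16        25         0.64  Acceptable      Fine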
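The difficulty index rule can be applied to every question with a short sketch. It assumes the #Correct and #Answered counts are transcribed correctly from Appendix C and uses the thresholds stated in this section (0.25 and 0.75). The function name `classify` and its labels are illustrative.

```python
# (#Correct, #Answered) per question, transcribed from Appendix C.
counts = {
    "q1": (21, 25), "q2": (22, 25), "q3": (17, 25), "q4": (12, 25),
    "q5": (21, 25), "q6": (17, 25), "q7": (11, 25), "q8": (12, 23),
    "q9": (13, 25), "q10": (8, 24), "q11": (23, 25), "q12": (19, 25),
    "q13": (15, 25), "q14": (21, 25), "q15": (20, 25), "q16": (22, 24),
    "q17": (15, 24), "q18": (8, 24), "q19": (13, 25), "q20": (16, 25),
}


def classify(p, low=0.25, high=0.75):
    """Label an item from its difficulty index p."""
    if p < low:
        return "Too difficult"
    if p > high:
        return "Too easy"
    return "Fine"


# p = number correct / number who attempted the item.
results = {q: classify(c / n) for q, (c, n) in counts.items()}
acceptable = [q for q, label in results.items() if label == "Fine"]

print(len(acceptable), "of", len(counts), "items are acceptable")
# prints: 12 of 20 items are acceptable
```

Twelve of twenty items is sixty percent, with the remaining eight items all falling on the "too easy" side of the rule.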
4.2 Discrimination index
According to Special Connections (2007), the discrimination index (D) is a
“basic measure of item’s ability to discriminate between those who scored
high (#U) on the total test and those who scored low (#L)”.
If the D value is positive (closer to 1.00) there is a strong relationship
between performance on that item and overall test performance. This
means the discrimination is fine. If the D value is negative, this
suggests poor validity for the item, and its distracters must be looked
into.
The calculation and interpretation of the discrimination index for each
question are given in Table 6. In this instance all items indicate a
positive discrimination.
Table 6: Calculation of discrimination index
Questions  #U  #L  D     Interpretation
q1         15  6   0.60  Fine
q2         15  7   0.53  Fine
q3         14  3   0.73  Fine
q4         8   4   0.27  Fine
q5         15  6   0.60  Fine
q6         12  5   0.47  Fine
q7         9   2   0.47  Fine
q8         10  2   0.53  Fine
q9         10  3   0.47  Fine
q10        8   0   0.53  Fine
q11        14  9   0.33  Fine
q12        14  5   0.60  Fine
q13        12  3   0.60  Fine
q14        15  6   0.60  Fine
q15        14  6   0.53  Fine
q16        15  7   0.53  Fine
q17        12  3   0.60  Fine
q18        5   3   0.13  Fine
q19        12  1   0.73  Fine
q20        11  5   0.40  Fine
The discrimination index measures the ability of an item to discriminate
between students who have a high score on the test and those with a low
score on the test. It is based on the difference between the number of
correct responses in the upper group (#U) and the number of correct
responses in the lower group (#L). In Table 6, this difference is divided
by the number of students in the upper group. The number of students in
the upper and lower groups is given in Table 7.
Table 7: Number of students in upper and lower group
#Upper 15
#Lower 10
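The D values in Table 6 can be reproduced with the sketch below. The report does not state the divisor explicitly; dividing by the size of the upper group (15) is an assumption inferred from the tabulated values, for example (15 - 6) / 15 = 0.60 for q1. Only four of the twenty questions are included to keep the example short, and the function name `discrimination_index` is illustrative.

```python
N_UPPER = 15  # number of students in the upper group (Table 7)

# (#U, #L): correct answers in the upper and lower groups (Table 6).
groups = {"q1": (15, 6), "q4": (8, 4), "q18": (5, 3), "q19": (12, 1)}


def discrimination_index(upper_correct, lower_correct, group_size=N_UPPER):
    """D = (#U - #L) / group size; the divisor is an inferred assumption."""
    return (upper_correct - lower_correct) / group_size


d_values = {
    q: round(discrimination_index(u, l), 2) for q, (u, l) in groups.items()
}
print(d_values)
# prints: {'q1': 0.6, 'q4': 0.27, 'q18': 0.13, 'q19': 0.73}
```

A negative result would arise only if more students in the lower group than in the upper group answered the item correctly.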
5. Conclusion
In conclusion, since the KR20 indicates reliable scores, sixty percent of
the items are acceptable under the difficulty index, and the
discrimination index is positive on all items, the overall test is valid.
Analysis of response options allows educators to fine-tune and improve
items they may wish to use again with future classes. If items are too
difficult, teachers can adjust the way they teach. The greater the number
of plausible distracters, the more accurate, valid and reliable the test
becomes.
References
Kubiszyn, T. and Borich, G. (2007). Educational Testing and Measurement:
Classroom Application and Practice, pp. 204-326. Eighth edition. John
Wiley & Sons, Inc. USA.
Matlock-Hetzel, S. (2007). Basic Concepts in Item and Test Analysis. Texas
A & M University. Retrieved October 02 2007, from
http://ericae.net/ft/tamu/Espy.htm
Special Connections. (2007). Retrieved October 02 2007, from
http://www.Specialconnections.ku.edu/cgi-bin/cgiwrap/cpecconn/print.php?path=page/ass..
Appendix A
Key C B D D B C D A C B A C B D A A C D B C
St No Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20
1 C B B A C D A D D A D A A A A C B D B
2 C B D D B D A A C B A C B D A A C D B C
3 C B D D B C D A C B A C B D A A C B D C
4 C B D B B C B A C B A C A D C A C B C C
5 C B D C B C B A C D A C B D A A A B B C
6 C A D D C C A D C D A C A D A A A B D C
7 B B A B B C B B D D A C B D C A A D D C
8 C B D B B C B D B C A C B D A A C A B A
9 C B D A B C D D B D A C B D A A C B D A
10 C B B A B C D C D C A B A D D A C D B C
11 C B D D B C D A C B A C B D A A C D B C
12 C B D D B C D D D A A C A D A A C B B D
13 C B D A B C D A C B A C B D A A A B B C
14 C B D A B C D A C B A C B D A A A B C
15 C B D D B B A A B D A C D A A C B B D D
16 C B D D B C D A C B A C B D A A C D B C
17 B B C C B A D D C A D B D A C A D
18 C B B D B A D D D D A C A D A A C B B C
19 D C A D B A B A D C C D A A D B B B A B
20 C B D D B C D A C A C D B D A A C D B C
21 C A D D C C A D C D A C A D A A A B D C
22 B B A B B C B B D D A C B D C A A D D C
23 C B D B B C B D B C A C B D A A C A B A
24 C B B A C D A D D A D A A A A C B D B
25 C B D D B D A A C B A C B D A A C D B C
Appendix B
x Group x-Mean (x-Mean)2

100.00 U 34.21 1170.49
100.00 U 34.21 1170.49
90.00 U 24.21 586.24
90.00 U 24.21 586.24
90.00 U 24.21 586.24
89.47 U 23.69 561.03
85.00 U 19.21 369.12
85.00 U 19.21 369.12
75.00 U 9.21 84.87
70.00 U 4.21 17.74
70.00 U 4.21 17.74
65.00 U -0.79 0.62
65.00 U -0.79 0.62
65.00 U -0.79 0.62
65.00 U -0.79 0.62
60.00 L -5.79 33.50
55.00 L -10.79 116.37
55.00 L -10.79 116.37
50.00 L -15.79 249.25
50.00 L -15.79 249.25
47.06 L -18.73 350.77
45.00 L -20.79 432.12
31.58 L -34.21 1170.23
31.58 L -34.21 1170.23
15.00 L -50.79 2579.38
Appendix C
Question #Correct #Answered Proportion correct (p) Proportion incorrect (q) pq
q1 21 25 0.84 0.16 0.13
q2 22 25 0.88 0.12 0.11
q3 17 25 0.68 0.32 0.22
q4 12 25 0.48 0.52 0.25
q5 21 25 0.84 0.16 0.13
q6 17 25 0.68 0.32 0.22
q7 11 25 0.44 0.56 0.25
q8 12 23 0.52 0.48 0.25
q9 13 25 0.52 0.48 0.25
q10 8 24 0.33 0.67 0.22
q11 23 25 0.92 0.08 0.07
q12 19 25 0.76 0.24 0.18
q13 15 25 0.6 0.4 0.24
q14 21 25 0.84 0.16 0.13
q15 20 25 0.8 0.2 0.16
q16 22 24 0.92 0.08 0.08
q17 15 24 0.63 0.38 0.23
q18 8 24 0.33 0.67 0.22
q19 13 25 0.52 0.48 0.25
q20 16 25 0.64 0.36 0.23
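The pq column and its total can be recomputed from the counts with the sketch below. It assumes the #Correct and #Answered values above are transcribed correctly and works with unrounded proportions, so individual pq values may differ in the last digit from the rounded ones tabulated here.

```python
# (#Correct, #Answered) for q1 to q20, in order, from the table above.
counts = [
    (21, 25), (22, 25), (17, 25), (12, 25), (21, 25),
    (17, 25), (11, 25), (12, 23), (13, 25), (8, 24),
    (23, 25), (19, 25), (15, 25), (21, 25), (20, 25),
    (22, 24), (15, 24), (8, 24), (13, 25), (16, 25),
]

total_pq = 0.0
for correct, answered in counts:
    p = correct / answered  # proportion correct
    q = 1 - p               # proportion incorrect
    total_pq += p * q

print(round(total_pq, 2))
# prints: 3.83
```

The total is the sum of pq used in the KR20 calculation.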