


Currents in Pharmacy Teaching and Learning 11 (2019) 251–257


Research Note

Quantitative analysis of single best answer multiple choice questions in pharmaceutics
Suha A. Al Muhaissen a,*, Anna Ratka b, Amal Akour a, Hatim S. AlKhatib a

a The University of Jordan, School of Pharmacy, Queen Rania Street, Amman, 11942, Jordan
b Wegmans School of Pharmacy, St. John Fisher College, 3690 East Avenue, Rochester, NY 14618, United States

ARTICLE INFO

Keywords:
Multiple choice questions
Examination quality
Difficulty index
Discriminating index
Distractor efficiency

ABSTRACT

Introduction: The purpose of this study was to: (1) analyze the quality of single best answer multiple choice questions (MCQs) used in pharmaceutics exams, (2) identify the correlation between difficulty index (DIF I), discriminating index (DI), and distractor efficiency (DE), and (3) understand the relationship between DIF I, DI, and DE and the number of MCQ answer options and their cognitive level.
Methods: 429 MCQs used in pharmaceutics exams were analyzed. The quality of the MCQs was evaluated using DIF I, DI, and DE. The number of answer options and the cognitive level tested by each item were evaluated. Relationships between DIF I, DI, and DE were measured using Pearson's correlations and t-tests.
Results: DIF I showed a significant negative correlation with DI within questions that measured information recall. A significant negative correlation between DIF I and DI was observed in questions with four and five answer options regardless of the cognitive level measured. The highest DI values were found in moderate difficulty questions, while the worst DE was observed for the easiest questions. Questions that measured analytical and problem-solving abilities were more difficult than those measuring information recall. Questions with four and five answer options had excellent discrimination.
Conclusions: Single best answer MCQs are a valuable assessment tool capable of evaluating higher cognitive skills. Significant correlation between DIF I and DI can indicate the examination quality. Higher quality MCQs are constructed using four and five answer options.

Introduction

Assessment of students by measuring their performance against established standards requires reliable tools.1,2 Such assessment is
particularly crucial for the evaluation of professional healthcare students to measure learning outcomes and to determine how well
these outcomes meet professional competencies and expectations.1,3–6
Properly constructed multiple choice questions (MCQs) can be very effective in assessing knowledge and higher cognitive levels of Bloom's taxonomy.3–10 The modified Bloom's hierarchy defines three learning levels: Level I: knowledge and the ability to recall information; Level II: understanding and data interpretation skills, constituting comprehension and application; and Level III: problem-solving, where knowledge and understanding are used to deal with new circumstances.4

* Corresponding author.

E-mail addresses: s.muhaissen@ju.edu.jo (S.A. Al Muhaissen), aratka@sjfc.edu (A. Ratka), a.akour@ju.edu.jo (A. Akour),
h.khatib@ju.edu.jo (H.S. AlKhatib).

https://doi.org/10.1016/j.cptl.2018.12.006

1877-1297/ © 2018 Elsevier Inc. All rights reserved.



The most frequently used type of MCQs is the “single best answer” type,11–16 which is composed of a question (stem) and multiple
possible answers (choices), including one correct/best answer and several incorrect options (distractors).
The quality of a test depends on the quality of each item.3 Consequently, item analysis provides a means of evaluating test quality and results.2–6,14,15,17–23 Item analysis can be achieved using both qualitative and quantitative methods.6,13,20
Qualitative analysis is performed by faculty members who are content experts to identify weak elements. After administering a test, quantitative data can be collected and analyzed. Results from this analysis can help to identify problematic questions and item defects, such as an incorrect key, vague and faulty questions, and questions addressing examiners' personal ideas or beliefs. Such issues are hard to detect by qualitative review.
Quantitative item analysis utilizes two main values: the difficulty index (DIF I) and the discriminating index (DI).4,5,7,12,14,15,17,19–23 DIF I is the ratio of correct responses to the total responses to a test item and can range from 0 to 1.6,10,14,15,20,22,24 Questions can be categorized with regard to their difficulty using the DIF I: very difficult questions (DIF I < 0.20), acceptable or good questions (0.2 < DIF I < 0.5), excellent questions (0.5 < DIF I < 0.8), and very easy or poor items (DIF I > 0.80).14,22
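As an illustration only (not part of the original study), the following minimal Python sketch computes DIF I for a single item from raw responses and applies the cut-offs above; all names and data are hypothetical.

def difficulty_index(responses, key):
    """DIF I: fraction of responses that match the correct answer (key)."""
    correct = sum(1 for r in responses if r == key)
    return correct / len(responses)

def categorize_dif(dif_i):
    """Categories as defined above (refs 14, 22)."""
    if dif_i < 0.20:
        return "very difficult"
    if dif_i < 0.50:
        return "acceptable/good"
    if dif_i < 0.80:
        return "excellent"
    return "very easy/poor"

# Hypothetical item: 60 of 80 students chose the key "A".
responses = ["A"] * 60 + ["B"] * 12 + ["C"] * 5 + ["D"] * 3
dif_i = difficulty_index(responses, key="A")
print(round(dif_i, 2), categorize_dif(dif_i))  # 0.75 excellent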
The DI of an item describes its ability to distinguish between students who mastered the subject and those who did not. It is calculated as the ratio of the difference between the fraction of students in the upper quartile group (in terms of overall exam score) who answered the item correctly and the fraction of students in the lower quartile who answered it correctly, relative to the total number of responses, including correct, incorrect, and blank responses.3,5–8,15,19,20,22–24 DI can range from −1 to 1. Question quality can be categorized with regard to discrimination ability using the values of DI: poor with DI < 0.20 (item should be revised or discarded), acceptable with DI 0.20–0.29 (retain item, but revision might be needed), good with DI 0.30–0.39 (subject to further improvement), and excellent with DI ≥ 0.40.18–20
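The sketch below (again illustrative, with hypothetical names and data) implements one common upper/lower quartile formulation of DI, the proportion of top quartile students answering the item correctly minus the corresponding proportion in the bottom quartile; the exact denominator used by the authors may differ.

def discriminating_index(total_scores, item_correct):
    """DI for one item.
    total_scores: overall exam score of each student.
    item_correct: 1 if that student answered this item correctly, else 0.
    Returns the proportion correct in the top quartile minus the proportion
    correct in the bottom quartile (range -1 to 1)."""
    n = len(total_scores)
    order = sorted(range(n), key=lambda i: total_scores[i])
    q = max(1, n // 4)
    lower, upper = order[:q], order[-q:]
    p_upper = sum(item_correct[i] for i in upper) / q
    p_lower = sum(item_correct[i] for i in lower) / q
    return p_upper - p_lower

def categorize_di(di):
    """Categories as defined above (refs 18-20)."""
    if di < 0.20:
        return "poor"
    if di < 0.30:
        return "acceptable"
    if di < 0.40:
        return "good"
    return "excellent"

# Hypothetical: 8 students, higher scorers tend to answer the item correctly.
scores = [95, 88, 82, 75, 70, 62, 55, 40]
correct = [1, 1, 1, 1, 0, 1, 0, 0]
print(categorize_di(discriminating_index(scores, correct)))  # excellent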
Another value that has been used to improve the construction of MCQs is the distractor efficiency (DE). DE is the percentage of functional distractors (FDs), i.e., options selected by 5% or more of the students.6,19,20 On the other hand, nonfunctional distractors (NFDs) are incorrect options selected by < 5% of the participants.6,19,20 For a four answer option MCQ containing three, two, one, or no NFDs, the DE is 0, ≈33%, ≈67%, and 100%, respectively.12,18–20 The quality of the incorrect answers (distractors) in an MCQ has been reported to influence students' MCQ test performance.20
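As a further illustration (hypothetical names and data, not the authors' code), DE can be computed as the percentage of distractors chosen by at least 5% of examinees, reproducing the four option example above.

from collections import Counter

def distractor_efficiency(responses, key, options):
    """DE: percentage of distractors selected by >= 5% of examinees (FDs)."""
    counts = Counter(responses)
    n = len(responses)
    distractors = [o for o in options if o != key]
    functional = [d for d in distractors if counts.get(d, 0) / n >= 0.05]
    return 100.0 * len(functional) / len(distractors)

# Four-option item in which "D" is chosen by < 5% of students (one NFD):
responses = ["A"] * 60 + ["B"] * 12 + ["C"] * 6 + ["D"] * 2
print(round(distractor_efficiency(responses, key="A", options="ABCD")))  # ~67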
It is important to note that these metrics neither guarantee the quality of a question nor replace the judgment of the instructor. The classification of questions by their DIF I, DI, and DE values is used as a flagging mechanism that draws the attention of the instructor to potentially problematic MCQs and prompts him or her to review such questions.
The objectives of this study were to (1) analyze the quality of single best answer MCQs used in pharmaceutics exams, (2) identify
the relationship between DIF I, DI, and DE, and (3) understand the relationship between these indices and the number of answer
options as well as the cognitive level they measure. The literature is replete with item analysis studies, yet the authors are unaware of
a published study presenting extensive analysis of pharmaceutics exam questions including all indices.

Methods

Data collection

Data were collected from twelve midterm and final exams administered in pharmaceutics courses offered to third year students
over three semesters (fall, spring, and summer) of the academic year 2015–2016. A special panel of pharmaceutics faculty members
classified questions based on measured cognitive level (i.e. knowledge, analysis, or calculations). The panel included original
question writers as well as other pharmaceutics faculty members.
A total of 509 MCQs were collected. Exclusion criteria included duplicate questions, questions with more than one correct answer, "true or false" questions, and questions with two answer options, as these have a 50% probability of being answered correctly by chance. A total of 429 questions were selected for analysis; the MCQs had three (107 questions), four (108 questions), or five (214 questions) answer options and addressed three cognitive levels: knowledge (179 questions), analysis (130 questions), and problem-solving ("calculations") (120 questions).
The study was exempted from institutional review board review by the University of Jordan School of Pharmacy Scientific
Research Committee (decision number 1/CSR/2017).

Item analysis

The data on each individual item were obtained using the automated grading system AXM 980 (DATAWIN GmbH, Germany).

Statistical analysis

Results were reported as percentages, mean values, and standard deviations (SDs) of n questions. Pearson's correlation coefficient
and t-test were used to evaluate correlations with p-value < 0.05 indicating statistical significance. Statistical analysis was performed
using SPSS 23.0 (IBM, Chicago, IL).25
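The analyses were run in SPSS; purely as an illustration (not the authors' workflow), the sketch below shows equivalent Pearson's correlation and independent-samples t-tests in Python/SciPy on hypothetical per-item values.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
dif_i = rng.uniform(0.2, 0.9, size=50)             # hypothetical DIF I values
di = 0.6 - 0.4 * dif_i + rng.normal(0.0, 0.1, 50)  # hypothetical DI values

# Pearson's correlation between DIF I and DI
r, p = stats.pearsonr(dif_i, di)
print(f"r = {r:.2f}, p = {p:.3f}")

# Independent-samples t-test, e.g. DIF I of two hypothetical question groups
knowledge, calculations = dif_i[:25], dif_i[25:]
t, p = stats.ttest_ind(knowledge, calculations)
print(f"t = {t:.2f}, p = {p:.3f}")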

Results

A total of 429 MCQs with 1624 distractors were analyzed. Table 1 summarizes the characteristics of the assessed questions.


Table 1
Characteristics of multiple choice question (MCQ) test items: difficulty index (DIF I), discriminating index (DI), distractor efficiency (DE), and distractors.

                                                          Range        Mean (SD)
DIF I                                                     0.14–1       0.64 (0.19)
DI                                                        0.06–0.98    0.46 (0.17)
DE (%)                                                    0–100        62 (31)

                                                          n (%)
Total number of nonfunctional distractors                 626 (39)
Total number of functioning distractors                   998 (61)
Number of questions with no functioning distractors       32 (8)
Number of questions with one functioning distractor       70 (16)
Number of questions with two functioning distractors      136 (32)
Number of questions with three functioning distractors    108 (25)
Number of questions with four functioning distractors     83 (19)
DE ≤ 25%                                                  87 (20.3)
25% < DE < 50%                                            114 (26.6)
50% < DE < 75%                                            114 (26.6)
DE ≥ 75%                                                  114 (26.6)

Total number of items analyzed = 429; total number of distractors = 1624.
SD = standard deviation.

The questions had a mean DIF I of 0.64 (SD = 0.19), a mean DI of 0.46 (SD = 0.17), and a mean DE of 62% (SD = 31), with almost 27% (114 of 429) of questions having DE ≥ 50% and 20% (87 of 429) having DE ≤ 25%. As shown in Table 1, NFDs represented 39% (626 of 1624) of the distractors assessed. Only 8% (32 of 429) of questions had no FD (DE = 0%), and 19% (83 of 429) had four FDs (DE = 100%).
Tables 2 and 3 show the distribution matrices of DI categories by DIF I categories and DIF I categories by DI categories, respectively.
Such distribution matrices (contingency tables) are used to provide a basic picture of the interrelation between DIF I and DI.
Table 2 shows that, among the analyzed items, 4% (16 of 429) were difficult (DIF I < 0.2) and 24% (105 of 429) were easy (DIF I > 0.8), while 36% (156 of 429) had good DIF I (0.2–0.5) and 35% (152 of 429) had excellent DIF I (0.5–0.8). Moreover, 60% (12 of 20) of questions with poor DI (< 0.2) had DIF I values between 0.8 and 1 (easy by DIF I standards). A total of 86% (238 of 277) of questions classified as having excellent DI (> 0.4) had DIF I between 0.2 and 0.8 (43% had DIF I 0.2–0.5, and 43% had DIF I 0.5–0.8).
Table 3 shows that 5% (20 of 429) of questions had poor DI (< 0.2), 15% (65 of 429) had acceptable DI (0.2–0.3), and 16% (67 of
429) had good DI (0.3–0.4). The majority of items (64%, 277 of 429) had excellent DI (> 0.4).
Data in Table 3 show that excellent DI (> 0.4) values were associated with DIF I ranging from 0.2 to 0.8; 77% (120 of 156) of questions with DIF I of 0.2–0.5 (good difficulty) and 77% (118 of 152) of questions with DIF I of 0.5–0.8 had excellent DI. The calculated Pearson correlation between DIF I and DI was −0.37, indicating a moderate negative relationship between these indices. The negative correlation coefficient between DI and DIF I suggests that as DIF I values increase, DI decreases. It is worth noting that the negative correlation between DIF I and DI is not universal. Questions with very low DIF I can also have a very low DI; however, the limited number of such questions in the sample studied reduces their effect, so the general trend observed was that of an inverse relationship.
Table 4 shows the correlations between DE and DIF I and DE and DI. When studied in relation to the DIF I of the questions, the
average DE was 93.23 ± 12.25% for difficult questions as compared to 29.12 ± 25.50% for easy questions (p < 0.001). A smaller
but still significant difference (p = 0.01) was observed in the DE of questions as a function of DI; 47.00 ± 37.68% in the questions
with poor DI and 69.43 ± 26.68% in the questions with excellent DI.
Table 5 presents the relationship between DIF I and the categories of cognitive level. Questions that measure examinees' knowledge were more frequently easy in comparison to questions that evaluate analysis and calculations; 31% of knowledge questions were classified

Table 2
Distribution matrix of discriminating index (DI) categories by difficulty index (DIF I) categories.

DIF I a                      n (%)      Poor         Acceptable     Good           Excellent
                                        (DI < 0.2)   (DI 0.2–0.3)   (DI 0.3–0.4)   (DI > 0.4)
Difficult (DIF I < 0.2)      16 (4)     5%           8%             3%             3%
Good (DIF I 0.2–0.5)         156 (36)   15%          28%            23%            43%
Excellent (DIF I 0.5–0.8)    152 (35)   20%          11%            34%            43%
Easy (DIF I 0.8–1)           105 (25)   60%          53%            40%            11%

a Pearson's correlation revealed a significant correlation between DIF I and DI (p < 0.001) with r = −0.37.


Table 3
Distribution matrix of difficulty index (DIF I) categories by discriminating index (DI) categories.

DI a                         n (%)      Difficult        Good              Excellent         Easy
                                        (DIF I < 0.2)    (DIF I 0.2–0.5)   (DIF I 0.5–0.8)   (DIF I 0.8–1)
Poor (DI < 0.2)              20 (5)     6%               2%                3%                12%
Acceptable (DI 0.2–0.3)      65 (15)    31%              11%               5%                33%
Good (DI 0.3–0.4)            67 (16)    13%              10%               15%               26%
Excellent (DI > 0.4)         277 (64)   50%              77%               77%               29%

a Pearson's correlation revealed a significant correlation between DIF I and DI (p < 0.001) with r = −0.37.

Table 4
Distractor efficiency (DE) of test items categorized by difficulty index (DIF I) and discriminating index (DI).

                               DE (%), mean value (SD)
Category                       By DI            By DIF I         p-value
Easy DIF I / Excellent DI      69.4 (26.7)a     29.1 (25.5)b     < 0.05a; < 0.001b
Excellent DIF I / Good DI      52.1 (30)        62.2 (23.8)b     < 0.001
Good DIF I / Acceptable DI     47.4 (35.0)      81.7 (19)b       < 0.001
Difficult DIF I / Poor DI      47.1 (37.7)      93.2 (12.3)b     < 0.001

SD = standard deviation.
a Statistically significant differences as compared to other DI categories.
b Statistically significant differences as compared to other DIF I categories.

Table 5
Difficulty index (DIF I) categories distribution and correlation with discriminating index (DI) by the cognitive level.

Question type (cognitive level)   Difficult        Good              Excellent         Easy            Pearson's correlation a
                                  (DIF I < 0.2)    (DIF I 0.2–0.5)   (DIF I 0.5–0.8)   (DIF I > 0.8)
Knowledge (n = 279)               1%               29%               39%               31%             −0.35b
Calculations (n = 70)             9%               47%               26%               18%             −0.16
Analysis (n = 80)                 8%               54%               32%               6%              −0.15

a Correlation between DIF I and DI for each category.

as easy. A statistically significant (p < 0.001) negative Pearson correlation (−0.35) between DIF I and DI was found among questions in the knowledge category. However, a weak and statistically nonsignificant correlation was observed between DIF I and DI for questions testing higher cognitive abilities: analysis (p = 0.18) and calculations (p = 0.2).
Table 5 also shows that comparable percentages of calculations (9%, 6 of 70) and analysis (8%, 6 of 80) questions were classified as difficult, while 47% (33 of 70) of calculations questions and 54% (43 of 80) of analysis questions were of good difficulty levels. In addition, 26% (18 of 70) of calculations questions and 32% (26 of 80) of analysis questions were of excellent difficulty. The majority of the calculations questions (73%, 51 of 70) and analysis questions (86%, 69 of 80) had good to excellent DIF I. Further investigation using t-tests demonstrated that calculations questions and analysis questions had significantly lower DIF I than knowledge questions (p < 0.001). It was also found that calculations questions and analysis questions had significantly higher DI (p = 0.029 and p = 0.018, respectively) than knowledge questions. Moreover, 79% (55 of 70) of the calculations questions and 61% (49 of 80) of the analysis questions had excellent DI (> 0.4).
Table 6 shows that questions with five alternatives had the best discrimination ability; 67% (242 of 364) of these questions had a DI > 0.4. Among the four option MCQs, 61% (23 of 38) had DI > 0.4. For four and five option MCQs, DIF I related significantly to DI (p = 0.01 and p < 0.001, respectively). The lowest discrimination was found for three option MCQs; 11% (3 of 27) of these questions had DI < 0.2 and 26% (7 of 27) had DI 0.2–0.3. In addition, the majority of the studied four and five option MCQs had either good or excellent DIF I (76% (29 of 38) and 71% (258 of 364), respectively).

Discussion

A number of previous studies have reported the mean difficulty index (DIF I) in MCQ exams as 0.58 (SD = 0.27),24 0.58 (SD = 0.14),15 0.63 (SD = 0.19),5 and 0.75 (SD = 0.24).19 Mitra et al.14 conducted a study of type A MCQs administered over four years as multi-disciplinary preclinical summative assessments for medical students and reported DIF I ranging between 0.64 and 0.79

Table 6
Discrimination index (DI) categories distribution and correlation with difficulty index (DIF I) by the number of answer options.

Number of answer options per question   Poor         Acceptable     Good           Excellent    Pearson's correlation a
                                        (DI < 0.2)   (DI 0.2–0.3)   (DI 0.3–0.4)   (DI > 0.4)
3 (n = 27)                              11%          26%            19%            44%          −0.33
4 (n = 38)                              3%           11%            26%            61%          −0.40b
5 (n = 364)                             4%           15%            14%            67%          −0.37c

a Correlation between DIF I and DI for each category.
b Correlation significant at the 0.05 level (two-tailed).

for nine tests, with 40% of total test questions having DIF I values greater than 0.8. The MCQs analyzed in our study had a mean DIF I of 0.64, which is comparable to the values reported by Rao et al.19 in their evaluation of assessment tools for medical students.
DIF I and DI are correlated reciprocally.5 Any item with a DI value above 0.2 is considered an efficient discriminator. In our study, the average DI value is similar to the DI of 0.38 reported by Chauhan et al.24 for single best answer MCQs in anatomy examinations.
Analysis of our data showed that only 5% of the studied questions had poor DI (< 0.2), with no questions having negative DI values, indicating a good quality of the evaluated MCQs. The results from this study are comparable to findings published by Rao et al.,19 who reported that 65% of their questions had a DI equal to or greater than 0.4. In a study reported by Pande et al.,18 21% of physiology questions were poor discriminators, 46% had DI between 0.2 and 0.39, 29% had excellent DI values, and 4% had DI values < 0.
As shown in the current study, maximum discrimination occurred when DIF I was between 0.5 and 0.8, yet a large share of excellent discriminating power was also seen for questions of good difficulty. This finding is comparable with the ranges of maximum capacity of differentiation reported in the literature (0.4–0.74,21 0.5–0.7,22 and 0.5–0.7914).
Analysis of MCQs from this study showed a significant (p < 0.001) negative correlation (Pearson coefficient = −0.37) between DIF I and DI, which is similar to the findings published by Mitra et al.,14 who reported a moderate negative correlation between DI and DIF I. A negative Pearson coefficient signifies that, initially, as item difficulty increases its discriminating power increases, until a maximum value is reached, beyond which DI starts to decrease as DIF I increases. The second phase of this correlation can be expected because an easy question is answered correctly by most of the participants. In some cases, top students fail to choose the correct option due to overthinking. On the other hand, low scoring students may choose the key answer of difficult questions by chance, contributing to the poor discrimination of such questions.3,14,19,20 The average DE in the current study is lower than that reported by others. Rao et al.19 reported a DE of 89.99%. This difference is probably due to their much smaller number of analyzed questions (120 vs. 429) and the fact that no exclusion was made in regard to the number of options per item. However, the results from the current study are comparable to literature reports in terms of the relationships between DE and NFDs and between DIF I and DI.15,19,26 When the mean DIF I and DI values were determined as a function of the number of NFDs, the DIF I values steadily increased as the number of FDs decreased, while the DI values decreased significantly as the number of NFDs increased. It is important to establish a proper relationship between these metrics because an ideal item is not only the result of excellent difficulty and discriminating indices; FDs (incorrect alternatives) also play a major role in improving the quality of an item.20
The DE of three option MCQs was higher (87 ± 22) than that of four option (66 ± 28) and five option questions (60 ± 31). Such a comparison has not been previously reported in the literature. This reflects the fact that it is easier to prepare two functional distractors than three or four, and it points to the comparable quality of the four and five option MCQs in the tested question pool. However, the overall satisfactory DE of the studied MCQs reflects the effort invested in constructing them. The average DE decreased significantly as the DIF I value increased and the DI value decreased. The comparable average DE and the significant Pearson's correlation measured between DI and DIF I for four option and five option MCQs suggest that these two types of MCQs should be used during test preparation.
MCQs composed of five options showed the best discriminatory power, followed by MCQs with four options. This finding is in agreement with studies that favor better discrimination by three distractors.12 However, Chauhan et al.15 and Trevisan et al.27 suggest (contrary to our findings) that three option MCQs are more discriminative and have fewer questions with non-performing distractors. Such different conclusions may be caused by the smaller sample sizes used in these studies (e.g., only one item had four FDs and five questions had one FD).15 Moreover, in both studies DI was not used as part of the analysis, and the least discriminating alternative was deleted to create three and four option test formats from the original five option items.
However, we did not find a significant Pearson's correlation between DI and DIF I within two categories of questions: analysis and calculations. The knowledge type MCQs (Level I) were the ones contributing to the significant correlation between DI and DIF I. The lack of a correlation might result from students' limited exposure to these question types and the higher number of fact-recall questions. Questions requiring a higher order of thinking are generally considered difficult and challenging.
In our study, the majority of analysis and calculation questions had good to excellent DIF I and DI. This indicates that these
questions are more difficult yet more discriminative than knowledge questions. This is comparable to findings reported by Kim et al.9
The main limitation of this study was the unequal number of questions in the different groups after categorization based either on the number of answer options or on the cognitive skills measured by each question. The exam writers' incomplete knowledge of and experience with the implementation of Bloom's taxonomy is also recognized as a limitation of this study.


Conclusions

The methodology used in this study to evaluate the quality of pharmaceutics MCQs was feasible and effective. The number of answer options per MCQ affected the quality of a question. The statistical analysis revealed a significant correlation between DIF I and DI, and a clear correlation between DE and both DIF I and DI upon grouping MCQs based on the number of options per question. Analysis and calculations questions were significantly more difficult, but also more discriminative, than knowledge questions. DE was evaluated as a function of the number of answer options. This new approach to DE evaluation revealed DE to be independent of the number of options in our sample, indicating equal distractor quality regardless of the number of answer options. The results of this study show that well-constructed single best answer MCQs with efficient distractors permit effective and objective testing of a large number of students and assessment of students' learning within Bloom's cognitive taxonomy.

Conflict of interest

None

Disclosures

None

Acknowledgments

The authors would like to acknowledge the faculty members at the Department of Pharmaceutics and Pharmaceutical Technology
at the University of Jordan School of Pharmacy for their cooperation.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.cptl.2018.12.006.

References

1. LOA Glossary. American Public University System website. Updated 2016. http://www.apus.edu/academic-community/learning-outcomes-assessment/glossary.
Accessed 23 December 2018.
2. McAlpine M, Hesketh I. Multiple response questions – allowing for chance in authentic assessments. In: Proceedings of the 7th CAA Conference, Loughborough:
Loughborough University. Published 2003. https://dspace.lboro.ac.uk/2134/1915. Accessed 23 December 2018.
3. Siri A, Freddano M. The use of item analysis for the improvement of objective examinations. Procedia Soc Behav Sci. 2011;29(6):188–197.
4. Palmer EJ, Devitt PG. Assessment of higher order cognitive skills in undergraduate education: modified essay or multiple choice questions? Research paper. BMC
Med Educ. 2007;7. https://doi.org/10.1186/1472-6920-7-49.
5. Mehta G, Mokhasi V. Item analysis of multiple choice questions – an assessment of the assessment tool. Int J Health Sci Res. 2014;4(7):197–202.
6. Gajjar S, Sharma R, Kumar P, Rana M. Item and test analysis to identify quality multiple choice questions (MCQs) from an assessment of medical students of
Ahmedabad, Gujarat. Indian J Community Med. 2014;39(1):17–20.
7. Phipps SD, Brackbill ML. Relationship between assessment item format and item performance characteristics. Am J Pharm Educ. 2009;73(8) https://doi.org/10.
5688/aj7308146.
8. Considine J, Botti M, Thomas S. Design, format, validity and reliability of multiple choice questions for use in nursing research and education. Collegian.
2005;12(1):19–24.
9. Kim MK, Patel RA, Uchizono JA, Beck L. Incorporation of Bloom's taxonomy into multiple-choice examination questions for a pharmacotherapeutics course. Am J
Pharm Educ. 2012;76(6) https://doi.org/10.5688/ajpe766114.
10. Reichert TG. Assessing the Use of High Quality Multiple Choice Exam Questions in Undergraduate Nursing Education: Are Educators Making the Grade? Sophia,
the St. Catherine University repository website. Published May 2011. https://sophia.stkate.edu/ma_nursing/15/. Accessed 23 December 2018.
11. Sheaffer EA, Addo RT. Pharmacy student performance on constructed-response versus selected-response calculations questions. Am J Pharm Educ. 2013;77(1)
https://doi.org/10.5688/ajpe7716.
12. Tarrant M, Ware J, Mohammed AM. An assessment of functioning and non-functioning distractors in multiple-choice questions: a descriptive analysis. BMC Med
Educ. 2009;9. https://doi.org/10.1186/1472-6920-9-40.
13. Norcini JJ, Swanson DB, Grosso LJ, Webster GD. Reliability, validity and efficiency of multiple choice question and patient management problem item formats in
assessment of clinical competence. Med Educ. 1985;19(3):238–247.
14. Mitra NK, Nagaraja HS, Ponnudurai G, Judson JP. The levels of difficulty and discrimination indices in type A multiple-choice questions of pre-clinical semester 1 multidisciplinary summative tests. Int e-J Sci, Med Educ. 2009;3(1):2–7.
15. Chauhan P, Chauhan GR, Chauhan BR, Vaza JV, Rathod SP. Relationship between difficulty index and distracter effectiveness in single best-answer stem type
multiple choice questions. Int J Anat Res. 2015;3(4):1607–1610.
16. Exam Questions: Types, Characteristics, and Suggestions. Centre for Teaching Excellence, University of Waterloo website. https://uwaterloo.ca/centre-for-
teaching-excellence/teaching-resources/teaching-tips/developing-assignments/exams/questions-types-characteristics-suggestions. Accessed 23 December 2018.
17. Meshkani Z, Abadie FH. Multivariate analysis of factors influencing reliability of teacher made tests. J Med Educ. 2005;6(2):149–152.
18. Pande SS, Pande SR, Parate VR, Nikam AP, Agrekar SH. Correlation between difficulty & discrimination indices of MCQs in formative exam in physiology. South-
East Asian J Med Educ. 2013;7(1):45–50.
19. Rao C, Prasad HLK, Sajitha K, Permi H, Shetty J. Item analysis of multiple choice questions: assessing an assessment tool in medical students. Int J Educ Psychol Res.
2016;2(4):201–204.
20. Sabri S. Item analysis of student comprehensive test for research in teaching beginner string ensemble using model based teaching among music students in public
universities. Int J Educ Res. 2013;1(12):1–14.
21. Sim SM, Rasiah RI. Relationship between item difficulty and discrimination indices in true/false-type multiple choice questions of a para-clinical multidisciplinary
paper. Ann Acad Med Singapore. 2006;35(2):67–71.
22. Suruchi Rana SS. Test item analysis and relationship between difficulty level and discrimination index of test items in an achievement test in biology. Paripex-Ind J Res. 2014;3(6):56–58.
23. Understanding item analyses. University of Washington Office of Educational Assessment website. Updated 2018. http://www.washington.edu/assessment/
scanning-scoring/scoring/reports/item-analysis/. Accessed 23 December 2018.
24. Chauhan PR, Ratrhod SP, Chauhan BR, Chauhan GR, Adhvaryu A, Chauhan AP. Study of difficulty level and discriminating index of stem type multiple choice
questions of anatomy in Rajkot. Biomirror. 2013;4(6):37–40.
25. IBM SPSS Statistics for Windows [computer program]. Version 23.0. Armonk, NY: IBM Corp; 2015.
26. Hingorjo MR, Jaleel F. Analysis of one-best MCQs: the difficulty index, discrimination index, and distractor efficiency. J Pak Med Assoc. 2012;62(2):142–147.
27. Trevisan MS, Sax G, Michael WB. The effects of the number of options per item and student ability on test validity and reliability. Educ Psychol Meas.
1991;51(4):829–837.


