5.4 Marking
Mention has previously been made of the importance of marker reliability, that is, the ability of
an assessment process to provide almost the same mark to a piece of work, regardless of which
examiner marked it and on which occasion it was marked (see appendix A.3). There are three
main ways of ensuring this. First, it is important to appoint and retain only those examiners who
can mark consistently and objectively. The great majority of DP examiners are experienced
teachers of the programme. Such examiners are ideally suited to the task, through having prior
familiarity with the course being taught and its assessment requirements, and some knowledge
of the expected standards. Second, all examiners, except the most senior one for each
component, have their marking checked every examination session, because past good performance
does not guarantee what may happen in the next session. This procedure is called moderation
and is described in section 5.5. The third method, considered in more detail here, is to provide
examiners with comprehensive instruction on how to go about marking. This can be done
through prior training, an area of activity that the IBO plans to develop much further in the
near future through electronic means. Diploma Programme examiners receive detailed
instruction about the administrative procedures to be followed that will allow successful
moderation to take place, and also substantial information about how to allocate marks.
The IBO uses two principal methods of guiding examiners in the allocation of marks: analytic
markschemes and assessment criteria (which have a variant called markbands).
The approach used in DP assessment in the application of criterion achievement levels is a best
fit model. The examiner or teacher applying an assessment criterion must choose the
achievement level that overall best matches the piece of work being marked. It is not necessary
for every detailed aspect of an achievement level to be satisfied for that level to be awarded,
and it is worth noting that the highest level of any given criterion does not represent perfection
in the way that the maximum mark on an analytic markscheme probably would (analytic
markschemes operate over a much greater mark range than do assessment criteria).
5.4.3 Markbands
It is sometimes not judged appropriate to separate out the different assessment criteria that can
be applied to a particular piece of work. Assessment criteria are most appropriately applied in
relative independence from each other, with candidate performance on one criterion not
influencing performance on the others. In practice, this can rarely be fully achieved, but if a
situation arises where it is not possible to discern separable assessment criteria, then a different
approach is adopted. This may also be necessary where the work to be assessed is so variable
that a set of criteria, each of which is readily applicable to all responses, cannot be derived. In
such cases markbands are used instead of separate criteria. The markbands, in effect, represent a
single holistic criterion applied to the piece of work, which is judged as a whole. Because of the
requirement for a reasonable mark range along which to differentiate candidate performance,
each markband level descriptor will correspond to a number of marks.
The descriptors themselves tend to be fairly lengthy, covering a range of potential qualities
evident in candidates' work, and will again relate directly back to the course objectives.
Examples of markbands can be found in the Diploma Programme History guide (IBO, 2001a).
As with assessment criteria, a best fit approach is used, with markers additionally making a
judgment about which particular mark to award from the possible range for each level descriptor,
according to how well the candidate's work fits that descriptor. For example, one markband level
may cover the range 6 to 10 marks. The examiner will give a mark from that range according to
how well the candidate's work fits the relevant level descriptor from the markband scale.
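The two-step judgment described above (choose the best-fitting level, then choose a mark within that level's range) can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions, not an IBO procedure: the `Markband` class, the `mark_within_band` function and the numeric `fit` value between 0 and 1 are hypothetical stand-ins for an examiner's qualitative judgment, and the descriptor text is a placeholder. Only the 6 to 10 mark range comes from the example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Markband:
    """One level of a hypothetical markband scale."""
    low: int          # lowest mark available at this level
    high: int         # highest mark available at this level
    descriptor: str   # placeholder for the level descriptor text


def mark_within_band(band: Markband, fit: float) -> int:
    """Map a fit judgment in [0, 1] to a whole mark inside the band.

    `fit` is an assumed numeric stand-in for how well the work matches
    the level descriptor: 0 means it barely fits, 1 means it fits fully.
    """
    if not 0.0 <= fit <= 1.0:
        raise ValueError("fit must be between 0 and 1")
    # Scale the fit across the width of the band and round to a whole mark.
    return band.low + round(fit * (band.high - band.low))


# The level from the text's example, covering 6 to 10 marks.
level = Markband(low=6, high=10, descriptor="(placeholder descriptor)")

weak_fit_mark = mark_within_band(level, 0.0)     # bottom of the range
middle_fit_mark = mark_within_band(level, 0.5)   # middle of the range
strong_fit_mark = mark_within_band(level, 1.0)   # top of the range
```

In practice the examiner makes this judgment holistically; the sketch only shows that the mark awarded always stays inside the range attached to the chosen level descriptor.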
Research has shown that, where holistic (markband) and assessment criteria methods of marking
have been applied to essay work that is amenable to both marking methods, there is little
difference between the two in terms of reliability of marking (Wood, 1991, Ch 5).