I observed Teacher X conducting a test to assess listening and speaking skills among
Year 1 pupils. The paper-and-pencil assessment was done as the continuation of a lesson
on the topic "Parts of Body". First, the teacher activated the pupils' prior knowledge by asking
them to point at their body parts while saying the parts' names. This was in effect a diagnostic
assessment, which is done at the beginning of a term or a unit of study, or whenever information
about the prior learning of a student is useful (Classroom Assessment, pg 3). The assessment tools
used here were observation and perception. Through the pupils' participation and involvement,
the teacher found out what the pupils knew and could do. For example, when the teacher showed the
picture of a nose, all pupils said "nose" and pointed at their noses and not at any other body part.
So, pupils activated their prior knowledge of the names of body parts and their location on their
bodies. Pupils also understood the uses of body parts, such as that the nose is for smelling and the
ear is for hearing. They did the actions of smelling and hearing, displaying their strengths, while
the teacher focused on the LINUS pupils and guided them with the actions.
Next, the teacher continued assessing pupils through performance assessment. McTighe
and Ferrara (1998) define performance assessment as an assessment activity that requires
pupils to construct a response, create a product, or perform a demonstration. Performance
assessments are concerned with how pupils apply the knowledge, skills, strategies, and
attitudes that they have learned to new and authentic tasks. They may be content-specific or
interdisciplinary and relate to real-life application of knowledge, skills, and strategies. The
teacher showed a few sentence strips while the pupils read them. The strips contained sentences
relating to body parts, such as "I have one nose to smell" and "I have two eyes to see". While
reading, pupils pointed to their body parts and did the actions. This showed that they were able to
relate their knowledge of body parts and their understanding of their purposes to real-life situations.
Since reading was the integrated skill, pupils were then randomly assessed through a
selective assessment task. Pupils were given pictures of body parts and the sentence strips
which were shown earlier. Then, they came forward and matched the pictures with the correct
sentence strips by pasting them on the whiteboard. After that activity, the teacher and pupils sang
the "Head, Shoulders, Knees and Toes" song. These types of assessment were accessible to all
pupils, including the LINUS ones. According to Thurlow, Laitusis, Dillon, Cook, Moen, Abedi,
and O'Brien (2009), larger-scale assessments help teachers to recognize pupils' participation
and measure their achievement standards. Pupils with disabilities (LINUS) presented particular
challenges because their disabilities sometimes interfere with their performance on the
assessments, but these selective assessment tasks seemed to be as accessible as possible to the
widest range of pupils. This is because a variety of approaches ensures that assessments produce
similar experiences for all pupils in ways that produce valid results. Therefore, to obtain valid
results for pupils with diverse
characteristics, the similar experiences must be more flexible and broader than the general
The paper-and-pencil assessment took place at the final stage of the lesson. Pupils were given
a teacher-made test comprising pictures and sentences. The teacher asked pupils to write their
names and the date on the paper. She also explained the instruction, which was to match the
pictures with the correct sentences. This suggests that the test paper had face validity. According
to Heaton (1975: 153) and Brown (2004: 26), a test has face validity when its items look
right to other testers, teachers, moderators, and test-takers. In addition, it appears to measure
the knowledge or abilities it claims to measure. In that test, pupils were measured on their
knowledge of body parts and the purpose of having them in real life. Face validity is important in
maintaining test-takers' motivation and performance (Heaton, 1975: 153; Weir, 1990: 26). Pupils
expressed a positive attitude and interest during the test. They quickly took their pencils and
rulers and began to match the pictures to the sentences. In addition, the test was clearly doable
within the allotted time limit. Pupils were given 20 minutes to complete the test. In total there
were 6 test items, numbered 1 to 6, each comprising a sentence and a picture of a body part
parallel to it.
Since the test items related to the lesson's content, the test had content validity. The test
had content-related evidence because it represented the materials taught before, so that the
pupils could draw conclusions from the materials (Weir, 1990: 24; Brown, 2004: 22; Gronlund
and Waugh, 2009: 48). Besides that, pupils were also tested indirectly through these test items.
Although the test was intended to test listening and speaking, they were in fact tested indirectly
through reading and matching. Indirect testing, in which learners do not perform the task itself
but rather a task that is related to it in some way, is part of understanding content
validity (Brown, 2004: 23). Previously their listening and speaking skills were assessed
through songs and actions. However, in this paper-and-pencil assessment they were assessed on
the same content but through a reading and matching task. Individually, pupils matched the
pictures of body parts to their correct sentences using pencil and ruler.
Furthermore, a test must be authentic. Bachman and Palmer (as cited in Brown, 2004:
28) defined authenticity as the degree of correspondence of the characteristics of a given
language test task to the features of a target language. The test conducted by the teacher was
authentic considering that its target language was English and that it contained contextual items,
a topic that was meaningful and interesting to pupils, and items organized thematically. Most
importantly, the test was based on the pupils' real-world environment. The topic "Parts of Body"
intrinsically motivated pupils, as they showed high curiosity during the lesson and the test. They
were interested in learning about their body parts. Indeed, it was significant for them to
know and understand the names and quantity of body parts and the purpose of having them in
real life. Therefore, pupils could apply the knowledge wisely now and in the future.
In addition, the teacher-made test was practical in terms of time, cost and energy.
Researchers hold that, with regard to time, cost and energy, tests should be efficient in terms of
making, doing, and evaluating (Heaton, 1975: 158-159; Weir, 1990: 34-35; Brown, 2004: 19-20).
Tests must also be affordable. The test designed by Teacher X contained only 1 page of
test items. Pupils were given photocopies of the test paper since they are more affordable than
printed ones. The test items in it were typewritten, while the pictures were taken from other
sources. They were easily combined and laid out into a test paper without demanding much of the
teacher's energy or cost. The paper can be photocopied again in future so that pupils can be
assessed again using the same test at any time. Therefore, the test conducted was time-,
energy- and cost-saving.
Although there were many strong points in the test, a few weaknesses were still spotted.
While the test was being administered, I noticed that pupils were given guidance to complete
the test items. Previously they did a matching activity in which pupils came forward to paste the
picture cards and sentence strips on the whiteboard. During the test, the teacher did not
remove them from the board. In this case, it looked much more like further practice than
a test. So, pupils matched the test items in the test paper by referring to the board. This error
affected the face validity of the test. Brown (2004: 27) states that face validity will likely be high if
learners encounter a difficulty level that presents a reasonable challenge. In this test paper, the
items were just a repetition of their knowledge. Instead, the teacher could have added some other
body parts and sentences describing them. That new input would have
challenged their cognitive level and stimulated learning while increasing the face validity of the
test.
Another error to be highlighted concerns the reliability of the test. Reliability refers to
consistency and dependability. According to Heaton (1975: 155-156) and Brown (2004: 21-22),
the same test delivered to the same pupil on different occasions must yield the same results. The
point to be noted here is that Teacher X did not include any score or mark for the items on the test
paper. So, the lack of a marking scheme might weaken rater reliability and make it difficult for
other teachers to administer the test in future. Besides, test administration reliability also
affected the test employed by her. Pupils actually copied the answers from the whiteboard and
everyone got all items correct. Thus, pupils may depend on the teacher's answers if the same
test is administered on a different occasion. The results may vary too, because some pupils might
not have understood the content of the previous lesson. To overcome this defect, a test-retest
(re-administration) should be carried out. This time, score indicators should be provided below
the test items. Then, the same test must be administered after a lapse of time. The teacher should
also take special care not to provide any form of answer or guidance to pupils. Finally, the two
sets of scores should be correlated, and the teacher must give oral and written feedback to pupils.
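As a rough illustration of how the two sets of scores could be correlated, the sketch below computes a Pearson correlation coefficient in Python. The function name and the score lists are hypothetical and are not taken from Teacher X's class; a value close to 1 would suggest that pupils were ranked consistently across the two sittings.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length score lists for at least 2 pupils")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    # Sum of products of deviations, and sums of squared deviations
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        # Happens when every pupil gets the same score on one sitting
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores out of 6 for the same five pupils on two occasions
first_sitting = [6, 4, 5, 3, 6]
second_sitting = [5, 4, 6, 3, 6]
r = pearson_r(first_sitting, second_sitting)
print(round(r, 2))
```

Note that the coefficient cannot be computed when every pupil obtains the same score, as happened in the observed test where everyone got all items correct. This is one more reason for removing the answers from the board before the test is re-administered.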

Apart from all the principles stated, washback also plays a crucial role in improving the
test's effect on teaching and learning. Washback can be negative or positive (Saehu, 2012:
124-127). Positive washback was achieved by almost all the pupils in the class. Pupils were
able to activate their knowledge and stimulate learning about their body parts. It enhanced
pupils' intrinsic motivation, autonomy and self-confidence. They were able to identify their own
body parts and apply the knowledge and skill through the song performance at the end of the lesson.
Instead of giving letter grades and numerical scores, the teacher gave compliments and generous,
specific comments such as "good work", "awesome" and "great" to enhance washback
(Brown, 2004: 29). The teacher also approached pupils and guided them by doing the actions.

Even though positive washback was achieved, negative washback still persisted after
the test. The LINUS pupils were not assessed on the topic and skill. Saehu (2012) stated
that narrowing language competencies down to only those involved in tests and neglecting the
rest is a negative washback. The LINUS pupils did other activities, such as learning the basic
alphabet and colouring letters. They were clearly neglected by the teacher because they
were not allowed to take any type of test. To solve this issue, the teacher should prepare
separate test items according to their ability and language level. Those items should reflect
the topic of the day, which was "Parts of Body". Indeed, preparing specific test items for them
could boost their motivation in learning and help them to achieve a positive washback effect.