Written by:
ALIYA IZET BEGOVIC YAHYA
EVI MALA WIJAYANTI
NAFRIANTI
RAIHANAH PERMATA SARI
A. Summary
Assessment is an ongoing process that encompasses a much wider domain than a test
(Brown, 2004, p. 4). This means that all of a student's performance, whether written or spoken,
is assessed. It implies that the teacher does not rely only on test scores but also assesses what
happens in the classroom.
In relation to assessment, Brown explores five principles of assessment: practicality,
reliability, validity, authenticity and washback. These five principles should be applied in
assessment.
First, practicality relates to factors such as cost, time, administration, and
scoring/evaluation (Brown, 2004, p. 19). It refers to the relationship between the resources that
will be required in the design, development and use of the test and the resources that will be
available for assessment.
Second, reliability refers to the extent to which a test produces consistent scores at different
administrations to a similar group of test takers. Reliability is divided into four types:
student-related reliability, rater reliability, test administration reliability and test
reliability. The first type, student-related reliability, refers to psychological and physical
factors, such as illness, fatigue or a bad day, which can make a test-taker's ‘observed score’
deviate from the true score. In other words, students may not perform at their full ability during
the test. The second type, rater reliability, is divided into two types: inter-rater reliability
and intra-rater reliability. Inter-rater reliability concerns the consistency between two or more
raters scoring the same performance, whereas intra-rater reliability concerns the consistency of a
single rater. The third type, test administration reliability, refers to the conditions under
which the test is administered, such as noise, the amount of light, variations in temperature,
etc. The last type, test reliability, means that the test itself should fit the time constraints,
be neither too long nor too short, and be clearly written.
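To illustrate inter-rater reliability, the following is a minimal sketch that compares the scores of two raters using simple percent agreement and Cohen's kappa. The scores, the variable names and the choice of Cohen's kappa are assumptions made for illustration only; they are not taken from the books reviewed here.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of performances on which both raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability that both raters pick the same category,
    # given how often each rater used each category.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical band scores (1-4) given by two raters to the same eight essays.
rater_1 = [3, 2, 4, 3, 1, 2, 3, 4]
rater_2 = [3, 2, 3, 3, 1, 2, 4, 4]

agreement = percent_agreement(rater_1, rater_2)
kappa = cohens_kappa(rater_1, rater_2)
print(f"percent agreement: {agreement:.2f}")
print(f"Cohen's kappa: {kappa:.2f}")
```

A kappa close to 1 would suggest that the two raters score consistently, while a value near 0 would suggest that their agreement is no better than chance. The same calculation could be applied to one rater's scores at two different times as a rough check of intra-rater reliability.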
Third, validity is the extent to which inferences made from assessment results are
appropriate, meaningful and useful in terms of the purpose of the assessment. In order to
establish validity, we have to consider five types of validity: content validity, criterion
validity, construct validity, consequential validity and face validity. The first type, content
validity, concerns the relationship between what the test actually samples and the conclusions
that are to be drawn from it. For example, in assessing listening, the teacher can use a
multiple-choice test. The second type, criterion validity, refers to the extent to which
performance on a test is related to a criterion that is an indicator of the ability being tested.
For example, a communicative classroom test has criterion validity if its scores are corroborated
by other communicative measures of the grammar points in question. Criterion validity falls into
two categories: concurrent validity and predictive validity. Concurrent validity means that the
test scores are supported by other concurrent performance, whereas predictive validity refers to
a prediction of a test-taker's likelihood of future success. The third type, construct validity,
is the extent to which a test actually taps into the theoretical construct (theory) as it has
been defined. For example, proficiency and communicative competence are linguistic constructs.
The fourth type, consequential validity, concerns the positive or negative consequences of a
particular test. The last type, face validity, is the extent to which students view the
assessment as fair, relevant, and useful for improving learning. Because face validity rests on
the test-taker's point of view, it is more subjective than the other types of validity.
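As an illustration of concurrent validity, the sketch below correlates the scores of a classroom test with the scores of another measure of the same ability taken at about the same time. The use of the Pearson correlation coefficient, the function name and all scores are assumptions made for illustration; the books reviewed here do not prescribe this calculation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products and squares of the deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of six students on a classroom test and on an
# independent measure of the same ability (the criterion).
classroom_test = [55, 62, 70, 74, 81, 90]
criterion_measure = [50, 65, 68, 78, 80, 93]

r = pearson_r(classroom_test, criterion_measure)
print(f"correlation with the criterion: {r:.2f}")
```

A high positive correlation would count as evidence that the classroom test is supported by the other measure. For predictive validity, the second list would instead hold scores from a later performance, such as results at the end of a course.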
Fourth, authenticity refers to the degree of correspondence of the characteristics of a
given language test task to the features of a target language task. This definition suggests an
agenda for identifying those target language tasks and for transforming them into valid test
items. In short, a task is authentic if it is likely to be enacted in the real world. Therefore,
in a test, authenticity may be present in the following ways: the test is as natural as possible;
items are contextualized; topics are meaningful for the learner; some thematic organization to
items is provided through a story line or episode; and tasks closely represent real-world tasks.
Fifth, washback refers to the effects tests have on instruction in terms of how
students prepare for the test. It can be regarded as a facet of consequential validity. Washback
also refers to the effects of an assessment on teaching and learning prior to the assessment
itself, such as preparation before tests. To enhance washback, the teacher should comment
generously and specifically on test performance, that is, give feedback. This is much better
than a single letter grade or numerical score on a test.
Here are some books that also discuss the principles of assessment:
Anderson, L. W. 2003. Classroom assessment. London: LEA Publisher
Earl, L. M. & Katz, S. 2006. Rethinking classroom assessment with purpose in mind:
Assessment for learning, assessment as learning, assessment of learning. Manitoba
Education
Russell, M. K. & Airasian, P. W. 2012. Classroom assessment: Concepts and applications. McGraw Hill
Stufflebeam, D. L. & Coryn, C. L. S. 2014. Evaluation theory, models, and applications.
Jossey-Bass
For more detail on the principles from those authors, a summary of each book is given below.
No  Name         Definition
1   Assessment   A process of collecting, synthesizing, and interpreting information in
                 order to make a decision. Depending on the decision being made and the
                 information a teacher needs in order to inform that decision, testing,
                 measurement, and evaluation often contribute to the process of assessment.
2   Testing      A formal, systematic procedure used to gather information about students'
                 achievement or other cognitive skills.
3   Measurement  A process of quantifying or assigning a number to a performance or trait.
                 An example is when a teacher scores a quiz or test.
4   Evaluation   A product of assessment that produces a decision about the value or worth
                 of a performance or activity based on information that has been collected,
                 synthesized, and reflected on.
a. STANDARDIZED ASSESSMENT
Administered, scored, and interpreted in the same way for students, regardless of where
and when they are assessed. The main reason for standardizing assessment procedures is to
ensure that the testing conditions and scoring procedures have a similar effect on the
performance of students in different schools and states.
b. NON-STANDARDIZED ASSESSMENT
Constructed for use in a single classroom with a single group of students. Its focus is
mainly on that one classroom. It is important to know that standardized tests are not necessarily
better than non-standardized ones. Standardization is only important when information from an
assessment instrument is to be used for the same purpose across many different classrooms and
locations.
(Anderson, 2003)
1. Anderson uses the term ‘quality information’ in referring to the principles.
2. The term ‘objectivity’ overlaps with Brown's definition of reliability.
3. The book mainly discusses issues in classroom assessment, so it is more practical than Brown's.
(Earl & Katz, 2006)
1. The same term, ‘quality’, used by Anderson is used here.
2. Earl and Katz add two principles: reference points and record-keeping.
(Russell & Airasian, 2012)
1. The book emphasizes not only classroom assessment but also the action that follows it:
decision making.
2. The book covers important matters related to classroom assessment: why, what, when and how
classroom assessment needs to be conducted.
3. The book discusses issues in classroom assessment, one of which is the ethics of assessment.
(Stufflebeam & Coryn, 2014)
1. Overall coherence
2. Tested hypotheses concerning how the evaluation procedures produce desired outcomes
3. Ethical requirements
4. A general framework for guiding program evaluation practice and conducting research on
program evaluation
E. Conclusion
In conclusion, all five books are well written and well conceptualized. Their explanations
of the principles can be understood by both beginning-level and advanced teachers. Overall, the
concept behind the principles of assessment is the same across the books: how to evaluate in an
appropriate way. Brown's book provides more of the theoretical basis for the principles of
assessment than the other books. On the other hand, the remaining books are useful for enriching
knowledge about the principles of assessment from various perspectives, which can help in
developing assessments appropriately. Each book has strengths and weaknesses; however, they fill
one another's gaps. It is recommended to read all of these books.
References
Anderson, L. W. (2003). Classroom Assessment: Enhancing the Quality of Teacher Decision
Making.
Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. Longman.
Earl, L., & Katz, S. (2006). Rethinking classroom assessment with purpose in mind:
Assessment for learning, assessment as learning, assessment of learning. Retrieved from
http://www.wncp.ca/media/40539/rethink.pdf
Russell, M. K., & Airasian, P. W. (2012). Classroom assessment: Concepts and applications.
Stufflebeam, D. L., & Coryn, C. L. S. (2014). Evaluation Theory, Models, and
Applications (2nd ed.). John Wiley & Sons.