An effective test is practical. This means that it:

is not excessively expensive,
stays within appropriate time constraints,
is relatively easy to administer, and
has a scoring/evaluation procedure that is specific and time-efficient.


Reliability means the degree to which an assessment tool
produces stable and consistent results.
 A reliable test is consistent and dependable.
 A test is reliable if: “You give the same test to the same student or matched students
on two different occasions, the test should yield similar results.” (Brown, 2004)
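
As an illustration only (the source does not prescribe a statistic), consistency across two occasions is often summarized as a correlation between the two sets of scores. The sketch below assumes a Pearson correlation; the function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equal-length lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_a = sum(first) / len(first)
    mean_b = sum(second) / len(second)
    # Co-variation of the two administrations around their means.
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    var_a = sum((a - mean_a) ** 2 for a in first)
    var_b = sum((b - mean_b) ** 2 for b in second)
    if var_a == 0 or var_b == 0:
        raise ValueError("scores must vary on both occasions")
    return cov / sqrt(var_a * var_b)


# Hypothetical scores for the same five students on two occasions.
occasion_1 = [72, 85, 90, 64, 78]
occasion_2 = [70, 88, 91, 60, 80]

test_retest_r = pearson_r(occasion_1, occasion_2)
print(f"test-retest r = {test_retest_r:.2f}")
```

A value close to 1 suggests that the two administrations rank students similarly; what counts as "similar enough" is a judgment the source leaves open.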

Student-Related Reliability

The most common learner-related issue in reliability is caused by temporary illness,
fatigue, and other physical or psychological factors.


 Inter-rater reliability: It is compromised when two or more scorers yield inconsistent
scores on the same test. Factors: lack of attention to scoring criteria, inexperience,
inattention, etc.
 Intra-rater reliability: A single scorer's consistency can be undermined by unclear
scoring criteria, fatigue, bias toward particular “good” and “bad” students, or simple
carelessness.
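
As a hedged sketch, agreement between two scorers can be quantified with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The source does not name this statistic; the function `cohen_kappa` and the band labels below are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical scores."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length lists of ratings")
    n = len(rater_a)
    # Proportion of scripts on which the two raters gave the same band.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected by chance, from each rater's band frequencies.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical band throughout
    return (observed - expected) / (1 - expected)


# Hypothetical bands given by two scorers to the same six scripts.
scorer_1 = ["A", "B", "B", "C", "A", "B"]
scorer_2 = ["A", "B", "C", "C", "A", "A"]

kappa = cohen_kappa(scorer_1, scorer_2)
print(f"kappa = {kappa:.2f}")
```

The same function could be applied to one scorer's marks on two occasions as a rough check of intra-rater consistency.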

Test Administration Reliability

 This involves the conditions under which the test is administered.

 Unreliability occurs due to outside interference like noise, variations in photocopying,
temperature, the amount of light in various parts of the room, and even the condition
of desks and chairs.

 Brown (2010) stated that he once witnessed the administration of a test of aural
comprehension in which an audio player was used to deliver items for comprehension,
but due to street noise outside the building, test-takers sitting next to open windows
could not hear the stimuli clearly.

Test Reliability

Factors that cause unreliability:

 If a test is too long, test takers may become fatigued by the time they reach the later
items and hastily respond incorrectly.


 Validity is the most complex and important criterion of an effective test.

 Validity is the extent to which the assessment results are appropriate, meaningful, and
useful in terms of the purpose of the assessment.
 How is the validity of a test established?
1. Content-related Evidence: If you can clearly define the achievement that you are
measuring, there is validity. A direct test involves the test-takers in actually performing
the target task. An indirect test involves a task that is related to the target task rather
than the target task itself.

2. Criterion-related Evidence: It indicates the extent to which the specific classroom
objectives being measured have been reached. Two categories:
i. Concurrent validity: The results are supported by other performance beyond the
assessment itself.
ii. Predictive validity: The results are used to predict future performance. It is
important in the case of placement tests, language aptitude tests, etc.

3. Construct-related Evidence: A construct is any theory, hypothesis, or model that
attempts to explain observed phenomena in our universe of perceptions.

4. Consequential Validity: It includes all the consequences of the test, including accuracy
in measuring criteria, its impact on the preparation of test-takers, its effect on the
learner, its interpretation, etc.

5. Face Validity: It cannot be empirically tested by a teacher or even by a testing expert
because it is based on the subjective judgment of the examinees who take it.


 Authenticity in a test is present when a task is likely to be enacted or represented in
the “real world.”

 In a test, authenticity may be present in the following ways:

The language in a test is as natural as possible.

Items are contextualized, not in isolation.

Meaningful topics for the learner.

Organization.

Real-world tasks.


 Washback refers to the effects tests have on instruction in terms of how students
prepare for the test. It is also about how Ss can identify their strengths and weaknesses.
Teachers can suggest strategies for helping Ss as part of the guiding process. Feedback is
very important for Ss' improvement.