
ASSESSMENT OF LEARNING

Teaching to the Test
“Superficial forms of assessment tend to lead to superficial forms of teaching and learning.”
-- Edutopia: Success Stories for Learning in the Digital Age
Why Assess?
• Provide diagnosis

• Set standards

• Evaluate progress

• Communicate results

• Motivate performance
Standardized Tests
• Are not prescriptive
• Give a capsulated view of a student’s learning
• Used in conjunction with performance-based assessment
Authentic Assessment
• Observation
• Teacher-made tests, quizzes, exams
• Written compositions

Authentic Assessment
• Oral presentations
• Projects, experiments, performance tasks
• Portfolios
• TEST – the instrument or systematic procedure. It answers the question: “How does an individual student perform?”
• TESTING – the method used to measure the level of achievement or performance of the students
• MEASUREMENT – the process of obtaining a NUMERICAL DESCRIPTION. It answers the question: “How much?” The score.
• EVALUATION – judging the performance through a descriptive rating (satisfactory, VS, O, or excellent)
TYPES OF MEASUREMENT
• NORM-REFERENCED TEST
– Comparison with other students using a score expressed as a PERCENTILE, GRADE-EQUIVALENT SCORE, or STANINE
– Purpose: to rank students with respect to the achievement of others and to discriminate between high and low achievers
• CRITERION-REFERENCED TEST
– Measures performance with respect to a particular criterion or standard
– The student’s score is expressed as a PERCENTAGE, and achievement is reported for individual skills
– Purpose: to know whether the student has achieved specific skills or concepts, and to find out how much students know before instruction begins and after it has finished
– Also called objective-referenced, domain-referenced, or universe-referenced
TYPES OF EVALUATION
• PLACEMENT – prerequisite skills, degree of mastery, and the best mode of learning
• DIAGNOSTIC – to determine the level of competence, identify students with previous knowledge of the lesson, discover the causes of learning problems, and formulate plans for remedial action
• FORMATIVE – to provide feedback, identify learning errors needing correction, and let the teacher modify instruction to improve learning and instruction
• SUMMATIVE – to determine whether objectives have been met, to assign grades, and to judge the effectiveness of instruction
MODES OF ASSESSMENT
• TRADITIONAL – multiple choice, fill-in-the-blanks, true or false, matching type
• PERFORMANCE – responses, performances, and products
• PORTFOLIO – a collection of student work; contains a purposefully selected subset of the student’s work
KEY TO EFFECTIVE TEACHING
• OBJECTIVES – aims of instruction
• INSTRUCTION – elements of the
curriculum designed to teach the subject
includes lesson plans, study guides and
assignments
• ASSESSMENT – testing components of
the subject
• EVALUATION – extent of understanding
of the lesson
INSTRUCTIONAL OBJECTIVES
• Guides for teaching and learning
• Intent of the instruction
• Guidelines for assessing learning
• Behavioral objectives clearly describe
anticipated learning outcomes
• Specific, measurable, attainable, realistic
and time bound
Bloom’s Taxonomy

A Focus on Higher-Level
Thinking Skills
Background
In 1956, Benjamin Bloom, a professor at the
University of Chicago, shared his famous "Taxonomy
of Educational Objectives."

Bloom identified six levels of cognitive complexity that
have been used over the past four decades to make
sure that instruction stimulates and develops students'
higher-order thinking skills.
Higher-Level Thinking Skills (highest to lowest)
Evaluation

Synthesis

Analysis

Application

Comprehension

Knowledge
Knowledge
Recall or recognition of information.

define, list, classify, name, describe, identify, locate, show, outline, give examples, recognize, distinguish opinion from fact, recall, match
Comprehension
The ability to understand, translate, paraphrase, interpret or extrapolate material. (Predict outcomes and effects.)

summarize, paraphrase, explain, differentiate, interpret, demonstrate, describe, visualize, compare, restate, convert, rewrite, distinguish, give examples, estimate
Application
The capacity to use information and transfer knowledge from one setting to another. (Use learned material in a new situation.)

solve, apply, illustrate, classify, calculate, modify, interpret, put into practice, manipulate, demonstrate, predict, compute, show, operate
Analysis
Identifying detail and having the ability to discover and differentiate the component parts of a situation or information.

analyze, contrast, organize, compare, deduce, distinguish, choose, categorize, diagram, outline, discriminate, relate
Synthesis
The ability to combine parts to create the big picture.

design, discuss, hypothesize, plan, support, compare, write, create, report, construct, combine, rearrange, compile, compose, develop, organize
Evaluation
The ability to judge the value or use of information using appropriate criteria. (Support judgments with reasons.)

evaluate, criticize, choose, justify, estimate, debate, judge, support your reasons, conclude, defend, assess, appraise, rate
KRATHWOHL’S AFFECTIVE
TAXONOMY
• Refers to a person’s awareness and internalization of objects and stimuli
• ANDERSON and KRATHWOHL revised Bloom’s original taxonomy by combining the cognitive process and knowledge dimensions, from lowest level to highest level
• Receiving – listens to ideas: identify, select, give
• Responding – answers questions about ideas: read, select, tell, write, assist, present
• Valuing – thinks about how to take advantage of ideas and is able to explain them well: explain, follow, initiate, justify, propose
• Organizing – commits to using ideas and incorporates them into activity: prepare, follow, explain, relate, synthesize, integrate, join, generalize
• Characterization – puts ideas into practice: solve, verify, propose, modify, practice, qualify
Illustrative Behavioral Terms for
Stating Specific Learning Outcomes
• RECEIVING: asks, chooses, describes, follows, gives, holds, identifies, locates, names, points to, selects, replies, uses
• RESPONDING: answers, assists, complies, conforms, discusses, greets, helps, labels, performs, practices, presents, reads, recites, reports, selects
• VALUING: completes, describes, differentiates, explains, follows, forms, initiates, invites, justifies, proposes, reads, reports, selects, shares, studies, works
• ORGANIZATION: alters, arranges, combines, compares, completes, defends, explains, generalizes, integrates, modifies, orders, organizes, prepares, relates, synthesizes
PSYCHOMOTOR DOMAIN
• OBSERVING – active mental attention to a physical event
• IMITATING – attempted copying of a physical behavior
• PRACTICING – trying a specific activity over and over
• ADAPTING – fine-tuning; making minor adjustments to the physical activity in order to perfect it
CRITERIA WHEN
CONSTRUCTING A GOOD TEST
• VALIDITY – measures what it is intended to measure
• RELIABILITY – consistency of the scores obtained when the test is repeated
• ADMINISTRABILITY – ease, clarity, and uniformity of time limits and instructions
• SCORABILITY – easy to score; the directions for scoring are clear and simple, and answer sheets are provided
• ECONOMY – the test should be given in the cheapest way and can be given from time to time
• ADEQUACY – wide sampling of items to represent the areas measured
• AUTHENTICITY – stimulating, real-life situations
Table of Specifications
• Determine the total number of items desired
• Determine the number of days taught for each lesson and the total
• Divide the number of days taught for each topic by the total number of days taught for all topics, then multiply by the total number of items
• Distribute the number of questions across all levels of the cognitive domain
• Identify the placement of each item number in the test
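The allocation step above can be sketched in Python. This is a minimal sketch; the topic names and day counts are hypothetical examples, not from the source.

```python
# Sketch of a Table of Specifications allocation:
# items per topic = (days taught for topic / total days taught) * total items.

def allocate_items(days_per_topic, total_items):
    """Distribute test items across topics in proportion to days taught."""
    total_days = sum(days_per_topic.values())
    return {topic: round(days / total_days * total_items)
            for topic, days in days_per_topic.items()}

# Hypothetical example: a 40-item test covering three topics.
tos = allocate_items({"Fractions": 5, "Decimals": 3, "Percent": 2}, 40)
```

Rounding each share can make the total drift by an item or two; in practice the last topic is often adjusted so the counts sum exactly to the desired total.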
ITEM ANALYSIS
• Analysis of students’ responses to each item in the test as desirable or undesirable
• A desirable item can be retained for subsequent use
• An undesirable item can be revised or rejected
Criteria of an Item

• Difficulty of an item
• Discriminating power of an item
• Effectiveness of an item
Steps of Item Analysis
• Arrange the scores from highest to lowest
• Select the 27% of the papers within the upper group and 27% from the lower group
• Set aside the remaining 46% of the papers; they will not be used
• Tabulate the number of students in the UG and the LG who selected each choice
• Compute the difficulty of each item
• Evaluate the effectiveness of the distracters
Difficulty Index (DF)
• Proportion of the number of students in the upper and lower groups who answered an item correctly

        UG + LG
DF = -------------
           N
Interpretation

Index of Difficulty   Item Evaluation
0.86 – 1.00           Very easy
0.61 – 0.85           Moderately easy
0.36 – 0.60           Moderately difficult
0.00 – 0.35           Very difficult
Index of Discrimination (DI)
• The difference between the proportion of high-performing students who got the item right and the proportion of low-performing students who got the item right

        RU – RL
DI = -------------
           N

• Positive discrimination – more students in the upper group got the item right
• Negative discrimination – more students in the lower group got the item right
• Zero discrimination – an equal number of students in both groups got the item right
Interpretation

DI            Item Evaluation
0.40 – up     Good item
0.30 – 0.39   Reasonably good, but subject to improvement
0.20 – 0.29   Marginal item, needs improvement
below 0.19    Poor item, needs to be rejected or revised
• Maximum Discrimination (DM) – the sum of the proportions of the upper and lower groups who answered the item correctly:
  DM = UG + LG
• Discrimination Efficiency (DE) – the index of discrimination divided by the maximum discrimination:
            DI
  DE = ---------
            DM
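A minimal sketch of the three discrimination statistics, under the reading that DI divides by the combined size of both groups while DM sums per-group proportions (the source's symbols leave some room for interpretation here):

```python
# RU and RL are the numbers of correct responses in the upper and lower
# groups; n_both is the combined size of both groups; n_group is the size
# of one 27% group.

def discrimination_index(ru, rl, n_both):
    """DI = (RU - RL) / N, with N the total papers in both groups."""
    return (ru - rl) / n_both

def maximum_discrimination(ru, rl, n_group):
    """DM = UG + LG, as proportions of each group answering correctly."""
    return ru / n_group + rl / n_group

# e.g. 26 of 28 upper and 10 of 28 lower students correct:
di = discrimination_index(26, 10, 56)      # 16/56
dm = maximum_discrimination(26, 10, 28)    # 26/28 + 10/28
de = di / dm                               # discrimination efficiency
```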
Distracter Effectiveness
• A good distracter attracts more students in the lower group than in the upper group
• A poor distracter attracts more students in the upper group
• This provides information for improving the item
Worked Example
• No. of examinees = 84
• 1/3 (about 27%) from the highest scores = 28
• 1/3 (about 27%) from the lowest scores = 28
• Item #4

Options    *a    b    c    d
UG – 28    26    2    0    0
LG – 28    10   17    1    0
*correct choice

Index of Difficulty = (UG + LG) / N
                    = 36 / 56
                    = 0.64 (moderately easy)

Index of Discrimination = (RU – RL) / N
                        = (26 – 10) / 56
                        = 0.29 (marginal item, needs improvement)

Option b functions effectively as a distracter because it attracts more
examinees from the lower group. Options c and d are poor distracters
because almost no one from either group is attracted to them.
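The worked example can be reproduced in a few lines of Python. The minimum-takers threshold used to flag near-unchosen options (like c and d) as poor distracters is my own assumption, not a rule from the source.

```python
# Per-option counts for Item #4, from the worked example above.
upper = {"a": 26, "b": 2, "c": 0, "d": 0}   # upper-group option counts
lower = {"a": 10, "b": 17, "c": 1, "d": 0}  # lower-group option counts
correct = "a"

n = sum(upper.values()) + sum(lower.values())    # 56 papers analyzed
df = (upper[correct] + lower[correct]) / n       # difficulty: 36/56
di = (upper[correct] - lower[correct]) / n       # discrimination: 16/56

# A good distracter draws more lower-group than upper-group examinees;
# requiring at least 2 total takers (an assumed threshold) screens out
# options that attract almost no one.
good_distracters = [opt for opt in upper if opt != correct
                    and lower[opt] > upper[opt]
                    and lower[opt] + upper[opt] >= 2]
```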
VALIDITY – measures what it is
supposed to measure
• CONTENT VALIDITY – coverage of content and objectives
• CRITERION-RELATED VALIDITY – test scores relate to other test instruments
• CONSTRUCT VALIDITY – the test measures an unobservable variable (a construct)
• PREDICTIVE VALIDITY – the test result can be used to predict what a person’s score will be at a later time
FACTORS AFFECTING VALIDITY
• Poorly constructed test items
• Unclear directions
• Ambiguous items
• Reading vocabulary too difficult
• Complicated syntax
• Inadequate time limit
• Inappropriate level of difficulty
• Unintended clues
• Improper arrangement of items
RELIABILITY – consistency of
measurement
• FACTORS AFFECTING RELIABILITY:
– Length of test
– Moderate item difficulty
– Objective scoring
– Heterogeneity of the student group
– Limited time
DESCRIPTIVE STATISTICS
• MEASURES OF CENTRAL TENDENCY –
AVERAGES
• MEASURES OF VARIABILITY – SPREAD
OF SCORES
• MEASURES OF RELATIONSHIP -
CORRELATION
MEASURE OF CENTRAL
TENDENCY – MEAN
• Easy to compute
• Each data point contributes to the mean value
• Easily affected by extreme values
• Applied to interval data

            ∑x
• Mean = -------
             n
MEDIAN
• The point that divides the scores in a distribution into 2 equal parts when the scores are arranged from highest to lowest
• If the number of scores is ODD, the median is the MIDDLE SCORE
• If the number of scores is EVEN, the median is the average of the 2 middlemost scores
MODE
• Refers to the score(s) that occur most often in the distribution
• Unimodal if the distribution contains only 1 mode
• Bimodal if the distribution contains 2 modes
• Multimodal if the distribution contains more than 2 modes
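The three averages can be sketched with Python's standard statistics module; the score list is a hypothetical example.

```python
# Mean, median, and mode of a small, odd-length score list.
import statistics

scores = [70, 75, 80, 80, 85, 90, 95]     # 7 scores, already arranged

mean = statistics.mean(scores)            # sum of scores / n
median = statistics.median(scores)        # middle score when n is odd
mode = statistics.mode(scores)            # the score occurring most often
```

With an even number of scores, `statistics.median` averages the two middlemost values, matching the rule above.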
MEASURES OF VARIABILITY
• A single value used to describe the spread of the scores in a distribution, above or below the measure of central tendency
• Range
• Quartile deviation
• Standard deviation
RANGE
• Simplest and crudest measure
• A rough measure of variation
• The smaller the value, the closer the scores are to each other
• The larger the value, the more scattered the scores are
• The value fluctuates easily
• R = HV – LV
QUARTILE DEVIATION
• Half of the difference between the third quartile (Q3) and the first quartile (Q1)
• The value of the QD indicates the distance we need to go above or below the median to include approximately the middle 50% of the scores

         Q3 – Q1
• QD = ------------
             2
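Range and quartile deviation can be sketched as follows. Note that quartile conventions vary between textbooks; `statistics.quantiles` uses the exclusive method by default, which may differ slightly from a hand computation.

```python
# Range and quartile deviation of a hypothetical score list.
import statistics

scores = [60, 65, 70, 75, 80, 85, 90, 95]

r = max(scores) - min(scores)                    # R = HV - LV
q1, _, q3 = statistics.quantiles(scores, n=4)    # first and third quartiles
qd = (q3 - q1) / 2                               # half the interquartile range
```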
STANDARD DEVIATION
• Example: if the standard deviation for Math is 10.20 and for Science is 10.10, the MATH scores have greater variability than the SCIENCE scores; that is, the scores in MATH are more scattered than in SCIENCE
• LARGE SD value = scores are FAR from the mean
• SMALL SD value = scores are CLOSE to the mean
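The comparison of spread can be sketched with two hypothetical score lists (the numbers are illustrative, not the source's):

```python
# Comparing spread with the population standard deviation.
import statistics

math_scores = [55, 70, 85, 100]     # widely scattered around the mean
science_scores = [75, 80, 85, 90]   # clustered near the mean

sd_math = statistics.pstdev(math_scores)
sd_science = statistics.pstdev(science_scores)

# The larger SD identifies the more scattered set of scores.
more_scattered = "Math" if sd_math > sd_science else "Science"
```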
PERCENTILE RANK
• The percentage of scores in the frequency distribution which are lower; that is, the percentage of examinees in the norm group who scored below the score of interest
• Used to clarify the interpretation of scores on standardized tests
• Example: a score of 66 at the 90th percentile means 90% of the examinees got a score lower than 66
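A minimal sketch of the percentile rank as defined above (percentage of scores strictly below the score of interest); the score list is hypothetical.

```python
# Percentile rank: percentage of scores in the group below a given score.

def percentile_rank(scores, score):
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

# Hypothetical norm group of 10 examinees; 9 scored below 66.
scores = [40, 45, 50, 55, 58, 60, 62, 64, 65, 66]
```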
Z SCORE – STANDARD SCORE
• Measures HOW MANY SDs an observation is ABOVE or BELOW the MEAN
• A +z score measures the number of SDs a score is ABOVE the MEAN
• A –z score measures the number of SDs a score is BELOW the MEAN
• Used to locate the student’s score at the base of the normal curve
Z score Formula

        x – mean
• z = ------------
           SD
T-score
• T-score = 10z + 50
• A higher value indicates better performance on the test
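The two standard scores can be sketched together; the score, mean, and SD below are hypothetical values.

```python
# z = (x - mean) / SD: number of SDs a score lies above (+) or below (-)
# the mean. T rescales z to a distribution with mean 50 and SD 10.

def z_score(x, mean, sd):
    return (x - mean) / sd

def t_score(z):
    return 10 * z + 50

# Hypothetical: a score of 85 in a test with mean 75 and SD 5.
z = z_score(85, mean=75, sd=5)   # 2 SDs above the mean
t = t_score(z)                   # 10*2 + 50
```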
COEFFICIENT OF VARIATION (CV)
• The LOWER the value of the coefficient of variation, the MORE the overall data approximate the MEAN, and the MORE HOMOGENEOUS the PERFORMANCE OF THE GROUP
SKEWNESS
• Describes the degree of departure of the data from symmetry
• The degree of skewness is measured by the coefficient of skewness, denoted SK:

        3(Mean – Median)
SK = ---------------------
               SD
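The coefficient of skewness above (Pearson's second skewness coefficient) can be sketched directly; the score list is a hypothetical example where a few very high scores pull the mean above the median.

```python
# SK = 3(mean - median) / SD. Positive SK: most scores pile up at the
# low end; negative SK: most scores pile up at the high end.
import statistics

def sk(scores):
    return (3 * (statistics.mean(scores) - statistics.median(scores))
            / statistics.pstdev(scores))

low_heavy = [10, 12, 14, 15, 16, 40]   # one outlier drags the mean upward
```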
NORMAL CURVE

[Figure: the normal curve, divided into classification bands from below average, through average, to above average.]
POSITIVELY SKEWED
• The curve is skewed to the RIGHT: it has a LONG tail extending off to the right and a short tail to the left
• When the computed value of SK is positive, most of the students’ scores are VERY LOW; they performed poorly on the exam
NEGATIVELY SKEWED
• The distribution is skewed to the left: it has a long tail extending to the left and a short tail to the right
• When the computed value of SK is negative, most of the students got very high scores; they performed WELL on the exam
COEFFICIENT OF CORRELATION