
PROFESSIONAL READINESS FOR THE

LICENSURE EXAMINATION FOR TEACHERS (LET) 2011

ASSESSMENT OF LEARNING
(Topic)

DR. MARILYN UBINA-BALAGTAS


PROF. MARILOU M. UBINA
(Writers-Facilitators)

LET COMPETENCIES TARGETED


1. Apply principles in constructing and interpreting traditional and alternative forms of
assessment.
2. Utilize processed data and results in reporting and interpreting learners’
performance to improve teaching and learning.
3. Demonstrate skills in the use of techniques and tools in assessing affective
learning.

I. CONTENT UPDATE

BASIC CONCEPTS

Learning/Assessment Objectives (desired traits/targets)
    Traditional (pen-and-paper test)  |  Alternative (performance/portfolio/affective)

Measurement
(quantification of the traits)

Assessment
(gathering and organization of data)

Evaluation
(interpreting and judging the data and making decisions out of them)

PRINCIPLES OF HIGH QUALITY CLASSROOM ASSESSMENT

1. Assessment should be based on clear and appropriate learning targets.


2. Assessment should be based on appropriate methods.
3. Assessment should be balanced.
4. Assessment should be valid.
5. Assessment should be reliable.
6. Assessment should be fair.
7. Assessment should be continuous.

This material should not be reproduced or used by other lecturers in other LET Review Centers/Institutions
without the permission of the writer of this material.
PROFESSIONAL EDUCATION/ Assessment and Evaluation of Learning 2
Dr. Marilyn U. Balagtas/Prof. Marilou M.Ubiña

8. Assessment should be authentic.


9. Assessment should be practical and efficient.
10. Assessment targets and standards should be made public.
11. Assessment should have positive consequences.
12. Assessment should be ethical.

PURPOSES OF CLASSROOM ASSESSMENT

1. Assessment FOR Learning – this includes three types of assessment done
before and during instruction.
a. Placement – done prior to instruction
• Its purpose is to assess the needs of the learners to have basis in
planning for a relevant instruction.
• Teachers use this assessment to know what their students are
bringing into the learning situation and use this as a starting point for
instruction.
• The results of this assessment place students in specific learning
groups to facilitate teaching and learning.

b. Formative – done during instruction


• In this assessment, teachers continuously monitor the students’ level of
attainment of the learning objectives (Stiggins, 2005).
• The results of this assessment are communicated clearly and promptly to
the students for them to know their strengths and weaknesses and the
progress of their learning.

c. Diagnostic – done during instruction


• This is used to determine students’ recurring or persistent difficulties.
• It helps formulate a plan for detailed remedial instruction.

2. Assessment OF Learning – this is done after instruction. It is usually
referred to as summative assessment.
• It is used to certify what students know and can do and the level of their
proficiency or competency.
• It is used to grade the students, and its results are communicated to
the students, parents, and other stakeholders for decision making.
• It is also a powerful factor that could pave the way for educational
reforms.

Learning/Assessment Targets (McMillan, 2007; Stiggins, 2007)

Target                  Description
Knowledge               student mastery of substantive subject matter
Reasoning               student ability to use knowledge to reason and solve problems
Skills                  student ability to demonstrate achievement-related skills
Products                student ability to create achievement-related products
Affective/Disposition   student attainment of affective states such as attitudes,
                        values, interests, and self-efficacy


Learning Targets and their Appropriate Assessment Methods

             Assessment Methods
Targets      Objective   Essay   Performance-   Oral       Observation   Self-
                                 Based          Question                 Report
Knowledge        5          4         3             4            3          2
Reasoning        2          5         4             4            2          2
Skills           1          3         5             2            5          3
Products         1          1         5             2            4          4
Affect           1          2         4             4            4          5

Note: Higher numbers indicate better matches (e.g., 5 = high, 1 = low).

Different Methods of Assessment

Objective Supply – short answer, completion test
Objective Selection – multiple choice, matching type, true/false
Essay – restricted response, extended response
Performance-Based – presentations, papers, projects, athletics, demonstrations, exhibitions, portfolios
Oral Question – oral examinations, conferences, interviews
Observation – informal, formal
Self-Report – attitude survey, sociometric devices, questionnaires, inventories

Modes of Assessment

Mode: Traditional
Description: The paper-and-pen test used in assessing knowledge and thinking skills
Examples: Standardized and teacher-made tests
Advantages: Scoring is objective; administration is easy because students can take the test at the same time
Disadvantages: Preparation of the instrument is time consuming; prone to guessing and cheating

Mode: Performance
Description: A mode of assessment that requires actual demonstration of skills or creation of products of learning
Examples: Practical tests; oral and aural tests; projects, etc.
Advantages: Preparation of the instrument is relatively easy; measures behavior that cannot be faked
Disadvantages: Scoring tends to be subjective without rubrics; administration is time consuming

Mode: Portfolio
Description: A process of gathering multiple indicators of student progress to support course goals in a dynamic, ongoing, and collaborative process
Examples: Working portfolios; show portfolios; documentary portfolios
Advantages: Measures students’ growth and development; intelligence-fair
Disadvantages: Development is time consuming; rating tends to be subjective without rubrics


Types of Test

Educational Psychological
(Measures results of instruction) (measures traits not attributed to
instruction alone)

Standardized Teacher-Made
(experts-made) (teacher-made)

Verbal Non-verbal
(words) (symbols and numbers)

Individual Group
(one at a time) (many at one time)

Norm-referenced Criterion-Referenced
(one vs others) (one vs criterion)

Selective Supply
(with choices) (no choices)

Power Speed
(items in increasing difficulty with no time limit) (items of the same difficulty taken with a time limit)

Objective Subjective
(yield consistent results) (may yield different results)

Types of Tests According to Format

1. Selective Test - provides choices for the answer.


a. Multiple choice - consists of a stem which describes the problem and 3 or
more alternatives which give the suggested solutions. One of the
alternatives is the correct answer while the other alternatives are the
distractors.
b. True-False or Alternative Response - consists of declarative statement
that one has to mark true or false, right or wrong, correct or incorrect,
yes or no, fact or opinion, and the like.
c. Matching Type - consists of two parallel columns: Column A, the column
of premises from which a match is sought; Column B, the column of
responses from which the selection is made.


2. Supply Test
a. Short Answer - uses a direct question that can be answered by a word, a
phrase, a number, or a symbol.
b. Completion Test - consists of an incomplete statement
3. Essay Test and the Scoring Rubrics
a. Restricted Response - limits the content of the response by restricting the
scope of the topic
b. Extended Response - allows the students to select any factual
information that they think is pertinent and to organize their answers in
accordance with their best judgment.

General Suggestions in Writing Test


1. Use test specifications as a guide to item writing.
2. Construct more test items than needed, to have extra items when
deciding which items have to be discarded or revised.
3. Have a test of sufficient length to adequately measure the target
performance (Note: the longer the test, the more reliable it is).
4. Write the test items well in advance of the testing date to allow
time for checking face and content validity.
5. Write the test items with reference to the test objectives.
6. Write each test item at the appropriate reading level.
7. Write each test item so that it does not become a clue to
other test items.
8. Write test items whose answers would be agreed upon by
the experts.
9. Write test items at the proper level of difficulty (difficulty indices
range from 0 to 1: 0.2–0.8 are average items, 0–0.19 are difficult
items, 0.81–1.0 are easy items).
10. Have items that can discriminate the bright from the poor pupils
(items with a discrimination index of 0.3 to 1.0 have good
discriminatory power).
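The difficulty and discrimination indices in suggestions 9 and 10 can be computed directly from item responses. A minimal sketch in Python; the upper/lower 27% grouping is one common convention for the discrimination index, not something this material prescribes:

```python
def difficulty_index(item_scores):
    """Proportion of examinees answering the item correctly (0 to 1)."""
    return sum(item_scores) / len(item_scores)

def discrimination_index(item_scores, total_scores, group_frac=0.27):
    """Upper-group minus lower-group difficulty (27% groups by total score)."""
    n = max(1, round(len(total_scores) * group_frac))
    # Rank examinees by total test score, highest first
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(item_scores[i] for i in upper) / n
    p_lower = sum(item_scores[i] for i in lower) / n
    return p_upper - p_lower
```

An item answered correctly by half the class has difficulty 0.5, squarely in the 0.2–0.8 "average" band; an item that only the top scorers get right approaches a discrimination index of 1.0.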

Specific Suggestions
A. Multiple Choice
Have:
• a clear problem
• stems that are meaningful
• a negatively stated stem only when significant learning outcomes require it, with the negative word highlighted
• plausible distractors
• alternatives that are grammatically parallel to the stem
• only one correct and clearly best answer
• choices that are arranged alphabetically, or according to value or length
• stems and options that are on the same page

Avoid:
• double negatives in the stem
• irrelevant information in the stem
• having patterns in the answers
• verbal clues in the stem and the correct answer
• alternatives like “all of the above”, especially when it is the correct answer
• alternatives like “none of the above” when there are many possible distractors to the correct answer
• answers that are relatively longer than the other alternatives
• using multiple choice when there are better test formats for the test objectives

B. Alternative-Response Test

Have:
• meaningful items
• simple sentences
• only one correct and clearly best answer
• an equal or approximately equal number of items for each choice as the correct answer

Avoid:
• trivial statements
• long sentences, unless cause-and-effect relationships are being measured
• use of obviously negative words or double negatives in an item
• two ideas in one statement, unless cause-and-effect relationships are being measured
• opinionated ideas, unless you acknowledge the source or the ability to identify opinion is being specifically measured

C. Matching Type
Have:
• an unequal number of responses and premises, and instruct the pupils that responses may be used once, more than once, or not at all
• lists of items to be matched that are brief
• the shorter responses at the right
• responses arranged in logical order
• directions indicating the basis for matching the responses and premises
• a maximum of 15 items per match

Avoid:
• clues or patterns for the correct answer
• different or heterogeneous items in a single exercise
• redundant items
• breaking the whole match across two pages

D. Supply Objective Test


Have:
• items that require a brief and specific answer or unit
• direct questions, which are generally more desirable than incomplete statements
• blanks for answers that are equal in length
• the answers written before the item number for easy checking

Avoid:
• clues or patterns for the correct answer
• statements taken directly from textbooks as a basis for short-answer items
• too many blanks in a single item
• blanks at the beginning of the sentence


E. Essay Test
Have:
• items that target high-level thinking skills
• questions that clearly specify the behavior of the learning outcome
• items that all students can fairly answer regardless of their religion, gender, or social status
• a rubric for scoring the work, given to the students as a guide in answering the question

Avoid:
• items that simply require recall of facts
• items that are taken directly from textbooks
• optional questions that vary in levels of difficulty

Criteria to Consider when Constructing Good Test Items

A. Validity - the degree to which the test measures what it intends to
measure. It is the usefulness of the test for a given purpose and the
most important criterion of a good examination. A validity coefficient
should be at least 0.5, but preferably higher.

Factors Influencing the Validity of the Tests in General


1. Appropriateness of Test - it should measure the abilities, skills, and
information it is supposed to measure.
2. Directions - it should indicate how the learners should answer and
record their answers.
3. Reading Vocabulary and Sentence Structure - it should suit the
learners’ level of intellectual maturity and background experience.
4. Difficulty of Items - it should have items that are neither too difficult
nor too easy, so it can discriminate the bright from the slow pupils.
The acceptable index of difficulty is 0.2–0.8; higher than 0.8 means
too easy, lower than 0.2 means too difficult. The acceptable index of
discrimination is 0.3–1.0; lower than 0.3 means poor discriminatory
power.
5. Construction of Test Items - it should not provide clues (so it does not
become a test of clue-finding) nor be ambiguous (so it does not
become a test of interpretation).
6. Length of the Test - it should be of sufficient length to measure what
it is supposed to measure; a test that is too short cannot adequately
measure the target performance.
7. Arrangement of Items - items should be arranged in ascending level
of difficulty, starting with the easy ones so that pupils are encouraged
to pursue the test.
8. Patterns of Answers - it should not allow the creation of patterns in
answering the test.


Ways of Establishing Validity


1. Face Validity - established by examining the physical appearance of
the test.
2. Content Validity - established through a careful and critical
examination of the objectives of the test so that it reflects the
curricular objectives.
3. Criterion-related Validity - established statistically by correlating a
set of scores revealed by the test with the scores obtained on
another external predictor or measure. It has two types:
a. Concurrent validity - describes the present status of the
individual by correlating the sets of scores obtained from two
measures given concurrently.
b. Predictive validity - describes the future performance of an
individual by correlating the sets of scores obtained from two
measures given at a longer time interval.
4. Construct Validity - established statistically by comparing
psychological traits or factors that theoretically influence scores on a
test.
a. Convergent Validity - established if the instrument measures a
similar trait, e.g., a Critical Thinking Test being developed may
be correlated with a Standardized Critical Thinking Test.
b. Divergent Validity - established if the instrument describes only
the intended trait and not other traits, e.g., a Critical Thinking
Test should not correlate highly with a Reading Comprehension Test.
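Criterion-related and construct validity both rest on correlating two sets of scores. The Pearson r they rely on can be sketched as follows; this is a minimal illustration (the score lists are hypothetical), and real validation studies would use a statistics package:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical concurrent-validity check: scores on a new test against an
# established measure given at the same time.
new_test = [78, 85, 62, 90, 70]
criterion = [75, 88, 60, 94, 72]
validity_coefficient = pearson_r(new_test, criterion)
```

A coefficient at or above the 0.5 floor cited earlier would be acceptable; the toy data here correlate far more strongly than that.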
B. Reliability - refers to the consistency of scores obtained by the same
person when retested using the same instrument or one that is parallel to
it. A reliability coefficient should be at least 0.7, but preferably higher.
Factors Affecting Reliability
1. Length of the Test - as a general rule, the longer the test, the
higher the reliability. A longer test provides a more adequate
sample of the behavior being measured and is less distorted by
chance factors like guessing.
2. Difficulty of the Test - ideally, achievement tests should be
constructed such that the average score is 50 percent correct and
the scores range from near zero to near perfect. The bigger the
spread of the scores, the more reliable the measured difference is
likely to be. A test is reliable if the coefficient of correlation is not
less than 0.85.
3. Objectivity - can be obtained by eliminating the bias, opinions or
judgments of the person who checks the test.

Method | Type of Reliability Measure | Procedure | Statistical Measure

1. Test-Retest (measure of stability): Give a test twice to the same group,
with any time interval between tests from several minutes to several
years. Statistical measure: Pearson r.
2. Equivalent Forms (measure of equivalence): Give parallel forms of the
test with close time intervals between forms. Statistical measure:
Pearson r.
3. Test-Retest with Equivalent Forms (measure of stability and
equivalence): Give parallel forms of the test with increased time
intervals between forms. Statistical measure: Pearson r.
4. Split-Half (measure of internal consistency): Give the test once; score
equivalent halves of the test, e.g., odd- and even-numbered items.
Statistical measures: Pearson r and the Spearman-Brown formula.
5. Kuder-Richardson (measure of internal consistency): Give the test once,
then correlate the proportion/percentage of students passing and not
passing a given item. Statistical measures: Kuder-Richardson
Formulas 20 and 21.
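Methods 4 and 5 above reduce to simple formulas. A sketch of the Spearman-Brown step-up and KR-20, assuming dichotomously scored (0/1) items and using population variance for the total scores (the item matrix is hypothetical):

```python
def spearman_brown(r_half):
    """Step a half-test correlation up to an estimated full-test reliability."""
    return 2 * r_half / (1 + r_half)

def kr20(item_matrix):
    """Kuder-Richardson Formula 20 for dichotomously scored (0/1) items.
    item_matrix: one row per examinee, one column per item."""
    n = len(item_matrix)                       # number of examinees
    k = len(item_matrix[0])                    # number of items
    # Proportion passing each item, and the summed item variances p*q
    p = [sum(row[j] for row in item_matrix) / n for j in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Population variance of the total scores
    totals = [sum(row) for row in item_matrix]
    mean = sum(totals) / n
    var = sum((t - mean) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - sum_pq / var)
```

For example, a split-half correlation of 0.6 steps up to 2(0.6)/1.6 = 0.75, just above the 0.7 floor stated earlier; this also illustrates the point that a longer test tends to be more reliable.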

C. Administrability - the test should be administered with ease, clarity, and
uniformity so that the scores obtained are comparable. Uniformity can be
obtained by setting time limits and standard oral instructions.
D. Scorability - the test should be easy to score such that directions for
scoring are clear, the scoring key is simple; provisions for answer sheets
are made.
E. Economy - the test should be given in the cheapest way possible; answer
sheets should be provided so that the test can be reused from time to
time.
F. Adequacy - the test should contain a wide sampling of items to determine
the educational outcomes or abilities, so that the resulting scores are
representative of the total performance in the areas measured.

PERFORMANCE-BASED ASSESSMENT
Performance-Based Assessment is a process of gathering information
about students’ learning through actual demonstration of essential and observable
skills and the creation of products that are grounded in real-world contexts and
constraints.

Reasons for Using Performance-Based Assessment


• Dissatisfaction with the limited information obtained from selected-response
tests.
• Influence of cognitive psychology, which demands the learning not only of
declarative but also of procedural knowledge.
• Negative impact of conventional tests, e.g., high-stakes assessment,
teaching to the test.
• Appropriateness to experiential, discovery-based, integrated, and problem-
based learning approaches.

Methods of Performance-based Assessment


1. Written open-ended – a written prompt is provided
2. Behavior-based – utilizes direct observations of behaviors in actual
situations or simulated contexts
3. Interview-based – examinees respond in a one-to-one conference setting
with the examiner to demonstrate mastery of the skills
4. Product-based – examinees create a work sample or a product utilizing the
target skills/abilities
5. Portfolio-based – collections of works that are systematically gathered to
serve many purposes


PORTFOLIO ASSESSMENT

Portfolio Assessment is another alternative to the pen-and-paper objective test.
It is a purposeful, ongoing, dynamic, and collaborative process of gathering
multiple indicators of the learner’s growth and development. Portfolio assessment
is also performance-based, but more authentic than any performance-based task.

Reasons for Using Portfolio Assessment

Burke (1999) recognizes the portfolio as another type of assessment,
considered authentic for the following reasons:

• It tests what is really happening in the classroom.


• It offers multiple indicators of students’ progress.
• It gives the students responsibility for their own learning.
• It offers opportunities for students to document reflections on their learning.
• It demonstrates what the students know in ways that encompass their
personal learning styles and multiple intelligences.
• It offers teachers a new role in the assessment process.
• It allows teachers to reflect on the effectiveness of their instruction.
• It gives teachers the freedom to gain insights into the student’s
development or achievement over a period of time.

Principles Underlying Portfolio Assessment

There are three underlying principles of portfolio assessment: content,


learning, and equity principles.
1. Content principle suggests that portfolios should reflect the subject matter
that is important for the students to learn.
2. Learning principle suggests that portfolios should enable the students to
become active and thoughtful learners.
3. Equity principle explains that portfolios should allow students to demonstrate
their learning styles and multiple intelligences.

Types of Portfolios
Portfolios could come in three types: working, show, or documentary.
1. The working portfolio is a collection of a student’s day-to-day works which
reflect his/her learning.
2. The show portfolio is a collection of a student’s best works.
3. The documentary portfolio is a combination of a working and a show
portfolio.

Portfolio Process
1. Set Goals
2. Collect Evidence
3. Select Evidence
4. Organize Evidence
5. Reflect on Evidence
6. Evaluate Evidence
7. Confer with the Student
8. Exhibit Portfolio


DEVELOPING RUBRICS
A rubric is a measuring instrument used in rating performance-based tasks. It
is the “key to corrections” for assessment tasks designed to measure the attainment
of learning competencies that require demonstration of skills or creation of products
of learning. It offers a set of guidelines or descriptions for scoring different levels of
performance or qualities of products of learning. It can be used in scoring both the
process and the products of learning.

Types of Rubrics

1. Holistic Rubric - describes the overall quality of a performance or product. In
this rubric, only one rating is given to the entire work or performance.
2. Analytic Rubric - describes the quality of a performance or product in terms of
identified dimensions and/or criteria, which are rated independently to give a
better picture of the quality of the work or performance.

Important Elements of a Rubric


Whether the format is holistic or analytic, the following information should be
made available in a rubric.
• Competency to be tested – this should be a behavior that requires
either a demonstration or the creation of products of learning
• Performance Task – the task should be authentic, feasible, and have
multiple foci
• Evaluative Criteria and their Indicators – these should be made clear
using observable traits
• Performance Levels – these levels could vary in number from 3 or more
• Qualitative and Quantitative Descriptions of each performance level
– these descriptions should be observable to be measurable

Guidelines When Developing Rubrics

• Identify the important and observable features or criteria of an excellent
performance or quality product.
• Clarify the meaning of each trait or criterion and the performance levels.
• Describe the gradations of a quality product or excellent performance.
• Aim for an even number of levels to avoid the central-tendency source of
error.
• Keep the number of criteria small enough to be observed or judged.
• Arrange the criteria in the order in which they are likely to be observed.
• Determine the weight/points of each criterion and of the whole work or
performance in the final grade.
• Put the descriptions of a criterion or a performance level on the same page.
• Highlight the distinguishing traits of each performance level.
• Check if the rubric encompasses all possible traits of a work.
• Check again if the objectives of assessment are captured in the rubric.
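The weighting step above can be made concrete: an analytic rubric's criterion ratings are multiplied by their weights and converted into a final score. A minimal sketch; the criterion names, weights, and the 4-level scale are all hypothetical:

```python
def rubric_percentage(ratings, weights, max_level=4):
    """Weighted percentage score from analytic-rubric ratings.
    ratings and weights are dicts keyed by criterion name."""
    total_weight = sum(weights.values())
    weighted = sum(ratings[c] * weights[c] for c in ratings)
    return 100 * weighted / (max_level * total_weight)

# Content counts double; all criteria are rated on a 4-level scale.
score = rubric_percentage(
    {"content": 4, "organization": 3, "mechanics": 2},
    {"content": 2, "organization": 1, "mechanics": 1},
)
# score == 81.25
```

Publishing the weights alongside the level descriptions keeps the scoring transparent to students, in line with the principle that assessment targets and standards should be made public.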

AFFECTIVE ASSESSMENT

Affective Assessment – the process of gathering information about the
outcomes of education that involve dispositions or personal feelings, such as
attitudes, a sense of academic self-confidence, or an interest in something that
motivationally predisposes a person to act or not to act. It also involves an
individual’s choice of whether to finish a task and how he/she would like to do it.


Affective/Disposition Targets (Anderson & Bourke, 2000, cited in Stiggins, 2001)

Attitudes – a learned predisposition to respond in a consistently favorable or
unfavorable manner with respect to a given object.

School-related values
• Values are beliefs about what should be desired, what is important or
cherished, and what standards of conduct are acceptable.
• Values influence or guide behavior, interests, attitudes, and satisfactions.
• Values are enduring; they tend to remain stable over fairly long periods
of time.
• Values are learned, tend to be of high intensity, and tend to focus on
ideas.
• The following are values related to academic success:
  - belief in the value of education as a foundation for a productive life
  - belief in the benefits of strong effort in school
  - a strong sense of the need for ethical behavior at testing time
    (no cheating)
  - the belief that a healthy lifestyle (e.g., no drugs) underpins
    academic success
  - feelings about key aspects of schooling that predispose students to
    behave in academically productive ways

Academic Self-concept – a learned vision that results largely from evaluations of
self by others over time. It is the sum of all evaluative judgments one makes
about one’s possibility of success and/or productivity in an academic context.

Locus of Control – the student’s attributions or beliefs about the reasons for
academic success or failure.
• Internal – the attributions come from within: “I succeeded because I
tried hard.”
• External – the attributions come from outside factors: “I sure was lucky
to receive that A!” or “I performed well because I had a good teacher.”

Self-efficacy – its target is a task, a (school) subject, an instructional objective,
and the like. The direction is best captured by “I can” versus “I can’t”. A
“can’t do” attitude lies at the heart of a concept known as “learned
helplessness,” whose symptoms include a lack of persistence in the face of
failure, negative affect, and negative expectations about the future.

Interests – a disposition, organized through experience, which impels an
individual to seek out particular objects, activities, understandings, skills, or
goals for attention or acquisition. These represent feelings that can range
from no excitement at all to a high level of excitement at the prospect of
engaging in, or while engaged in, some particular activity.

Academic Aspirations – the desire to learn more; the intent to seek out and
participate in additional educational experiences.

Anxiety – the experience of (emotional) tension that results from real or
imagined threats to one’s security.


Range of Dispositions

ATTITUDES
Unfavorable About some person or thing Favorable

VALUES
Unimportant About Idea Important

ACADEMIC SELF-CONCEPT
Negative About self as a learner Positive

LOCUS OF CONTROL
External Attributing reasons for circumstances Internal

SELF EFFICACY
Can’t do Likelihood of success Can do

INTERESTS
Disinterested Desirability of activities Interested

ASPIRATIONS
No more Further education More

ANXIETY
Threatened In school, I am Safe

Methods in Assessing Affect/Disposition

Questionnaire – asks questions about students’ feelings, answered either by
selecting from options or by giving brief or extended written responses.

Performance Assessment – systematic observation of student behavior and/or
products with clear criteria in mind.

Personal Communication – interviews, either with the students alone or in
groups.

Tools and Techniques in Affective Assessment

1. Interest Inventory – measures learners’ areas of interest
2. Personality Inventory – measures learners’ traits such as self-concept, social
adjustment, problem-solving styles, and other traits.
3. Observation Techniques
3.1 Casual Informal Observations – unstructured, unplanned observations
made without using any instrument
3.2 Observation Guides- structured or with the use of a planned
instrument to record observations
3.3 Clinical Observations – a prolonged process in diagnosing clients in a
controlled clinical setting, which involves the use of sophisticated
techniques and instruments


3.4 Anecdotal Records – a narrative record of observations of a particular
learner’s behavior during a given situation or event, free from
interpretations and conclusions.
3.5 Scales – consist of a list of characteristics or behaviors to be observed and
an evaluative scale to indicate the degree to which each occurs
3.6. Checklist – a set of traits that an observer has to mark if demonstrated
by a particular learner
4. Self- Reporting Techniques
4.1 Autobiography – enables the learner to describe his/her own life and
experiences
4.2 Self-Expression Essay- seeks to assess the learner’s response to a
particular question or concern usually in a short written essay form
4.3 Self-Description – requires the learner to paint a picture of himself/herself
in words
4.4 Self-Awareness Exercises- designed to help learners become more
aware of their feelings, emotions, and values
4.5. Questionnaire – provides an opportunity to easily collect a great deal of
information that may be useful in further understanding the learner client in
identifying problems as well as opinions, attitudes, and values
4.6 Structured interview – enables the counselor to obtain specific
information and to explore in-depth behavior or responses
5. Group Assessment Techniques
5.1 Sociometric Technique – provides information on social relationships
such as degrees of acceptance, roles, and interactions within groups
5.2 Guess-Who Technique – best used with relatively well-established groups
whose members are well acquainted with each other
5.3 Communigram – assesses the frequency of verbal participation of a
learner in a particular group within a given period
5.4 Social Distance Scales – measure the distance between a learner and
other persons, usually identified through the learner's reactions to
given statements expressing attitudes of acceptance or rejection of other people

Formats of Affective Assessment Tools


1. Closed-Item or Forced-Choice Instruments – answers are selected
from the given choices
a. Checklist – measures students' preferences, hobbies, attitudes,
feelings, beliefs, interests, etc. by marking a set of possible
responses.
b. Scales – these measure the extent or degree of one's response.
Types:
1.) Rating Scale – measures the degree or extent of one's
attitudes, feelings, and perceptions about ideas, objects, and
people by marking a point along a 3- or 5-point scale.


2.) Semantic Differential Scale – measures the degree of one's
attitudes, feelings, and perceptions about ideas, objects, and
people by marking a point along a 5-, 7-, or 11-point scale of
contrasting adjectives at each end.
3.) Likert Scale – measures the degree of one's agreement or
disagreement with positive or negative statements about objects and
people.
c. Alternative-Response – measures students' preferences, hobbies,
attitudes, feelings, beliefs, interests, etc. by choosing between two
possible responses.
d. Ranking – measures students' preferences or priorities by ranking a
set of attitudes or objects.
2. Open-Ended Instruments – there are no choices for the answers.
a. Sentence Completion - measures students’ preferences over a
variety of attitudes and allows students to answer by completing an
unfinished statement which may vary in length.
b. Survey - measures the values held by an individual by writing one
or many responses to a given question
c. Essay - allows the students to reveal and clarify their preferences,
hobbies, attitudes, feelings, beliefs, interests and the like by writing
their reaction or opinion on a given question.
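As a concrete illustration of how a closed-item instrument might be scored, the sketch below sums responses on a 5-point Likert scale, reverse-scoring the negatively worded statements so that a higher total always indicates a more favorable attitude. The item names and responses are hypothetical, and reverse-scoring conventions vary across instruments.

```python
# Illustrative sketch: scoring a 5-point Likert scale.
# Negative statements are reverse-scored (5 -> 1, 4 -> 2, ...) so that
# a higher total always means a more favorable attitude.
# Item names and response data below are hypothetical.

def score_likert(responses, negative_items, points=5):
    """Sum Likert responses, reverse-scoring the negative statements."""
    total = 0
    for item, answer in responses.items():
        if item in negative_items:
            answer = (points + 1) - answer  # reverse-score negative statements
        total += answer
    return total

# Four hypothetical statements; items 2 and 4 are negatively worded.
responses = {"item1": 4, "item2": 2, "item3": 5, "item4": 1}
print(score_likert(responses, negative_items={"item2", "item4"}))  # 4 + 4 + 5 + 5 = 18
```

Without reverse-scoring, agreement with a negative statement would inflate the total and misrepresent the attitude being measured.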

Suggestions in Writing Affective Assessment Items

1. Avoid statements that refer to the past rather than to the present.
2. Avoid statements that are factual or capable of being interpreted as
factual.
3. Avoid statements that may be interpreted in more than one way.
4. Avoid statements that are irrelevant to the psychological object under
consideration.
5. Avoid statements that are likely to be endorsed by almost everyone or by
almost no one.
6. Select statements that are believed to cover the entire range of the
affective scale of interest.
7. Keep the language of the statements simple, clear and direct.
8. Statements should be short, rarely exceeding 20 words.
9. Each statement should contain only one complete thought.
10. Statements containing universals such as all, always, none, and never
often introduce ambiguity and should be avoided.
11. Words such as only, just, merely, and others of similar nature should be
used with care and moderation in writing statements.
12. Whenever possible, statements should be in the form of simple
sentences rather than in the form of compound or complex sentences.
13. Avoid the use of words that may not be understood by those who are to
be given the completed scale.
14. Avoid the use of double negatives.


Process in Developing & Validating Evaluation Instruments

1. START: Formulate test objectives.
2. Construct a Table of Specifications (TOS).
3. Have the table of specifications approved by the experts.
4. Write test items.
5. Face and content validate the items.
6. Revise test items based on the experts' suggestions.
7. Pilot test the revised instrument to determine the length of time needed
and the level of vocabulary of the learners. Revise the instrument if necessary.
8. Try out the revised instrument to determine the level of difficulty (Df)
and the discrimination (DS) of each item.
9. Do item analysis for each item, then revise test items based on the result
of the item analysis:
Df: .81 – 1 (easy); .2 – .8 (average); .19 and below (difficult)
DS: .3 and above (good); .20 – .29 (moderate); .19 and below (poor)
10. If there are newly constructed items, return to step 7 (pilot test again).
11. Otherwise, try out the revised instrument for test reliability and
criterion-related or construct validity.
12. Administer the instrument to the target users. END.
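The item analysis described above can be illustrated with a short sketch. The upper–lower group method used here is one common way to obtain the difficulty (Df) and discrimination (DS) indices; the response data and group sizes are hypothetical.

```python
# Illustrative sketch of item analysis (upper-lower group method).
# For one item, 1 = correct and 0 = wrong; examinees are assumed to be
# already split into upper and lower groups by total score.
# The response data below are hypothetical.

def item_analysis(upper, lower):
    """Return (difficulty, discrimination) for one item."""
    n = len(upper) + len(lower)
    difficulty = (sum(upper) + sum(lower)) / n               # Df: proportion correct
    discrimination = (sum(upper) - sum(lower)) / len(upper)  # DS: upper minus lower
    return difficulty, discrimination

def interpret(df, ds):
    """Label Df and DS using the thresholds given in the material."""
    df_label = "easy" if df >= 0.81 else "average" if df >= 0.20 else "difficult"
    ds_label = "good" if ds >= 0.30 else "moderate" if ds >= 0.20 else "poor"
    return df_label, ds_label

# Ten examinees in each group on one hypothetical item
upper = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]   # 8 correct in the upper group
lower = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]   # 3 correct in the lower group
df, ds = item_analysis(upper, lower)
print(df, ds)             # 0.55 0.5
print(interpret(df, ds))  # ('average', 'good')
```

An item that the upper group answers correctly far more often than the lower group discriminates well between strong and weak examinees; an item everyone gets right (or wrong) tells the test developer little.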

Shapes of the Frequency Polygons


1. Normal – bell - shaped curve
2. Positively skewed – most scores are below the mean and there are
extremely high scores. In this shape, the mean is higher than the median
while mode is the lowest among the three measures of central tendency.
3. Negatively skewed – most scores are above the mean and there are
extremely low scores. In this shape, the mean is lower than the median
while the mode is the highest among the three measures of central
tendency.
4. Leptokurtic – highly peaked and the tails are more elevated above the
baseline
5. Mesokurtic – moderately peaked
6. Platykurtic - flattened peak
7. Bimodal Curve – curve with two peaks or modes
8. Polymodal Curve – curve with three or more modes
9. Rectangular Distribution – there is no mode.


Four Types of Measurement Scales

1. Nominal – groups and labels data. Example: gender (1 – male, 2 – female)
2. Ordinal – ranks data; distances between points are indefinite.
Example: income (1 – low, 2 – average, 3 – high)
3. Interval – distances between points are equal, but there is no absolute
zero point. Example: test scores and temperature (a score of zero in a test
does not mean no knowledge at all)
4. Ratio – all of the above, except that it has an absolute zero point.
Example: height and weight (a zero weight means no weight at all)

MEASURES OF CENTRAL TENDENCY AND VARIABILITY

Appropriate Statistical Tools

Measure of Central Tendency – describes the representative value of a set of data.
Measure of Variability – describes the degree of spread or dispersion of a set of data.

When the frequency distribution is regular/symmetrical/normal, or when the
data are numeric (interval or ratio):
• Mean – the arithmetic average. Mean = ΣX/N
• Standard Deviation – the root-mean-square of the deviations from the mean.

When the frequency distribution is irregular/skewed, or when the data are ordinal:
• Median – the middle score in a group of ranked scores. The position of the
median is (N + 1)/2.
• Quartile Deviation – the average deviation of the 1st and 3rd quartiles from
the median. QD = (Q3 – Q1)/2

When the distribution of scores is normal and a quick answer is needed, or
when the data are nominal:
• Mode – the score that occurs most frequently.
• Range – the difference between the highest and lowest scores in a set of
observations. R = Highest Score – Lowest Score
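A minimal sketch of these measures, using Python's standard library on a small hypothetical set of test scores. Note that `statistics.quantiles` implements one common quartile convention; textbooks differ slightly in how Q1 and Q3 are located.

```python
# Illustrative sketch computing the measures of central tendency and
# variability above for a small hypothetical set of test scores.
import statistics

scores = [10, 12, 12, 15, 16, 18, 22]

mean = sum(scores) / len(scores)                # Mean = sum of X / N
median = statistics.median(scores)              # middle score of the ranked data
mode = statistics.mode(scores)                  # most frequent score
sd = statistics.pstdev(scores)                  # root-mean-square deviation from the mean
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartiles (one common convention)
qd = (q3 - q1) / 2                              # Quartile Deviation
rng = max(scores) - min(scores)                 # Range

print(mean, median, mode, rng)   # 15.0 15 12 12
```

For this symmetrical-enough data set the mean and median coincide; in a strongly skewed distribution they would separate, which is why the median is preferred for skewed data.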

STANDARD SCORES
• Indicate the pupil’s relative position by showing how far his raw score is above or
below average
• Express the pupil’s performance in terms of standard unit from the mean.
• Represented by the normal probability curve or what is commonly called the normal
curve
• Used to have a common unit to compare raw scores from different tests
1. PERCENTILE
• tells the percentage of examinees that lies below one’s score.

Example: P85 = 70 This means the person who scored 70 is higher than
85% of the examinees.

2. Z-SCORES
• tells the number of standard deviations a raw score lies above or below the mean

3. T-SCORES
• refers to any set of normally distributed standard scores that has a mean
of 50 and a standard deviation of 10 (T = 50 + 10z)

Guide for the Interpretation of Standard Scores

The equivalence of z-scores and T-scores, and their relation to percentiles
and to the normal curve, is shown below. The areas under the normal curve
between successive SD marks (from -3 to +3) are 2%, 14%, 34%, 34%, 14%, and 2%.

SD's          -4    -3    -2    -1     0    +1    +2    +3    +4
Z-Scores      -4    -3    -2    -1     0    +1    +2    +3    +4
T-Scores      10    20    30    40    50    60    70    80    90
Percentiles        0.1     2    16    50    84    98  99.9
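The z- and T-score definitions above translate directly into a short sketch. The class mean and standard deviation used here are hypothetical.

```python
# Illustrative sketch converting a raw score to a z-score and a T-score.
# The mean and standard deviation below are hypothetical class statistics.

def z_score(raw, mean, sd):
    """Number of standard deviations the raw score lies from the mean."""
    return (raw - mean) / sd

def t_score(z):
    """T = 50 + 10z: a standard score with mean 50 and SD 10."""
    return 50 + 10 * z

z = z_score(raw=70, mean=60, sd=5)   # two SDs above the mean
print(z, t_score(z))                 # 2.0 70.0
```

Consistent with the table above, a z-score of +2 corresponds to a T-score of 70 and, on a normal distribution, roughly the 98th percentile.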

Interpretation of the Pearson r correlation value

 1.0   Perfect positive correlation
       High positive correlation
 0.5   Positive correlation
       Low positive correlation
 0.0   Zero correlation
       Low negative correlation
-0.5   Negative correlation
       High negative correlation
-1.0   Perfect negative correlation
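The Pearson r itself can be computed from the deviations of paired scores from their means, as in the sketch below. The pretest/posttest scores are hypothetical and chosen to be perfectly linearly related.

```python
# Illustrative sketch computing Pearson r for two sets of paired
# hypothetical scores (e.g., a pretest and a posttest).
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # co-deviation
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

pretest  = [10, 12, 14, 16, 18]
posttest = [20, 24, 28, 32, 36]   # exactly double the pretest scores
print(pearson_r(pretest, posttest))   # 1.0 -> perfect positive correlation
```

Because each posttest score is exactly twice the pretest score, the points fall on a straight line and r reaches its maximum value of 1; reversing the order of one list would flip the sign to -1.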

Grading and Reporting


A. Marking/Grading is the process of assigning value to a performance
B. Marks/Grades/Ratings could be in percent, letters, numbers, or in descriptive
expressions. Any symbol can be used provided that it has a uniform
meaning to all concerned.
It could represent –
• how a student is performing in relation to other students (Norm-
Referenced Grading)
• the extent to which a student has mastered a particular body of
knowledge (Criterion-Referenced Grading)
• how a student is performing in relation to a teacher's judgment of his or
her potential. (Grading in Relation to Teacher's Judgment)


Purposes of Grades
• Certification – gives assurance that a student has mastered a specific
content or achieved a certain level of accomplishment
• Selection – provides a basis for identifying or grouping students for
certain educational paths or programs
• Direction – provides information for diagnosis and planning
• Motivation – emphasizes specific material or skills to be learned and
helps students understand and improve their performance

Different Grading Systems

• Criterion-referenced grading – grading based on fixed or absolute
standards, where the grade is assigned based on how well a student has met
the criteria or the well-defined objectives of a course that were spelled
out in advance. It is then up to the student to earn the grade he or she
wants to receive, regardless of how other students in the class have
performed. This is done by transmuting test scores into marks or ratings.

• Norm-referenced grading – grading based on relative standards, where a
student's grade reflects his or her level of achievement relative to the
performance of other students in the class. In this system the grade is
assigned based on the average of test scores.

• Averaging Grading System – the teacher computes the final grade of the
students by getting the mean or arithmetic average of all the partial
grades.

• Cumulative Grading System – the final grade is based on the previous
grade and the computed grade for the students' present performance. Most
schools that practice this grading take 30% of the student's grade in the
previous quarter or level and add it to 70% of the student's computed grade
for the present quarter or level to arrive at the final grade.

• Point or Percentage Grading System – the teacher identifies points or
percentages for various tests and class activities depending on their
importance. The total of these points is the basis for the grade assigned
to the student.

• Contract Grading System – grading where each student agrees to work for a
particular grade according to agreed-upon standards.
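The 30%/70% cumulative computation described above reduces to a one-line weighted average, sketched here with hypothetical grades.

```python
# Illustrative sketch of the 30%/70% cumulative grading computation:
# final grade = 30% of the previous quarter's grade + 70% of the present
# quarter's computed grade. The grades below are hypothetical.

def cumulative_grade(previous, present):
    """Combine previous and present grades with 30/70 weights."""
    return (30 * previous + 70 * present) / 100

print(cumulative_grade(previous=85, present=90))   # 0.30*85 + 0.70*90 = 88.5
```

Note how the weighting pulls the final grade toward the present quarter's performance: the student's 90 is discounted only slightly by the earlier 85.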


Guidelines in Grading Students


1. Explain your grading system to the students early in the course and
remind them of the grading policies regularly.
2. Base grades on a predetermined and reasonable set of standards.
3. Base your grades on as much objective evidence as possible.
4. Base grades on the student's attitude as well as achievement, especially
at the elementary and high school levels.
5. Base grades on the student's relative standing compared to classmates.
6. Base grades on a variety of sources.
7. Become familiar with the grading policy of your school and with your
colleagues' standards.
8. When failing a student, closely follow school procedures.
9. Guard against bias in grading.
10. Keep pupils informed of their standing in the class.

Conducting Teacher-Parent Conference

1. Make plans for the conference.
2. Begin the conference in a positive manner.
3. Present the student's strong points before describing the areas needing
improvement.
4. Encourage parents to participate and share information.
5. Plan a course of action cooperatively.
6. End the conference with a positive comment.
7. Use good human relations skills during the conference.

References:
1. Ardovino, J., Hollingsworth, J., & Ybarra, S. (2000). Multiple measures. California: Corwin Press, Inc.
2. Campbell, D.M., Melenyzer, B.J., Nettles, D.H., & Wyman, R.M. (2003). Portfolio and performance
assessment in teacher education. Boston: Allyn and Bacon.
3. Gredler, M.G. (1999). Classroom assessment and learning. New York: Longman.
4. Kubiszyn, T. & Borich, G. (2000). Educational testing and measurement: Classroom assessment and
practice. New York: John Wiley & Sons, Inc.
5. Linn, R. (2000). Measurement and assessment in teaching (8th ed.). Prentice Hall.
6. McMillan, J.H. (1997). Classroom assessment: Principles and practice for effective instruction. Boston:
Allyn and Bacon.
7. Popham, J. (1999). Classroom assessment: What teachers need to know (2nd ed.). Boston: Allyn and
Bacon.
8. Schipper, B. & Rossi, J. (1997). Portfolios in the classroom: Tools for learning and instruction. York,
Maine: Stenhouse Publishers.
9. Stiggins, R.J. (2001). Student-involved classroom assessment. New Jersey: Merrill Prentice Hall.
10. Ward, A.W. & Ward, M.M. (1999). Assessment in the classroom. Belmont, California: Wadsworth.

