3. Evaluation defined
- Qualitative aspect of determining the outcomes of learning
- Process of ranking with respect to attributes or traits
- Appraising the extent of learning
- Judging the effectiveness of the educational experience
- Interpreting and analyzing changes in behavior
- Describing accurately the quantity and quality of a thing
- Summing up the results of measurements or tests, giving meaning based on value judgments
- Systematic process of determining the extent to which instructional objectives are achieved
- Considering evidence in the light of value standards and in terms of the particular situations and goals which the group or individuals are striving to attain
6. FUNCTIONS OF MEASUREMENT
1.b) Secondary (auxiliary functions for effective teaching and learning)
- to help in study habits formation
- to develop the effort-making capacity of students
- to serve as an aid for guidance, counseling, and prognosis
10. Principles of Evaluation
Evaluation should be:
1. Based on clearly stated objectives
2. Comprehensive
3. Cooperative
4. Used judiciously
5. A continuous and integral part of the teaching-learning process
11. Types of Evaluation used in classroom instruction
1. Diagnostic Evaluation: detects pupils' learning difficulties which somehow are not revealed by formative tests. It is more comprehensive and specific.
2. Formative Evaluation: provides feedback regarding the student's performance in attaining instructional objectives. It identifies learning errors that need to be corrected, and it provides information to make instruction more effective.
12. Types of Evaluation used in classroom instruction
3. Placement Evaluation: defines the student's entry behavior. It determines the knowledge and skills he possesses which are necessary at the beginning of instruction.
4. Summative Evaluation: determines the extent to which the objectives of instruction have been attained. It is used for assigning grades/marks and for providing feedback to students.
13. Qualities of a Good Measuring Instrument
1. VALIDITY: content, concurrent, predictive, construct
2. RELIABILITY: adequacy, objectivity, testing conditions, test administration procedures
3. USABILITY (practicality): ease in administration, scoring, interpretation, and application; low cost; proper mechanical make-up
14. VALIDITY
Content validity: also called face or logical validity; used in evaluating achievement tests
Concurrent validity: the test agrees with, or correlates with, a criterion (e.g. an entrance examination)
Predictive validity: the degree of accuracy with which the test predicts the level of performance in the activity it intends to foretell
Construct validity: agreement of the test with a theoretical construct or trait (e.g. IQ)
15. Let's have a problem situation: A fisherman catches a yellow-fin tuna, weighs it, and it measures 100 kilograms. As he meets friend after friend, he tells each one that the fish he caught weighs 130 kilograms. In the statistical sense the story is reliable, for it is consistent (he reports the same weight every time), but the truthfulness of the fisherman's story is not established; hence the story is reliable but not valid. LESSON: A test can be reliable without being valid, but a valid test is reliable.
16. RELIABILITY
Methods of estimating reliability:
1. Test-retest method (uses the Spearman rank correlation coefficient)
2. Parallel-forms / alternate-forms method (paired observations are correlated)
3. Split-half method (odd-even halves, computed using the Spearman-Brown formula)
4. Internal-consistency method (Kuder-Richardson formula 20)
5. Scorer reliability method (two examiners independently score a set of test papers, then their scores are correlated)
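The split-half method above can be sketched in code. This is a minimal illustration, not from the source: the student score matrix is hypothetical, and the Spearman-Brown step-up r_full = 2r / (1 + r) is applied to the correlation between odd-item and even-item half scores.

```python
# Split-half reliability with the Spearman-Brown correction (method 3 above).
# The scores below are hypothetical: each row is one student's item scores.

def pearson_r(x, y):
    """Pearson correlation coefficient between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """Correlate odd-item and even-item half scores, then step up
    with Spearman-Brown: r_full = 2r / (1 + r)."""
    odd = [sum(row[0::2]) for row in item_scores]
    even = [sum(row[1::2]) for row in item_scores]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)

# Hypothetical 6-item test taken by 5 students (1 = correct, 0 = wrong)
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 0, 1],
]
print(round(split_half_reliability(scores), 3))
```

In practice the two halves should be matched in content and difficulty; a simple odd-even split is the usual classroom shortcut.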
17. TESTS
Classification of tests:
- according to manner of response: oral and written
- according to method of preparation: subjective/essay and objective
- according to nature of answer: intelligence test, personality test, aptitude test, prognostic test, diagnostic test, achievement test, preference test, accomplishment test, scale test, speed test, power test, standardized test, teacher-made test, placement test
18. Classification of Measuring Instruments
1. Standard tests
a) Psychological tests: intelligence test, aptitude test, personality (rating scale) test, vocational and professional interest inventory
b) Educational tests
2. Teacher-made tests: planning, preparing, reproducing, administering, scoring, evaluating, interpreting
19. Evaluating with the use of ITEM Analysis
1. Effectiveness of distractors: a good distractor attracts more students in the lower group than in the upper group.
2. Index of discrimination: the index of discrimination is positive if more students in the high group got the correct answer, and negative if more students in the low group got the correct answer.
3. Index of difficulty: difficulty refers to the percentage of students getting the right answer on an item. The smaller the percentage, the more difficult the item is.
20. Practice Task in Item Analysis
Test Item no. 5

Options      1    2    3*   4    5    (total)
Upper 27%    2    3    7    2    0    (14)
Lower 27%    4    2    3    5    0    (14)
*correct answer
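The item-analysis indices described in slide 19 can be computed for this practice item. A short sketch, using the conventional formulas (difficulty as the proportion correct in both groups combined, discrimination as the difference in proportion correct between the upper and lower groups):

```python
# Item analysis for Test Item no. 5 above (option 3 is keyed correct).
upper = [2, 3, 7, 2, 0]   # option counts, upper 27% group (n = 14)
lower = [4, 2, 3, 5, 0]   # option counts, lower 27% group (n = 14)
correct = 2               # index of the keyed option (option 3)

n_upper, n_lower = sum(upper), sum(lower)

# Index of difficulty: proportion of both groups answering correctly.
difficulty = (upper[correct] + lower[correct]) / (n_upper + n_lower)

# Index of discrimination: upper-group minus lower-group proportion correct.
discrimination = upper[correct] / n_upper - lower[correct] / n_lower

# Distractor check: a good distractor draws more lower- than upper-group students.
for i, (u, l) in enumerate(zip(upper, lower)):
    if i != correct:
        print(f"option {i + 1}: {'good' if l > u else 'weak'} distractor")

print(f"difficulty = {difficulty:.2f}, discrimination = {discrimination:.2f}")
```

For this item the difficulty is about 0.36 (a fairly hard item) and the discrimination about +0.29, i.e. the upper group outperformed the lower group on it.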
21. Types of Teacher-Made Tests
1. Essay type. Advantages: easy to construct, economical, minimizes guessing, develops critical thinking, minimizes cheating and memorizing, develops good study habits
2. Objective type
a) Recall type: simple recall, completion type
b) Recognition type: alternate response (true/false, yes/no, right/wrong, agree/disagree); multiple choice (stem-and-options variety, setting-and-options variety, group-term variety, structured-response variety, contained-option variety)
c) Matching type
d) Rearrangement type
e) Analogy type: purpose, cause and effect, synonym relationship, antonym relationship, numerical relationship
f) Identification type
22. Multiple Choice Test (Recognition type)
1. Stem-and-options variety: the stem serves as the problem
2. Setting-and-options variety: the optional responses are dependent upon a setting or foundation of some sort, e.g. a graphical representation
3. Group-term variety: consists of a group of words or terms in which one does not belong to the group
4. Structured-response variety: makes use of structured responses, commonly used in classroom testing for natural science subjects
5. Contained-option variety: designed to identify errors in a word, phrase, sentence, or paragraph
24. Table of Specifications (TOS)
It is the teacher's blueprint.
It determines the content validity of the test.
It is a two-way table that relates the instructional objectives to the course content.
It makes use of Bloom's Taxonomy in determining the levels of the cognitive domain.
25. TOS Matrix
Columns: Topic | Time spent | Levels of Cognitive Abilities (K, C, A, HA) | No. of Test Items | %

Step 1: Identify the topics to be tested from the syllabus.
Step 2: Determine the time spent in hours on each topic.
Step 3: Find the total time spent.
Step 4: Find the percentage of time spent per topic.
Step 5: Total the percentages (100%).
Step 6: Determine the number of test items per topic.
Step 7: Allocate percentage marks for the different levels.
Step 8: Compute the total test items per level.
Step 9: Compute the number of items per topic per level.
Step 10: Determine the test item placement and indicate it in the cell per topic per level.
26. Criterion- and Norm-Referenced Tests
Criterion-Referenced Tests
- It serves to identify the extent to which an individual's performance has met a given criterion (e.g. a score of 75% on all the test items could be considered a satisfactory performance).
- It points out what a learner can do, not how he compares with others.
- It identifies weak and strong points in an individual's performance.
- It tends to focus on subskills, shorter tests, and mastery learning.
- It can be both diagnostic and prognostic in nature.
27. Criterion- and Norm-Referenced Tests
Norm-Referenced Tests
- It compares a student's performance with the performance of the other students in the class.
- It uses the normal curve in distributing students' grades, placing them either above or below the mean.
- The teacher's main concern is the variability of the scores: the more variable the scores, the better, because variability shows how one individual differs from another.
- It uses percentiles and standard scores.
- Its items tend to be of average difficulty.
28. Measures of Central Tendency: mean, median, mode
Measures of Variability: range, quartile deviation, standard deviation
Point Measures: quartiles, deciles, percentiles
29. Measures of Central Tendency
MODE: the crude or inspectional average measure. It is the most frequently occurring score. It is the poorest measure of central tendency.
Advantages: The mode is always a real value, since it is an actually occurring score. It is simple to approximate by observation for small cases. It does not necessitate arranging the values.
Disadvantages: It is not rigidly defined and is inapplicable to an irregular distribution.
What is the mode of these scores? 75, 60, 78, 75, 76, 75, 88, 75, 81, 75
30. Measures of Central Tendency
MEDIAN: the score that divides the distribution into halves. It is sometimes called the counting average.
Advantages: It is the best measure when the distribution is irregular or skewed. It can be located in an open-ended distribution or when the data are incomplete (e.g. only 80% of the cases are reported).
Disadvantages: It necessitates arranging the items by size before it can be computed.
What is the median of these scores? 75, 60, 78, 75, 76, 75, 88, 75, 81, 75
31. Measures of Central Tendency
MEAN: the most widely used and familiar average, and the most reliable and stable of all measures of central tendency.
Advantages: It is the best measure for a regular distribution.
Disadvantages: It is affected by extreme values.
What is the mean of these scores? 75, 60, 78, 75, 76, 75, 88, 75, 81, 75
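As a worked answer to the three questions above, a short Python sketch computes all three measures for the same sample scores:

```python
# Mean, median, and mode of the sample scores used above.
from collections import Counter

scores = [75, 60, 78, 75, 76, 75, 88, 75, 81, 75]

mean = sum(scores) / len(scores)

ordered = sorted(scores)
n = len(ordered)
# For an even number of scores the median is the mean of the two middle values.
median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

# The mode is the most frequently occurring score.
mode = Counter(scores).most_common(1)[0][0]

print(mean, median, mode)   # 75.8 75.0 75
```

Note how the single extreme value 60 pulls the mean (75.8) while leaving the median and mode (both 75) untouched, which is exactly the disadvantage of the mean stated above.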
32. Point Measures: Quartiles
Quartiles are point measures that divide the distribution into four equal parts.
Q1: N/4, the point below which 25% of the distribution falls
Q2: N/2, or 50% of the distribution (the same as the median)
Q3: 3N/4, or 75% of the distribution
33. Point Measures: Deciles
Deciles are point measures that divide the distribution into 10 equal groups.
D1: N/10, the point below which 10% of the distribution falls
D2: 2N/10, or 20% of the distribution
D3: 3N/10, or 30% of the distribution
D4: 4N/10, or 40% of the distribution
D5: 5N/10, or 50% of the distribution
...
D9: 9N/10, or 90% of the distribution
34. Point Measures: Percentiles
Percentiles are point measures that divide the distribution into 100 equal groups.
P1: N/100, the point below which 1% of the distribution falls
P10: 10N/100, or 10% of the distribution
P25: 25N/100, or 25% of the distribution
P50: 50N/100, or 50% of the distribution
P75: 75N/100, or 75% of the distribution
P90: 90N/100, or 90% of the distribution
P99: 99N/100, or 99% of the distribution
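The pattern behind all three point measures can be captured in one line: the k-th of m equal parts of an ordered distribution of N cases falls at position kN/m. A brief sketch (N = 40 is an arbitrary example):

```python
# Positions of quantile points in an ordered distribution of N cases:
# the k-th of m equal parts falls at position k*N/m, so Q2, D5, and P50
# all land on the same point (the median).

def quantile_position(k, m, n):
    """1-based position of the k-th m-tile in an ordered list of n scores."""
    return k * n / m

n = 40
print(quantile_position(1, 4, n))    # Q1  -> 10.0
print(quantile_position(2, 4, n))    # Q2  -> 20.0
print(quantile_position(5, 10, n))   # D5  -> 20.0 (same as Q2)
print(quantile_position(50, 100, n)) # P50 -> 20.0 (same as Q2 and D5)
```

This also makes the correspondences explicit: Q1 = P25, Q3 = P75, D1 = P10, and Q2 = D5 = P50 = the median.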
35. Measures of Variability or Scatter
1. RANGE: R = highest score - lowest score
2. QUARTILE DEVIATION: QD = (Q3 - Q1) / 2
It is known as the semi-interquartile range and is often paired with the median.
36. Measures of Variability or Scatter: STANDARD DEVIATION
It is the most important and best measure of the variability of test scores. A small standard deviation means that the group has small variability, i.e. is relatively homogeneous. It is used with the mean.
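A brief sketch of this idea, using two hypothetical score sets and the population formula for the standard deviation:

```python
# Standard deviation of two hypothetical class score sets: the smaller SD
# marks the more homogeneous group.

def std_dev(scores):
    """Population standard deviation: sqrt of the mean squared deviation."""
    n = len(scores)
    mean = sum(scores) / n
    return (sum((x - mean) ** 2 for x in scores) / n) ** 0.5

homogeneous = [74, 75, 75, 76, 75]    # scores cluster tightly around 75
heterogeneous = [50, 95, 60, 90, 80]  # scores spread widely

print(round(std_dev(homogeneous), 2))
print(round(std_dev(heterogeneous), 2))
```

Both groups have the same mean (75), yet the first has a standard deviation under 1 while the second is above 17, which is why the SD, not the mean, tells the teacher how homogeneous a class is.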
37. TABLE 1

Class limits   Midpoint (M)   Frequency (f)   f.M      Cum f<
45-47          46             2               46(2)    30
42-44          43             3               43(3)    28
39-41          40             1               40(1)    25
36-38          37             2               37(2)    24
33-35          34             4               34(4)    22
30-32          31             4               31(4)    18
27-29          28             1               28(1)    14
24-26          25             3               25(3)    13
21-23          22             2               22(2)    10
18-20          19             3               19(3)    8
15-17          16             4               16(4)    5
12-14          13             1               13(1)    1
TOTAL                         30
38. MEAN
Mean = Σf.M / Σf
Σf.M: total of the products of the frequency (f) and midpoint (M)
Σf: total of the frequencies
39. MEDIAN
Median = L + c[N/2 - cum f<] / fc
L: lower real limit of the median class
cum f<: sum of the cumulative frequencies up to but below the median class
fc: frequency of the median class
c: class interval
N: number of cases
40. MODE
Mode = LMo + c(f0 - f2) / (2f0 - f1 - f2)
LMo: lower limit of the modal class
c: class interval
f1: frequency of the class after the modal class
f2: frequency of the class before the modal class
f0: frequency of the modal class
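Putting the grouped-data formulas above to work on Table 1, this sketch computes the grouped mean and median (the median interpolation follows the formula in slide 39; the mode is omitted because Table 1 has no single modal class, as a frequency of 4 occurs three times):

```python
# Grouped mean and median computed from Table 1 above
# (class interval c = 3, N = 30).

# (lower limit, upper limit, midpoint, frequency), highest class first
table = [
    (45, 47, 46, 2), (42, 44, 43, 3), (39, 41, 40, 1), (36, 38, 37, 2),
    (33, 35, 34, 4), (30, 32, 31, 4), (27, 29, 28, 1), (24, 26, 25, 3),
    (21, 23, 22, 2), (18, 20, 19, 3), (15, 17, 16, 4), (12, 14, 13, 1),
]

n = sum(f for *_, f in table)

# Mean = sum(f * M) / sum(f)
mean = sum(m * f for _, _, m, f in table) / n

# Median: walk up from the lowest class until the cumulative frequency
# reaches N/2, then interpolate within that (median) class.
c = 3
cum = 0
for low, _, _, f in reversed(table):   # lowest class first
    if cum + f >= n / 2:
        median = (low - 0.5) + c * (n / 2 - cum) / f
        break
    cum += f

print(round(mean, 2), round(median, 2))   # 29.2 30.25
```

Here the median class is 30-32 (its lower real limit is 29.5, with a cumulative frequency of 14 below it), giving a median of 29.5 + 3(15 - 14)/4 = 30.25.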
Throughout my years of teaching undergraduate courses, and to some extent graduate courses, I was reminded each semester that many of my students who had taken the requisite course in "educational tests and measurements," or a course with a similar title, as part of their professional preparation often had confused ideas about fundamental differences in terms such as measurement, assessment, and evaluation as they are used in education. When I asked the question, "What is the difference between assessment and evaluation?" I usually got a lot of blank stares. Yet it seems that understanding the differences between measurement, assessment, and evaluation is fundamental to the knowledge base of professional teachers and to effective teaching. Such understanding is also, or at the very least should be, a core component of the curricula implemented in the universities and colleges that educate future teachers.
In many places on the ADPRIMA website the phrase, "Anything not understood in more than one way is
not understood at all" appears after some explanation or body of information. That phrase is, in my
opinion, a fundamental idea of what should be a cornerstone of all teacher education. Students often
struggle with describing or explaining what it means to "understand" something that they say they
understand. I believe that in courses on the subject of educational tests and measurements it is often the case that "understanding" is inferred from responses on multiple-choice tests or solving statistical
problems. A semester later, when questioned about very fundamental ideas in statistics, measurement,
assessment and evaluation, the students in my courses seemingly forgot most, if not all of what they
"learned."
Measurement, assessment, and evaluation mean very different things, and yet most of my students were
unable to adequately explain the differences. So, in keeping with the ADPRIMA approach to explaining
things in as straightforward and meaningful a way as possible, here is what I think are useful descriptions
of these three fundamental terms. These are personal opinions, but they have worked for me for many
years. They have operational utility, and therefore may also be useful for your purposes.
Measurement refers to the process by which the attributes or dimensions of some physical object are
determined. One exception seems to be in the use of the word measure in determining the IQ of a
person. The phrase, "this test measures IQ" is commonly used. Measuring such things as attitudes or
preferences also applies. However, when we measure, we generally use some standard instrument to
determine how big, tall, heavy, voluminous, hot, cold, fast, or straight something actually is. Standard
instruments refer to physical devices such as rulers, scales, thermometers, pressure gauges, etc. We
measure to obtain information about what is. Such information may or may not be useful, depending on
the accuracy of the instruments we use, and our skill at using them. There are few such instruments in
the social sciences that approach the validity and reliability of, say, a 12" ruler. We measure how big a classroom is in terms of square feet, we measure the temperature of the room by using a thermometer, and we use a multimeter to determine the voltage, amperage, and resistance in a circuit. In all of these
examples, we are not assessing anything; we are simply collecting information relative to some
established rule or standard. Assessment is therefore quite different from measurement, and has uses
that suggest very different purposes. When used in a learning objective, the definition provided on the ADPRIMA site for the behavioral verb measure is: To apply a standard scale or measuring device to an
object, series of objects, events, or conditions, according to practices accepted by those who are skilled
in the use of the device or scale. An important point in the definition is that the person be skilled in the use
of the device or scale. For example, a person who has in his or her possession a working Ohm meter, but
does not know how to use it properly, could apply it to an electrical circuit but the obtained results would
mean little or nothing in terms of useful information.
Assessment is a process by which information is obtained relative to some known objective or goal.
Assessment is a broad term that includes testing. A test is a special form of assessment. Tests are
assessments made under contrived circumstances especially so that they may be administered. In other
words, all tests are assessments, but not all assessments are tests. We test at the end of a lesson or unit.
We assess progress at the end of a school year through testing, and we assess verbal and quantitative
skills through such instruments as the SAT and GRE. Whether implicit or explicit, assessment is most
usefully connected to some goal or objective for which the assessment is designed. A test or assessment
yields information relative to an objective or goal. In that sense, we test or assess to determine whether or
not an objective or goal has been obtained. Assessment of skill attainment is rather straightforward.
Either the skill exists at some acceptable level or it doesn't. Skills are readily demonstrable. Assessment of understanding is much more difficult and complex. Skills can be practiced; understandings cannot. We can assess a person's knowledge in a variety of ways, but there is always a leap, an inference that we make about what a person does in relation to what it signifies about what he knows. In the section on this site on behavioral verbs, to assess means: To stipulate the conditions by which the behavior specified in an objective may be ascertained. Such stipulations are usually in the form of written descriptions.
Evaluation is perhaps the most complex and least understood of the terms. Inherent in the idea of
evaluation is "value." When we evaluate, what we are doing is engaging in some process that is designed
to provide information that will help us make a judgment about a given situation. Generally, any
evaluation process requires information about the situation in question. A situation is an umbrella term
that takes into account such ideas as objectives, goals, standards, procedures, and so on. When we
evaluate, we are saying that the process will yield information regarding the worthiness, appropriateness,
goodness, validity, legality, etc., of something for which a reliable measurement or assessment has been
made. For example, I often tell my students that if they wanted to determine the temperature of the classroom, they would need to get a thermometer and take several readings at different spots, and perhaps average the readings. That is simple measuring. The average temperature tells us nothing about whether or not it
is appropriate for learning. In order to do that, students would have to be polled in some reliable and valid
way. That polling process is what evaluation is all about. A classroom average temperature of 75 degrees
is simply information. It is the context of the temperature for a particular purpose that provides the criteria
for evaluation. A temperature of 75 degrees may not be very good for some students, while for others, it
is ideal for learning. We evaluate every day. Teachers, in particular, are constantly evaluating students,
and such evaluations are usually done in the context of comparisons between what was intended
(learning, progress, behavior) and what was obtained. When used in a learning objective, the definition
provided on the ADPRIMA site for the behavioral verb evaluate is: To classify objects, situations, people,
conditions, etc., according to defined criteria of quality. Indication of quality must be given in the defined
criteria of each class category. Evaluation differs from general classification only in this respect.
To sum up, we measure distance, we assess learning, and we evaluate results in terms of some set of
criteria. These three terms certainly share some common attributes, but it is useful to think of them as separate but connected ideas and processes.
Functions of measurement
2. B. To classify or select students for special purposes.
1. Grouping of students into classes or sections based on ability for instructional purposes is an old educational practice.
2. Tests are used to discover the extremely bright and talented students, the very dull or handicapped ones, or those with special talents.
3. For purposes of granting scholarships, the government, some schools, colleges, universities, and private social and civic organizations give competitive examinations for the purpose of selecting recipients of such scholarships.
4. For granting honors, the results of measurement serve as the basis for the selection of honor students.
5. Tests are also given for emotional, educational, and vocational guidance and counseling purposes.
C. To determine the efficiency of teachers, the effectiveness of their methods, techniques, and strategies, and their strengths, weaknesses, and needs. All other things being equal, students under an efficient teacher score better in a test than students under an inefficient teacher.
D. To determine the standard of instruction of a school, district, division, region, or the educational system as a whole. This is usually done through survey tests, the results of which are checked against the standard or policy set by higher education authorities.
E. To serve as a basis or guide for curriculum making and development. By means of tests, the mental age of pupils who ought to be in a certain grade may be established.
F. To serve as a guide for administrators and supervisors in making their educational plans and programs for their schools. Test results will reveal the strengths and weaknesses of school programs.
G. To set up norms of performance. This is done by standardizing psychological as well as educational tests.
H. To keep parents informed of the progress made by their children in school. This is for keeping good public relations between the schools and the community.
I. To serve as a basis for research. The results of measurement are very rich sources of problems and topics for research.
SUMMARY: FUNCTIONS OF MEASUREMENT
1. INSTRUCTIONAL
a) Principal (basic purpose)
- to determine what knowledge, skills, abilities, habits, and attitudes have been acquired
- to determine what progress or extent of learning has been attained
- to determine the strengths, weaknesses, difficulties, and needs of students
3. b) Secondary (auxiliary functions for effective teaching and learning)
- to help in study habits formation
- to develop the effort-making capacity of students
- to serve as an aid for guidance, counseling, and prognosis
2. ADMINISTRATIVE and SUPERVISORY
- to maintain standards
- to classify or select students for special purposes
- to determine teachers' efficiency, the effectiveness of the methods and strategies used (strengths, weaknesses, needs), and the standards of instruction
- to serve as a basis or guide for curriculum making and development
- to serve as a guide in the educational planning of administrators and supervisors
- to set up norms of performance
- to inform parents of their children's progress in school
- to serve as a basis for research
RESULTS of the study of the functions of measurement: the different tests conducted serve as the basis for consistent research, and as a guide for school administrators and supervisors in educational planning and in curriculum making and development.
What is Measurement?
Measurement is the process by which information about the attributes or characteristics of things is obtained and differentiated. It implies a quantitative value which can be placed on a physical property or used in stating an outcome of instruction. Quantification is necessary to make the determination or differentiation of the attribute less ambiguous and subjective (Oriondo, 1984).
A measuring instrument is a device used to determine an individual's achievement, personality, attitudes, and, among other things, anything that can be expressed quantitatively (Calmorin, 1994).