Validity -- refers to the extent to which a test measures what it is purported to measure. If a test item is congruent with the behavior to be tested, then it is valid.
Types of Evidence
CONTENT VALIDITY
CRITERION-RELATED VALIDITY
CONSTRUCT-RELATED VALIDITY -- refers to how well performance on a particular set of tasks can be explained by some PSYCHOLOGICAL CONSTRUCT or TRAIT.
1. PREDICTIVE VALIDITY -- involves the use of a criterion and a predictor. Example: correlating the results of a college entrance test (CET) with the student's GWA at some future time (predictor = CET; criterion = GWA).
2. CONCURRENT VALIDITY -- the criterion is already available, so the CET is correlated with that available criterion (predictor = CET; criterion = 4th year high school grade).
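Both kinds of criterion-related validity come down to correlating predictor scores with criterion scores. The sketch below shows one way to compute that correlation with the Pearson product-moment formula. The function name pearson_r and the score lists are hypothetical and made up for illustration only; they are not data from these notes.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores: CET results (predictor) and GWA (criterion).
cet = [70, 75, 80, 85, 90]
gwa = [78, 80, 85, 88, 92]
validity_coefficient = pearson_r(cet, gwa)
print(round(validity_coefficient, 2))
```

A coefficient close to +1.00 would suggest that students who score high on the predictor also tend to score high on the criterion.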
CRITICAL CONSTRUCT - predictors, conclusions, assumptions, inference, interpretations and relevance of evidence
Test Takers
Test Administrations
Test Scoring
DEGREES OF RELATIONSHIP BETWEEN TWO SETS OF SCORES
+1.00 ---- PERFECT POSITIVE RELATIONSHIP (the better): more from the upper group answered the item correctly.
0.00 ---- NO RELATIONSHIP
-1.00 ---- PERFECT NEGATIVE RELATIONSHIP: more from the lower group answered the item correctly.
DISCRIMINANT VALIDITY -- DIFFERENT TRAITS OR CONSTRUCTS. EXAMPLE: SCORES ON A CRITICAL THINKING TEST ARE CORRELATED WITH SCORES ON ATTITUDES TOWARDS MOVIES; A LOW CORRELATION IS EXPECTED BECAUSE THE TWO CONSTRUCTS ARE UNRELATED.
PARALLEL/ALTERNATE FORMS METHOD -- uses two different versions of the same test, administered to the same group close together in time. Form A and Form B can be given on the same day or on the next day. The two forms differ only in how the items are worded or written; they should measure the same skills, and errors are significantly controlled.
TEST-RETEST WITH ALTERNATE FORMS METHOD -- administers the two versions of the same test on two different occasions. The time interval may be short (2 weeks) or long (6 months). It takes into account all possible sources of error, and it is the most useful method because it indicates the variation of a test score over a period of time.
INTERNAL CONSISTENCY METHOD -- employs only one administration of the same test, given to the same group of individuals.
DIFFERENT METHODS
1. SPLIT-HALF/ODD-EVEN METHOD -- score the odd items and the even items separately. The two sets of scores (odd and even) are correlated using the PRODUCT-MOMENT CORRELATION COEFFICIENT FORMULA (PEARSON r), which gives the internal consistency of the half test. To estimate the reliability of the whole test, use the SPEARMAN-BROWN PROPHECY FORMULA.
2. KUDER-RICHARDSON FORMULA 20
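The split-half (odd-even) procedure can be sketched as follows: score the odd and even items separately, correlate the two half scores with Pearson r, then step the result up with the Spearman-Brown prophecy formula, r_whole = 2r / (1 + r). The function names and the 0/1 response matrix are hypothetical, invented only to make the example runnable.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(responses):
    """responses: one list per student, each holding 0/1 scores per item."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)                 # reliability of half the test
    r_whole = (2 * r_half) / (1 + r_half)         # Spearman-Brown step-up
    return r_half, r_whole


# Hypothetical responses of five students to a six-item test (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
r_half, r_whole = split_half_reliability(responses)
print(round(r_half, 2), round(r_whole, 2))
```

The stepped-up value is needed because the correlation of two half tests understates the reliability of the full-length test.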
If the reliability coefficient is high, the test is said to be homogeneous: the test scores are consistent across the different parts of the entire test.
RELIABILITY ESTIMATE MEASURES: TEST-RETEST, ALTERNATE FORMS
NOTE: a reliability coefficient of +.86 means that about 86% of the variation in the obtained scores reflects true scores, and the remaining 14% can be attributed to errors of measurement.
Index of discrimination -- the extent to which a test item differentiates good performers from poor performers.
Index of difficulty
METHOD TO EMPLOY IN ITEM ANALYSIS -- USING THE UPPER AND LOWER 27% INDEX METHOD
1. After scoring the test, arrange the papers from lowest to highest.
2. Segregate the top and bottom 27% of the papers.
3. Tally the correct answers to each item by the students in the upper 27% group.
4. Repeat step 3, considering the lower 27% group.
5. Get the percentage of the upper group that obtained the correct answer; call it U.
6. Repeat step 5, considering the lower group; call it L.
7. Get the average of U and L (the difficulty index).
8. Get the difference between U and L (the discrimination index).
L, U = NO. OF PUPILS IN THE LOWER/UPPER GROUP WHO GOT THE ITEM CORRECT
NL, NU = NO. OF PUPILS IN THE LOWER/UPPER GROUP
The higher the difficulty index, the easier the item is.
-0.59 to -0.20   Not discriminating
-0.19 to 0.20    Moderately discriminating
0.21 to 0.60     Discriminating
0.61 to 1.00     Very discriminating
Formula: Ds = (U/NU) - (L/NL)
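The upper and lower 27% method and the Ds formula can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function name, the rounding of the 27% group size, and the sample papers are all assumptions. Each paper is a list of 0/1 item scores, and its total score is the sum of that list.

```python
def upper_lower_item_analysis(papers, fraction=0.27):
    """Return (difficulty, discrimination) for each item.

    papers: one list of 0/1 item scores per student.
    fraction: share of papers in each extreme group (27% in the notes).
    """
    ranked = sorted(papers, key=sum)                 # lowest to highest total
    n_group = max(1, round(len(ranked) * fraction))  # assumed rounding rule
    lower = ranked[:n_group]
    upper = ranked[-n_group:]
    results = []
    for i in range(len(papers[0])):
        u = sum(paper[i] for paper in upper)  # U: upper group correct
        l = sum(paper[i] for paper in lower)  # L: lower group correct
        p_upper = u / len(upper)              # U / NU
        p_lower = l / len(lower)              # L / NL
        difficulty = (p_upper + p_lower) / 2  # average of the two proportions
        discrimination = p_upper - p_lower    # Ds = (U/NU) - (L/NL)
        results.append((difficulty, discrimination))
    return results


# Hypothetical class of ten students answering four items (1 = correct).
papers = [
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 1, 0],
    [1, 1, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0],
]
for number, (difficulty, ds) in enumerate(upper_lower_item_analysis(papers), 1):
    print(f"item {number}: difficulty={difficulty:.2f}, discrimination={ds:.2f}")
```

A negative discrimination value would mean that more students from the lower group than from the upper group answered the item correctly.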
FAIR OR REVISED
-EITHER THE DIFFICULTY INDEX OR THE DISCRIMINATION INDEX IS UNACCEPTABLE.
POOR OR DISCARDED
-BOTH THE DIFFICULTY INDEX AND THE DISCRIMINATION INDEX ARE UNACCEPTABLE; THE ITEM NEEDS TO BE DISCARDED RIGHT AWAY.
ACTION
DIFFICULTY LEVELS: VERY DIFFICULT, DIFFICULT, MODERATELY DIFFICULT, EASY, VERY EASY
DISCRIMINATION LEVELS: QUESTIONABLE, NOT DISCRIMINATING, MODERATELY DISCRIMINATING, DISCRIMINATING
ACTIONS: DISCARD, REVISE, MAY NEED REVISION, RETAIN, ACCEPT, N.R.
TRADITIONAL ASSESSMENT
Discrete Point (Single Attribute Assessment) -- for example, language assessment in the form of multiple choice, matching type, true or false, or short answer.
Charles Spearman (1904), Two-Factor Theory -- postulates a general factor (g-factor) and specific factors (s-factors). Examples of tests with a g-factor are Raven's Progressive Matrices and the Cattell Culture Fair Intelligence Test.
Integrative or Global Assessment (Multiple Trait Assessment) -- measures more than one point or objective at a time, and is often pragmatic. An example is writing a composition.
Cloze Test -- an innovative testing method in which words are deleted from a passage. The most common practice is to delete every 5th word. The acceptable range for readability of certain reading materials is between 30 and 50 percent.
C-Test -- the second half of every second word is deleted, leaving the first and last sentences intact. It commonly contains 100 words.
Dictation Test -- primarily a test of listening and spelling. It is used to measure the ability to use capital letters and punctuation marks, spell words correctly, and write legibly and neatly.
ADMINISTERING DICTATION TEST
Read each word once or twice as students listen, then ask students to write the word. Read the word again for confirmation. Read each sentence slowly once or twice, then at normal speed once, before students are asked to write. Do not read the word while students are writing.
Oral Interview -- a kind of integrative assessment. It collects information through face-to-face interaction between the interviewee and the interviewer. The interviewee is not at liberty to modify or make a follow-up question. The questions should be prepared beforehand, and the objective should be taken into consideration.
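The cloze procedure of deleting every 5th word can be sketched as below. The function name make_cloze and the blank marker are hypothetical choices. As a simplification, words are split on whitespace, so punctuation attached to a deleted word disappears with it.

```python
def make_cloze(passage, n=5, blank="_____"):
    """Replace every n-th word of the passage with a blank.

    Returns the gapped passage and the list of deleted words (answer key).
    """
    words = passage.split()
    answers = []
    for index in range(n - 1, len(words), n):  # 5th, 10th, 15th word, ...
        answers.append(words[index])
        words[index] = blank
    return " ".join(words), answers


text = "one two three four five six seven eight nine ten"
gapped, key = make_cloze(text)
print(gapped)
print(key)
```

Keeping the answer key alongside the gapped passage makes scoring straightforward: a student's response for each blank is compared with the corresponding deleted word.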
1. Arrange the scores from highest to lowest; a particular score is written as many times as it occurs.
2. Put a serial number opposite each score: 1, 2, 3, 4, ...
3. Average the ranks of each score that appears more than once. Example: 45, 45, 45 appears three times and occupies ranks 7, 8, and 9; add them (7 + 8 + 9 = 24) and divide by 3, so each 45 is ranked 8.
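The ranking steps above can be sketched as a short function that assigns serial numbers and then averages the serial numbers of tied scores. The function name and the score list are hypothetical; the list is built so that the three 45s land on serial numbers 7, 8, and 9, as in the example.

```python
def rank_scores(scores):
    """Rank scores from highest to lowest, averaging the ranks of ties."""
    ordered = sorted(scores, reverse=True)           # step 1
    serials = {}
    for serial, score in enumerate(ordered, start=1):  # step 2
        serials.setdefault(score, []).append(serial)
    # Step 3: each score receives the mean of the serial numbers it occupies.
    return [(score, sum(serials[score]) / len(serials[score]))
            for score in ordered]


scores = [60, 58, 55, 52, 50, 48, 45, 45, 45, 40]
for score, rank in rank_scores(scores):
    print(score, rank)
```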
GRAPHING OF DATA
- using deviation: AM -- assumed mean; d -- deviation; Σfd -- summation of frequency times deviation.
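The notes list the symbols for the deviation (assumed mean) method but not the formula itself. The sketch below assumes the common form mean = AM + Σfd / N, where d is the raw deviation of each class midpoint from AM. If d is instead coded in class-interval steps, the quotient is multiplied by the interval size. The function name and the grouped data are hypothetical.

```python
def mean_by_assumed_mean(classes, assumed_mean):
    """classes: list of (midpoint, frequency) pairs for grouped data."""
    n = sum(frequency for _, frequency in classes)
    # d = midpoint - AM for each class; sum_fd is the summation of f times d.
    sum_fd = sum(frequency * (midpoint - assumed_mean)
                 for midpoint, frequency in classes)
    return assumed_mean + sum_fd / n


# Hypothetical grouped scores: class midpoints with their frequencies.
classes = [(12, 2), (17, 5), (22, 3)]
print(mean_by_assumed_mean(classes, assumed_mean=17))
```

Under this form the result does not depend on which value is picked as AM; a midpoint near the center of the distribution only keeps the deviations small.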
The MEDIAN is defined as the middlemost score in the distribution. It divides the distribution in half: 50% of the scores lie above the median, and the other 50% lie below it.
For ungrouped data
1. Arrange the scores from highest to lowest or vice versa.
2. If the number of scores is odd, the median is the middlemost score in the distribution.
ll -- lower limit of the class containing N/2
N -- no. of cases
cf -- cumulative frequency
f -- frequency of the class where the measure lies
i -- interval
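These symbols belong to the grouped-data median, whose formula does not appear in the notes. The sketch below assumes the usual form, median = ll + ((N/2 - cf) / f) x i, with cf taken as the cumulative frequency below the median class and ll as that class's exact lower limit. The function name and the frequency table are hypothetical.

```python
def grouped_median(classes, interval):
    """classes: (exact_lower_limit, frequency) pairs in ascending order."""
    n = sum(frequency for _, frequency in classes)
    half = n / 2
    cf = 0  # cumulative frequency below the class being examined
    for ll, f in classes:
        if f > 0 and cf + f >= half:
            # This class contains the N/2-th case, so it is the median class.
            return ll + ((half - cf) / f) * interval
        cf += f
    raise ValueError("the distribution has no cases")


# Hypothetical frequency table with classes 10-14, 15-19, 20-24.
classes = [(9.5, 2), (14.5, 5), (19.5, 3)]
print(grouped_median(classes, interval=5))
```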
The MODE is defined as the most frequent, or most repeated, score in the distribution. It is not affected by extreme scores: it does not change when a single other score is made lower or higher.
For ungrouped data
1. The mode for ungrouped data is the number that occurs most often.
Normal distribution
Positively skewed distribution
MEAN>MEDIAN>MODE
The graph shows that the number of students who got high grades is relatively lower than the number who got low grades.
Negatively skewed distribution
1. THERE ARE MORE HIGH SCORES THAN LOW SCORES.
2. IT SHOWS THAT THE TEST IS VERY EASY; THUS EVEN THE LOW-PERFORMING STUDENTS GOT GOOD GRADES.
3. THE SCORES FORM AN ASYMMETRICAL DISTRIBUTION: MODE>MEDIAN>MEAN.
4. IT IS THE INVERSE OF THE POSITIVELY SKEWED DISTRIBUTION.
The graph shows that the number of students who got high grades is relatively higher than the number who got low grades.
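The ordering MODE>MEDIAN>MEAN given for a negatively skewed distribution, and its inverse for a positively skewed one, can be checked on a set of scores with Python's standard statistics module. This is a rough sketch: the function name and score lists are hypothetical, and comparing the three averages is only a rule of thumb, not a formal test of skewness. Note that statistics.mode returns the first mode found when several scores tie (Python 3.8 and later).

```python
from statistics import mean, median, mode


def describe_skew(scores):
    """Classify a score distribution by comparing mean, median, and mode."""
    m, md, mo = mean(scores), median(scores), mode(scores)
    if mo > md > m:
        return "negatively skewed"  # many high scores; the test may be easy
    if m > md > mo:
        return "positively skewed"  # many low scores; the test may be hard
    if m == md == mo:
        return "symmetrical"
    return "unclear"


# Hypothetical score sets for illustration only.
easy_test = [50, 70, 80, 85, 90, 90, 90]
hard_test = [10, 10, 10, 15, 20, 30, 50]
print(describe_skew(easy_test))
print(describe_skew(hard_test))
```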
Forms of Assessment
1. TRADITIONAL ASSESSMENT -- EXAMPLES: MULTIPLE CHOICE, MATCHING TYPE, TRUE OR FALSE, COMPLETION TEST
2. PERFORMANCE ASSESSMENT -- ENGAGES STUDENTS IN A COMPLEX TASK OR THE CREATION OF A PRODUCT, E.G. DANCE STEPS, DEMONSTRATION
3. PORTFOLIO ASSESSMENT -- ONGOING EVALUATION THAT INVOLVES GATHERING OR COLLECTING MANY DIFFERENT INDICATORS OF STUDENT PROGRESS
4. AUTHENTIC ASSESSMENT -- REAL-LIFE CRITERIA; USE OF JUDGMENT
THANK YOU!