Chapter 6: Standardized Measurement and Assessment
Answers to Review Questions
6.1. What is measurement? Measurement is the act of measuring by assigning symbols or numbers to something according to a specific set of rules. It involves identifying the dimensions, quantity, capacity, type, kind, or degree of something.
6.2. What are the four different levels or scales of measurement and what are the essential characteristics of each one? The four levels of measurement are nominal, ordinal, interval, and ratio scales. Note that the first letters spell NOIR (which means black in French). The most basic level of measurement is the nominal level, which simply involves assigning symbols or names to identify the groups or categories of something (e.g., gender and college major are nominal variables). The next level of measurement is the ordinal level, in which the levels take on the new property of rank order (e.g., students' ranks on an exam, 1st, 2nd, 3rd, etc., form an ordinal variable). The next level of measurement is the interval level, which takes on the new property that the distances between adjacent points are the same (in addition to having the property of rank ordering). An example is the Fahrenheit temperature scale, where the difference between 70 and 75 degrees is the same as the difference between 75 and 80 degrees. Note, however, that you cannot say that 80 degrees is twice as hot as 40 degrees because the zero point on an interval scale is arbitrary. The highest level of measurement is the ratio scale, which has the properties of rank order and equal distances and has the new property of an absolute or true zero point. You have a true zero point when zero means none of the property being measured. Annual income and height are examples. Note now that a person who is six feet tall is twice as tall as a person who is three feet tall. Unlike with interval scales, we can make these types of ratio statements with ratio scales (e.g., 50/25 = 2).
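One way to see why ratio statements fail on an interval scale but hold on a ratio scale is to change units and check whether the ratio survives. The sketch below is only an illustration: the Celsius conversion, the feet-to-inches conversion, and the function names are assumptions added here, not part of the textbook answer.

```python
def fahrenheit_to_celsius(deg_f):
    """Convert a Fahrenheit reading to Celsius (both are interval scales)."""
    return (deg_f - 32) * 5 / 9


def feet_to_inches(feet):
    """Convert a height in feet to inches (both are ratio scales)."""
    return feet * 12


# Interval scale: equal differences are meaningful (70 to 75 equals 75 to 80).
same_gap = (75 - 70) == (80 - 75)

# The apparent ratio depends on where the arbitrary zero point sits,
# so it changes when the same temperatures are expressed in Celsius.
ratio_f = 80 / 40
ratio_c = fahrenheit_to_celsius(80) / fahrenheit_to_celsius(40)

# Ratio scale: zero means "no height", so the ratio survives a change of units.
ratio_feet = 6 / 3
ratio_inches = feet_to_inches(6) / feet_to_inches(3)

print(same_gap, ratio_f, ratio_c, ratio_feet, ratio_inches)
```

The two temperature ratios disagree even though they describe the same pair of temperatures, while the two height ratios agree. That is the practical meaning of a true zero point.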
6.3. What are the twelve assumptions underlying testing and measurement? Note that it takes a lot of hard work to make the 12 assumptions happen in practice. I will list the 12 assumptions here:
1. Psychological traits and states exist (they are social constructions that name phenomena of interest to researchers as they attempt to understand the world).
2. Psychological traits and states can be quantified and measured.
3. Various approaches to measuring aspects of the same thing can be useful.
4. Assessment can provide answers to some of life's most momentous questions.
5. Assessment can pinpoint phenomena that require further attention or study.
6. Various sources of data enrich and are part of the assessment process.
7. Various sources of error are always part of the assessment process.
8. Tests and other measurement techniques have strengths and weaknesses.
9. Test-related behavior predicts non-test-related behavior.
10. Present-day behavior sampling predicts future behavior.
11. Testing and assessment can be conducted in a fair and unbiased manner.
12. Testing and assessment benefit society.
Also be sure to know the three definitions included in this section: traits (distinguishable, relatively enduring ways in which one individual differs from another), states (less enduring ways in which individuals vary), and error (the difference between a person's true score and the person's observed score).
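The definition of error can be written as a one-line equation. The symbols are assumptions introduced here for illustration (X for the observed score, T for the true score, E for error), and the sign convention shown is the common one rather than something the text specifies:

```latex
E = X - T \qquad\Longleftrightarrow\qquad X = T + E
```

For example, under this convention a person whose true score is 80 and whose observed score is 83 has an error of +3 on that testing occasion.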
6.4. What is the difference between reliability and validity? Which is more important? Reliability refers to the consistency or stability of the test scores; validity refers to the accuracy of the inferences or interpretations you make from the test scores. Both of these characteristics are important. Note also that reliability is a necessary but not sufficient condition for validity (i.e., you can have reliability without validity, but in order to obtain validity you must have reliability).
6.5. What are the definitions of reliability and reliability coefficient? Reliability refers to the consistency or stability of a set of test scores. The reliability coefficient is a correlation coefficient that is used as an index of reliability. Unlike a regular correlation coefficient, however, the reliability coefficient has a range of 0 (no reliability) to 1 (perfect reliability).
Note that there are several different forms of reliability. The first is test-retest reliability (the consistency of a group of individuals' scores over time). The second type is equivalent-forms reliability (consistency of a group of individuals' scores on two equivalent forms of a test). The third type of reliability is internal consistency reliability (consistency of items in measuring a single construct). The two subtypes of internal consistency are split-half reliability and coefficient alpha. The fourth major type of reliability is inter-scorer reliability (consistency or degree of agreement between two or more scorers, judges, or raters).
6.6. What are the different ways of assessing reliability? Most of the types of reliability are assessed with simple correlation coefficients (called reliability coefficients). Test-retest reliability is the correlation between a group's scores on the same test given at two different times (i.e., give a set of people a test twice and see if the two sets of scores are correlated). Equivalent-forms reliability is the correlation between a group's scores on two forms of the same test (i.e., give everyone in a group two forms of the same test and correlate those two sets of scores). Split-half reliability is the correlation between a group's scores on two halves of the same test (everyone in the group takes the test once and you give everyone a score on both of the two halves of the test; then you correlate those two sets of scores). Coefficient alpha can be viewed as the average of the correlations of all of the items on a test with each other (e.g., if a test only had 3 items it would be the average of the correlation between items 1 and 2, 1 and 3, and 2 and 3). It tells you if the items tend to be related. The basic inter-scorer reliability is the correlation between two raters' ratings of a set of objects (e.g., a set of essay questions).
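The correlations described above can be sketched in a few lines of Python. Everything in this example is hypothetical: the scores are made up, the odd/even split is just one possible way to form two halves, and the last quantity is the average inter-item correlation that the answer uses as a way of viewing coefficient alpha, not the full coefficient alpha formula.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data: one list per test item, one entry per person (5 people).
items = [
    [2, 4, 3, 5, 1],
    [3, 5, 3, 4, 2],
    [2, 5, 4, 5, 1],
    [3, 4, 2, 5, 2],
]

# Test-retest: correlate total scores from two administrations of the test.
time1 = [sum(person) for person in zip(*items)]
time2 = [11, 17, 13, 18, 8]  # made-up retest totals for the same 5 people
test_retest = pearson(time1, time2)

# Split-half: score items 1 and 3 as one half, items 2 and 4 as the other.
half_a = [a + c for a, c in zip(items[0], items[2])]
half_b = [b + d for b, d in zip(items[1], items[3])]
split_half = pearson(half_a, half_b)

# Average of the correlations between every pair of items.
pair_rs = [pearson(x, y) for x, y in combinations(items, 2)]
mean_inter_item_r = sum(pair_rs) / len(pair_rs)

print(round(test_retest, 3), round(split_half, 3), round(mean_inter_item_r, 3))
```

The same `pearson` helper would also cover equivalent-forms reliability (pass the scores from the two forms) and the basic inter-scorer case (pass the two raters' ratings of the same set of objects).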
6.7. Under what conditions should each of the different ways of assessing reliability be used? Test-retest reliability is used to determine the consistency of the scores on a test over time. Equivalent-forms reliability is used to see if different forms of a test give consistent results. Internal consistency reliability is used to see if the different items on a test give consistent results. Inter-scorer reliability is used to see if two raters of a set of items give consistent results.
6.8. What are the definitions of validity and validation? Validity is the accuracy of the inferences, interpretations, or actions made on the basis of test scores. Validation is the process of gathering evidence that supports the inferences made on the basis of test scores.
6.9. What is meant by the unified view of validity? It means that all validity can be viewed as part of construct validity. That is because any discussion of measurement validity presupposes something that we intend to measure. The term "construct" simply refers to what we want to measure, whether it be age, gender, IQ, or knowledge.
6.10. What are the characteristics of the different ways of obtaining validity evidence? The three major types of evidence include: 1) Evidence based on content. 2) Evidence based on internal structure of the test. 3) Evidence based on relations to other variables.
(This is summarized in Table 6.7. See your textbook.)
6.11. What are the purposes and key characteristics of the major types of tests discussed in your textbook? The major types of tests discussed are intelligence tests (goal is to measure one or more types of intelligence); personality tests (goal is to measure one or more dimensions of personality); and educational assessment tests (including preschool assessment tests for identifying "at risk" children, achievement tests for measuring learning from formal learning experiences, aptitude tests for measuring informal learning that goes on in life, and diagnostic tests for identifying academic difficulties in students).
6.12. What is a good example of each of the major types of tests that are discussed in this chapter? Some examples of intelligence tests are the Stanford-Binet Intelligence Test, the Wechsler Adult Intelligence Scale, and the Slosson Intelligence Test. Some examples of personality tests are the Minnesota Multiphasic Personality Inventory, the California Psychological Inventory, the Work Values Inventory, the Minnesota School Attitude Survey, and the Thematic Apperception Test. Some examples of educational assessment tests are the Peabody Individual Achievement Test, the Nelson Reading Skills Tests, and the Basic English Skills Test.