Random Error
All chance factors that confound the measurement of any phenomenon. They tend to cancel each other out in direction and magnitude in the long run.
Transient factors are temporary qualities of the individual, such as mood, motivation, degree of alertness, boredom, or fatigue. Situational factors involve the physical setting (noise level, lighting, ventilation, etc.) and the social setting (anonymity, presence of peers). Administrative factors involve the actual administration of the instrument and the amount of subjectivity influencing the measurement process.
Systematic Error
Refers to those factors that consistently or systematically affect the variable being measured. The two most common sources are demographic characteristics (education, SES, etc.) and personal style (response set).
Experimental Error
Experiments are designed to measure the impact of one or more independent variables on a dependent variable. Experimental error occurs when the effect of the experimental situation itself is measured rather than the effect of the independent variable. For example, a retail …

Business Research (8510)
Levels of Measurement

Nominal: classification, but no order, distance, or unique origin. Permits determination of equality only (labels) and equal grouping. Operations: counting, frequency distribution. Examples: gender (male, female); ratings such as AAA, BBB, CCC.

Ordinal: classification and order, but no distance or unique origin. Permits determination of greater or lesser value; used for ranks, ratings, and grades. Examples: doneness of meat (well, medium well, medium rare, rare); levels such as one-star and 4-star.

Interval: classification, order, and distance, but no unique origin. Permits determination of equality of intervals or differences. Addition and subtraction are meaningful, but not multiplication or division. Examples: temperature in degrees; a personality measure.

Ratio: classification, order, distance, and a unique origin (a true zero). Permits determination of equality of ratios; all arithmetic functions apply, and one can say there is no measurable value, as in zero sales. Statistics: mean, range, variance, standard deviation. Examples: weight, height, age in years, annual income.
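To illustrate which statistics each level of measurement supports, here is a minimal Python sketch. All data values and variable names are invented for illustration; they are not from the notes.

```python
import statistics

# Hypothetical sample data, one small list per level of measurement.
gender = ["male", "female", "female", "male", "female"]  # nominal
doneness = [1, 2, 2, 3, 4]        # ordinal codes: 1 = rare ... 4 = well
temp_f = [98.6, 99.1, 97.8, 98.2]  # interval (no true zero)
income = [40000, 55000, 62000, 48000]  # ratio (true zero exists)

# Nominal: only counting and the mode are meaningful.
print(statistics.mode(gender))      # -> "female"
# Ordinal: order-based statistics such as the median are meaningful.
print(statistics.median(doneness))  # -> 2
# Interval: means and differences are meaningful, ratios are not.
print(statistics.mean(temp_f))
# Ratio: all arithmetic, including ratios, is meaningful.
print(max(income) / min(income))
```

Note that computing `max(temp_f) / min(temp_f)` would run without error, but the result would have no substantive meaning, because Fahrenheit degrees lack a unique origin.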
Reliability
Refers to the consistency, repeatability, and reproducibility of empirical measurements. It refers to whether a particular technique, applied repeatedly to the same object, would yield consistent results each time.
Test-Retest
This is where the same test is given to the same people after a period of time. After the retest, there are two scores on the same measure for each person, and the correlation between the two sets of scores is obtained.
Alternate Form
Here, two separate but equivalent (designed to be as similar as possible) versions of an instrument are constructed and administered successively to the same subjects.
A limitation is the difficulty of constructing an alternate form that is truly parallel to the original.
Split-Half
Here, the total number of indicants is divided into two halves by separating the odd-numbered items from the even-numbered ones. The two halves are then correlated using an appropriate measure of association.
Internal Consistency
Used to determine the homogeneity of items; that is, do the items measure the same property? Computing Cronbach's alpha (α) is a common way to assess internal consistency. It measures internal consistency by taking random samples of items on the test and correlating the scores obtained from these samples with each other. If alpha proves to be very low, the items have very little in common; for example, an alpha below 0.5 indicates low inter-item correlation.
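As a concrete sketch, Cronbach's alpha can be computed from item and total-score variances; this is the standard variance formula for alpha rather than the sampling description above, and all the data below are invented.

```python
import statistics

# Hypothetical 1-5 ratings: rows are respondents, columns are the four
# items of a scale (all numbers invented for illustration).
items = [
    [4, 4, 5, 4],
    [2, 1, 2, 2],
    [3, 4, 3, 3],
    [5, 5, 4, 5],
    [1, 2, 1, 1],
]

k = len(items[0])              # number of items
item_cols = list(zip(*items))  # one tuple of scores per item
item_vars = [statistics.pvariance(col) for col in item_cols]
total_var = statistics.pvariance([sum(row) for row in items])

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Here the items move together across respondents, so alpha comes out high; items measuring unrelated properties would drive it down.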
Inter-Rater Reliability
It examines the extent to which different interviewers, observers, or coders using the same instrument or measure get equivalent results.
Improving Reliability
Reliability can be improved through:
Exploratory studies, preliminary interviews, or pretests of a measure with a small sample of persons similar in characteristics to the target group.
Adding items of the same type to a scale; a composite measure containing more items will normally be more reliable than one having fewer.
An item-by-item analysis to reveal which items discriminate well between units with different values on a particular variable.
Clear instructions to respondents.
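The claim that more items yield higher reliability can be quantified with the Spearman-Brown prophecy formula, a standard result not named in the notes; the numbers below are invented for illustration.

```python
# Spearman-Brown prophecy formula: predicted reliability of a scale
# lengthened by a factor n with comparable items.
def predicted_reliability(current_r: float, n: float) -> float:
    return n * current_r / (1 + (n - 1) * current_r)

# A scale with reliability 0.60, doubled in length with comparable items:
print(predicted_reliability(0.60, 2))  # -> 0.75
# Tripled in length:
print(predicted_reliability(0.60, 3))
```

Each doubling helps less than the last, so lengthening a scale trades respondent fatigue against diminishing reliability gains.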
Validity refers to the accuracy of a measure. It is the extent to which a measuring instrument actually measures the underlying concept it is supposed to measure. It refers to the extent of matching, congruence, or goodness of fit between an operational definition and the concept it is purported to measure. An instrument is said to be valid if it taps the concept it is supposed to measure. It is designed to answer the question: is it true?
Content Validity
This is the extent to which a measuring instrument reflects a specific domain of content. It can also be viewed as the sampling adequacy of the content of the phenomenon being measured. This type of validity is often used in the assessment of various educational and psychological tests. Content validation, then, is essentially judgmental.
Its limitations are the difficulty of specifying the full domain of content relevant to a particular measurement situation and the lack of an agreed-upon criterion for determining content validity.
Criterion-Related Validity
This is at issue when the purpose is to use an instrument to estimate some important form of behavior external to the measuring instrument itself, the latter behavior being referred to as the criterion.
Concurrent Validity
Refers to the ability of a measure to accurately reflect the current situation or status of an individual. The instrument being assessed is compared to some already existing criterion, such as the results of another measuring device.
Predictive Validity
This is where an instrument is used to predict some future state of affairs. Examples are the various educational tests used for selection purposes in different occupations and schools, such as the SAT and the GRE. If people who score high on the SAT or GRE do better in college than low scorers, then the test is presumably a valid measure of scholastic aptitude (in the case of the SAT). The prison system uses this approach to identify offenders who are less likely to recidivate, drawing on factors such as age, type of crime, and family background.
From the definition of criterion-related validity, it can be inferred that the degree of criterion-related validity depends on the extent of the correspondence between the test and the criterion. Most measures in the social sciences, however, have no well-delimited criterion variables against which they can reasonably be evaluated.
Construct validity
This is evaluated by examining the degree to which certain explanatory concepts (constructs) derived from theory account for performance on a measure.
Convergent Validity
This is based on the idea that two instruments that are valid measures of the same concept should correlate rather highly with one another or yield similar results even though they are different instruments.
Discriminant Validity
This is based on the idea that two instruments, although similar to one another, should not correlate highly if they measure different concepts. This approach thus involves the simultaneous assessment of numerous instruments (multimethod) and numerous concepts (multitrait) through the computation of intercorrelations.
The process of validation is theory-laden. It is thus almost impossible to 'validate' a measure of a concept unless there is in existence a theoretical network that surrounds it.
An instrument that is valid is always reliable.
An instrument that is not valid may or may not be reliable.
An instrument that is reliable may or may not be valid.
An instrument that is not reliable is never valid.
Reliability is a necessary, but not sufficient, condition for good measurement.