
A

Achievement: previous learning.

Acquiescence: the tendency to agree or to endorse a test item as true.

Adverse impact: the effect of any test used for selection purposes if it systematically rejects substantially higher proportions of minority than majority job applicants.

Age differentiation: discrimination based on the fact that older children have greater capabilities than do younger children.

Aptitude: potential for learning a specific skill.

Assessment: a procedure used to evaluate an individual so that one can describe the person in terms of current functioning and also so that one can predict future functioning. Tests are used in the assessment process.

B

Basal: the level at which a minimum criterion number of correct responses is obtained.

Basal age: in the Stanford-Binet scale, the highest year level at which the subject successfully passes all tests.

Base rate: in decision analysis, the proportion of people expected to succeed on a criterion if they are chosen at random.

Biserial correlation: an index used to express the relationship between a continuous variable and an artificially dichotomous variable.

C

Category format: a rating-scale format that often uses the categories 1 to 10.

Ceiling: the highest score possible on a test. When the test is too easy, many people may get the highest score and the test cannot discriminate between the top-level performers and those at lower levels.

Class interval: the unit for the horizontal axis in a frequency distribution.

Closed-ended question: in interviewing, a question that can be answered specifically.

Coefficient alpha: a generalized method for estimating reliability. Alpha is similar to the KR20 formula, except that it allows items to take on values other than 0 and 1.

Coefficient of alienation: in correlation and regression analysis, the index of nonassociation between two variables.

Coefficient of determination: the correlation coefficient squared; gives an estimate of the percentage of variation in Y that is known as a function of knowing X (and vice versa).

Concurrent validity evidence: evidence for criterion validity in which the test and the criterion are administered at the same point in time.

Confrontation: a statement that points out a discrepancy or inconsistency.

Construct validity evidence: a process used to establish the meaning of a test through a series of studies wherein a researcher simultaneously defines some construct and develops the instrumentation to measure it.

Content validity evidence: the evidence that the content of a test represents the conceptual domain it is designed to cover.

Convergent evidence: evidence obtained to demonstrate that a test measures the same attribute as do other measures that purport to measure the same thing. A form of construct validity evidence.

Correction for attenuation: the correction for attenuation formula is used to estimate what the correlation would have been if the variables had been perfectly reliable.

Correlation coefficient: a mathematical index used to describe the direction and the magnitude of a relationship between two variables. The correlation coefficient ranges between -1.0 and 1.0.

Criterion-referenced test: a test that describes the specific types of skills, tasks, or knowledge of an individual relative to a well-defined mastery criterion. The content of criterion-referenced tests is limited to certain well-defined objectives.

Criterion validity evidence: the evidence that a test score corresponds to an accurate measure of interest. The measure of interest is called the criterion.

Cross validation: the process of evaluating a test or a regression equation for a sample other than the one used for the original studies.

D

Deciles: points that divide the frequency distribution into equal tenths.

Descriptive statistics: methods used to provide a concise description of a collection of quantitative information.

Developmental quotient (DQ): in the Gesell Developmental Schedules, a test score that is obtained by assessing the presence or absence of behaviors associated with maturation.

Dichotomous format: a test item format in which there are two alternatives for each item.

Differential validity: the extent to which a test has different meanings for different groups of people.

Discriminability: in item analysis, how well an item performs in relation to some criterion; for example, items may be compared according to how well they separate groups who score high and low on the test. The index of discrimination would then be the association between performance on an item and performance on the whole test.
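
The Discriminability entry describes the index of discrimination as the association between item performance and whole-test performance. One common way to express that association is a Pearson correlation between 0/1 item scores and total scores. The sketch below assumes that choice; the function name `item_total_correlation` and the sample data are illustrative and not part of the glossary.

```python
from statistics import mean, pstdev


def item_total_correlation(item_scores, total_scores):
    """Pearson correlation between item scores (e.g. 0/1) and total test scores."""
    if len(item_scores) != len(total_scores):
        raise ValueError("item_scores and total_scores must have the same length")
    mean_item, mean_total = mean(item_scores), mean(total_scores)
    sd_item, sd_total = pstdev(item_scores), pstdev(total_scores)
    if sd_item == 0 or sd_total == 0:
        raise ValueError("correlation is undefined when either variable is constant")
    # Population covariance: average product of deviations from the two means.
    covariance = mean(
        (x - mean_item) * (y - mean_total)
        for x, y in zip(item_scores, total_scores)
    )
    return covariance / (sd_item * sd_total)


# Hypothetical data: six test takers, one item, and their total scores.
item = [1, 1, 1, 0, 0, 0]
total = [9, 8, 7, 4, 3, 2]
print(round(item_total_correlation(item, total), 3))
```

A value near 1 suggests that people who pass the item also tend to score high on the test as a whole; a value near 0 suggests the item does little to separate high and low scorers.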

Discriminant analysis: a multivariate data analysis method for finding the linear combination of variables that best describes the classification of groups into discrete categories.

Discriminant evidence: evidence obtained to demonstrate that a test measures something different from what other available tests measure.

Distractors: alternatives on a multiple-choice exam that are not correct or for which no credit is given.

Drift: the tendency for observers in behavioral studies to stray from the definitions they learned during training and to develop their own idiosyncratic definitions of behaviors.

E

Evaluative statement: a statement in interviewing that judges or evaluates.

Expectancy effect: the tendency for results to be influenced by what experimenters or test administrators expect to find (also known as the Rosenthal effect, after the psychologist who has studied this problem intensively).

F

Face validity: the extent to which items on a test appear to be meaningful and relevant. Actually not evidence for validity because face validity is not a basis for inference.

Factor analysis: a set of multivariate data analysis methods for reducing large matrices of correlations to fewer variables.

False negative: in test-decision theory, a case in which the test suggests a negative classification, yet the correct classification is positive.

False positive: in test-decision analysis, a case in which the test suggests a positive classification, yet the correct classification is negative.

Four-fifths rule: a rule used by federal agencies in deciding whether there is equal employment opportunity. Any procedure that results in a selection rate for any race, gender, or ethnic group that is less than four-fifths (80%) of the selection rate for the group with the highest rate is regarded as having an adverse impact.

Frequency distribution: the systematic arrangement of scores on a measure to reflect how frequently each value on the measure occurred.

G

General cognitive index (GCI): in the McCarthy Scales of Children's Abilities, a standard score with a mean of 100 and standard deviation of 16.

Group test: a test that a single test administrator can give to more than one person at a time.
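
The Four-fifths rule entry reduces to a ratio check: each group's selection rate is compared with the highest group's rate, and anything below 80% of that rate is flagged. The sketch below is only an arithmetic illustration under that reading; the function name `adverse_impact_groups`, the group labels, and the rates are hypothetical.

```python
def adverse_impact_groups(selection_rates, threshold=0.8):
    """Return the groups whose selection rate is below `threshold` times the highest rate.

    selection_rates maps a group label to its selection rate (selected / applicants).
    """
    highest = max(selection_rates.values())
    if highest == 0:
        return []  # nobody was selected, so no ratio can be formed
    return sorted(
        group
        for group, rate in selection_rates.items()
        if rate / highest < threshold
    )


# Hypothetical selection rates for three applicant groups.
rates = {"group_a": 0.60, "group_b": 0.45, "group_c": 0.50}
print(adverse_impact_groups(rates))
```

Here group_b is selected at 0.45 / 0.60 = 75% of the highest rate and is flagged, while group_c at roughly 83% is not.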

H

Hit rate: in test-decision analysis, the proportion of cases in which a test accurately predicts success or failure.

Human ability: behaviors that reflect either what a person has learned or the person's capacity to emit a specific behavior.

I

Individual tests: tests that can be given to only one person at a time.

Inferences: logical deductions (from evidence) about something that one cannot observe directly.

Inferential statistics: methods used to make inferences from a small group of observations, called a sample. These inferences are then applied to a larger group of individuals, known as a population. Typically, the researcher wants to make statements about the larger group but cannot make all of the necessary observations.

Intelligence: general potential independent of previous learning.

Intelligence quotient (IQ): a unit for expressing the results of intelligence tests. The intelligence quotient is based on the ratio of the individual's mental age (MA) (as determined by the test) to actual or chronological age (CA): IQ = (MA/CA) × 100.

Intercept: on a two-dimensional graph, the point on the Y axis where X equals 0. In regression, this is the point at which the regression line intersects the Y axis.

Interquartile range: the interval of scores bounded by the 25th and the 75th percentiles.

Interval scale: a scale that one can use to rank order objects and on which the units reflect equivalent magnitudes of the property being measured.

Interview: a method of gathering information by talk, discussion, or direct questions.

Ipsative score: a test result presented in relative rather than absolute terms. Ipsative scores compare the individual against him- or herself. Each person thus provides his or her own frame of reference.

Isodensity curve: an ellipse on a scatterplot (or two-dimensional scatter diagram) that encircles a specified proportion of the cases constituting particular groups.

Item: a specific stimulus to which a person responds overtly and that can be scored or evaluated.

Item analysis: a set of methods used to evaluate test items. The most common techniques involve assessment of item difficulty and item discriminability.

Item characteristic curve: a graph prepared as part of the process of item analysis. One graph is prepared for each test item and shows the total test score on the X axis and the proportion of test takers passing the item on the Y axis.
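
The points of an item characteristic curve, as the entry of that name describes it, can be tabulated directly from raw data: for each total score, take the proportion of test takers at that score who passed the item. The sketch below assumes 0/1 item scoring and no grouping of total scores into intervals; the function name `item_characteristic_points` and the data are illustrative.

```python
from collections import defaultdict


def item_characteristic_points(item_scores, total_scores):
    """Map each total test score (X axis) to the proportion passing the item (Y axis)."""
    passed = defaultdict(int)
    counts = defaultdict(int)
    for item, total in zip(item_scores, total_scores):
        counts[total] += 1
        passed[total] += item  # item is 1 if passed, 0 otherwise
    return {total: passed[total] / counts[total] for total in sorted(counts)}


# Hypothetical data: six test takers with total scores of 1, 2, or 3.
item = [0, 0, 0, 1, 1, 1]
total = [1, 1, 2, 2, 3, 3]
print(item_characteristic_points(item, total))
```

Plotting the resulting pairs gives one curve per item; a curve that rises with total score indicates an item that higher scorers pass more often.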

Item difficulty: a form of item analysis used to assess how difficult items are. The most common index of difficulty is the percentage of test takers who respond with the correct choice.

K

Kuder-Richardson 20: a formula for estimating the internal consistency of a test. The KR20 method is equivalent to the average split-half correlation obtained from all possible splits of the items. For the KR20 formula to be applied, all items must be scored either 0 or 1.

L

Likert format: a format for attitude scale items in which subjects indicate their degree of agreement with statements.

M

McCall's T: a standardized score system with a mean of 50 and a standard deviation of 10. McCall's T can be obtained from a simple linear transformation of Z scores (T = 10Z + 50).

Mean: the arithmetic average of a set of scores on a variable.

Measurement error: the component of an observed test score that is neither the true score nor the quality you wish to measure.

Median: the point on a frequency distribution marking the 50th percentile.

Mental age: a unit for expressing the results of intelligence tests; this unit is based on comparing the individual's performance on the test with the average performance of individuals in a specific chronological age group.

Multiple regression: a multivariate data analysis method that considers the relationship between a continuous outcome variable and the linear combination of two or more predictor variables.

Multivariate analysis: a set of methods for data analysis that considers the relationships between combinations of three or more variables.

N

Nominal scales: systems that arbitrarily assign numbers to objects. Mathematical manipulation of numbers from a nominal scale is not justified. For example, numbers on the backs of football players' uniforms are a nominal scale.

Normative sample: a comparison group consisting of individuals who have been administered a test under standard conditions, that is, with the instructions, format, and general procedures outlined in the test manual for administering the test.

Norm-referenced test: a test that evaluates each individual relative to a normative group.

Norms: a summary of the performance of a group of individuals on which a test was standardized. The norms usually include the mean and the standard deviation for the reference group and information on how to translate a raw score into a percentile rank.

O

One-tailed test: a directional test of the null hypothesis. With a one-tailed test, the experimenter states the specific end of the null distribution that should be used for the region of rejection of the null hypothesis.

Open-ended question: a question that usually cannot be answered specifically; such questions require the interviewee to produce something spontaneously.

Ordinal scale: a scale that one can use to rank order objects or individuals.

P

Parallel forms reliability: the method of reliability assessment used to evaluate the error associated with the use of a particular set of items. Equivalent forms of a test are developed by generating two forms using the same rules. The correlation between the two forms is the estimate of parallel forms reliability.

Pearson product moment correlation: an index of correlation between two continuous variables.

Percentile band: the range of percentiles that are likely to represent a subject's true score. It is created by forming an interval one standard error of measurement above and below the obtained score and converting the resulting values to percentiles.

Percentile rank: the proportion of scores that fall below a particular score.

Performance scale: a test that consists of tasks that require a subject to do something rather than to answer questions.

Personality tests: tests that measure overt and covert dispositions of individuals. Personality tests measure typical human behavior.

Point scale: a test in which points (0, 1, or 2, for example) are assigned to each item. In a point scale, all items with a particular content can be grouped together.

Polytomous format: a format for objective tests in which three or more alternative responses are given for each item. This format is popular for multiple-choice exams.

Predictive validity evidence: the evidence that a test forecasts scores on the criterion at some future time.

Probing statement: a statement in interviewing that demands more information than the interviewee has been willing to provide of his or her own accord.

Projective hypothesis: the proposal that when a person attempts to understand an ambiguous or vague stimulus, his or her interpretation reflects needs, feelings, experiences, prior conditioning, thought processes, and so forth.
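
The Percentile rank and Percentile band entries under P can be illustrated together. The sketch below assumes that percentile ranks are read off a list of normative scores and expressed as percentages, and that the standard error of measurement is already known; the function names, the normative data, and the SEM value are all hypothetical.

```python
def percentile_rank(norm_scores, score):
    """Percentage of normative scores that fall below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


def percentile_band(norm_scores, obtained, sem):
    """Percentile ranks one standard error of measurement below and above the obtained score."""
    return (
        percentile_rank(norm_scores, obtained - sem),
        percentile_rank(norm_scores, obtained + sem),
    )


# Hypothetical normative sample: the scores 1 through 100, one person each.
norms = list(range(1, 101))
print(percentile_rank(norms, 51))
print(percentile_band(norms, obtained=50, sem=5))
```

With these made-up norms, an obtained score of 50 and an SEM of 5 give a band running from the percentile rank of 45 to the percentile rank of 55.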

Projective personality tests: tests in which the stimulus or the required response or both are ambiguous.

Prophecy formula: a formula developed by Spearman and Brown that one can use to correct for the loss of reliability that occurs when the split-half method is used and each half of the test is one-half as long as the whole test. The method can also be used to estimate how much the test length must be increased to bring the test to a desired level of reliability.

Psychological test: a device for measuring characteristics of human beings that pertain to overt and covert behavior.

Psychological testing: the use of psychological tests. Psychological testing refers to all of the possible uses, applications, and underlying concepts of psychological tests.

Q

Quartiles: points that divide the frequency distribution into equal fourths.

R

Randomly parallel tests: tests created by successive random sampling of items from a domain or universe of items.

Ratio scale: an interval scale with an absolute zero, or point at which there is none of the property being measured.

Reactivity: the phenomenon that causes the reliability of a scale in behavior studies to be higher when an observer knows that his or her work is being monitored.

Reassuring statement: a statement intended to comfort or support.

Receptive vocabulary: in the Peabody Picture Vocabulary Test, a nonverbal estimate of verbal intelligence; in general, the ability to understand language.

Regression line: the best-fitting straight line through a set of points in a scatter diagram.

Reliability: the extent to which a score or measure is free of measurement error. Theoretically, reliability is the ratio of true score variance to observed score variance.

Representative sample: a sample drawn in an unbiased or random fashion so that it is composed of individuals with characteristics similar to those for whom the test is to be used.

Residual: the difference between predicted and observed values from a regression equation.

Response style: the tendency to mark a test item in a certain way irrespective of content.

Restricted range: in correlation and regression, variability on one measure is used to forecast variability on a second measure. If the variability is restricted on either measure, the observed correlation is likely to be low.
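
The Prophecy formula entry names two uses: correcting a split-half correlation and estimating how much longer a test must be. Both follow from the standard Spearman-Brown formula, r_new = n·r / (1 + (n − 1)·r), which the glossary does not spell out; the sketch below assumes that form, and the function names and example values are illustrative.

```python
def spearman_brown(reliability, length_factor=2.0):
    """Predicted reliability when test length is multiplied by `length_factor`.

    With the default factor of 2, this corrects a split-half correlation
    to estimate the reliability of the full-length test.
    """
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def length_factor_needed(current, desired):
    """Factor by which test length must change to reach the desired reliability."""
    if not (0 < current < 1 and 0 < desired < 1):
        raise ValueError("reliabilities must lie strictly between 0 and 1")
    return desired * (1 - current) / (current * (1 - desired))


# Hypothetical split-half correlation of 0.60.
print(spearman_brown(0.60))
print(length_factor_needed(current=0.60, desired=0.75))
```

A split-half correlation of 0.60 corrects to 0.75 for the full-length test, and raising a reliability of 0.60 to 0.75 calls for roughly doubling the number of comparable items.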

S

Scales: tools that relate raw scores on test items to some defined theoretical or empirical distribution.

Scatter diagram: a picture of the relationship between two variables. For each individual, a pair of observations is obtained, and the values are plotted in a two-dimensional space created by variables X and Y.

Selection ratio: in test decision analysis, the proportion of applicants who are selected.

Self-report questionnaire: a questionnaire that provides a list of statements about an individual and requires him or her to respond in some way to each, such as True or False.

Shrinkage: the amount of decrease in the strength of the relationship from the original sample to the sample with which the equation is used.

Spearman's rho: a method for finding the correlation between two sets of ranks.

Split-half reliability: a method for evaluating reliability in which a test is split into halves.

Standard administration: the procedures outlined in the test manual for administering a test.

Standard deviation: the square root of the variance; it is used as a measure of variability in a distribution of scores.

Standard error of estimate: an index of the accuracy of a regression equation. It is equivalent to the standard deviation of the residuals from a regression analysis. Prediction is most accurate when the standard error of estimate is small.

Standard error of measurement: an index of the amount of error in a test or measure. The standard error of measurement is a standard deviation of a set of observations for the same test.

Standardization sample: a comparison group consisting of individuals who have been administered a test under standard conditions, that is, with the instructions, format, and general procedures outlined in the test manual for administering the test.

Standardized interview: an interview conducted under standard conditions that are well defined in a manual or procedure book.

Stanine system: a system for assigning the numbers 1 through 9 to a test score. The system was developed by the U.S. Air Force. The standardized stanine distribution has a mean of 5 and a standard deviation of approximately 2.

Stress: a response to situations that pose demands, place constraints, or give opportunities.

Structured personality tests: tests that provide a statement, usually of the self-report variety, and require the subject to choose between two or more alternative responses.
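
The Standard error of measurement entry gives no formula. In classical test theory it is usually computed from the standard deviation of the test scores and the reliability coefficient as SEM = SD × sqrt(1 − r). The sketch below assumes that standard formula, which the glossary itself does not state; the function name and the example values are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability), the usual classical test theory formula."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)


# Hypothetical test with a standard deviation of 15 and a reliability of 0.91.
print(round(standard_error_of_measurement(15, 0.91), 2))
```

The more reliable the test, the smaller the SEM: with perfect reliability the formula gives 0, and with zero reliability it gives the full standard deviation.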

T

Taylor-Russell tables: a series of tables one can use to evaluate the validity of a test in relation to the amount of information it contributes beyond what would be known by chance.

Test: a measurement device that quantifies behavior.

Test administration: the act of giving a test.

Test administrator: the person giving a test.

Test battery: a collection of tests, the scores of which are used together in appraising an individual.

Test-retest reliability: a method for estimating how much measurement error is caused by time sampling, or administering the test at two different points in time. Test-retest reliability is usually estimated from the correlation between performances on two different administrations of the test.

Tracking: the tendency to stay at about the same level of growth or performance relative to peers who are the same age.

Trait anxiety: a personality characteristic reflecting the differences among people in the intensity of their reaction to stressful situations.

Traits: enduring or persistent characteristics of an individual that are independent of situations.

True score: the score that would be obtained on a test or measure if there were no measurement error. In practice, the true score can be estimated but not directly observed.

Two-tailed test: a nondirectional test of the null hypothesis. The two-tailed test is used to evaluate whether observations are significantly different from chance in either the upper or lower end of the sampling distribution.

U

Understanding response: a statement that communicates understanding.

V

Validity: the extent to which a test measures the quality it purports to measure.

Variance: the average squared deviation around the mean; the standard deviation squared.
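
The Variance entry can be followed step by step as a short calculation: find the mean, square each deviation from it, and average the squares. The sketch below takes "average" literally and divides by N (the population form) rather than N − 1; that choice, the function names, and the sample scores are assumptions made for illustration.

```python
import math


def variance(scores):
    """Average squared deviation around the mean (divides by N)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / len(scores)


def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))


# Hypothetical set of eight test scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(scores), standard_deviation(scores))
```

For these scores the squared deviations sum to 32, so the variance is 32 / 8 = 4 and the standard deviation is 2.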
