parallel forms: a method of establishing the reliability of a measurement instrument by correlating scores on two different but equivalent versions of the same instrument

A second method of estimating the reliability of an employment screening measure is the parallel forms method. Here two equivalent tests are constructed, each of which presumably measures the same construct but using different items or questions. Test-takers are administered both forms of the instrument. Reliability is empirically established if the correlation between the two scores is high. Of course, the major drawbacks to this method are the time and difficulty involved in creating two equivalent tests.
internal consistency: a common method of establishing a measurement instrument’s reliability by examining how the various items of the instrument intercorrelate

Another way to estimate the reliability of a test instrument is by estimating its internal consistency. If a test is reliable, each item should measure the same general construct, and thus performance on one item should be consistent with performance on all other items. Two specific methods are used to determine internal consistency. The first is to divide the test items into two equal parts and correlate the summed score on the first half of the items with that on the second half. This is referred to as split-half reliability. A second method, which involves numerous calculations (and which is more commonly used), is to determine the average intercorrelation among all items of the test. The resulting coefficient, referred to as Cronbach’s alpha, is an estimate of the test’s internal consistency. In summary, reliability refers to whether we can “depend” on a set of measurements to be stable and consistent, and several types of empirical evidence (e.g., test–retest, equivalent forms, and internal consistency) reflect different aspects of this stability.
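Because each of these reliability estimates reduces to correlations and variances, they are simple to compute directly. The following Python sketch, using invented scores purely for illustration (none of the numbers come from the text), computes a parallel forms correlation, a split-half correlation, and Cronbach's alpha via the standard variance-based formula.

```python
# Sketch of the three reliability estimates discussed above.
# All data are invented for illustration.
import numpy as np

# Total scores of six test-takers on two equivalent forms of the same test.
form_a = np.array([82, 75, 90, 68, 77, 85])
form_b = np.array([80, 78, 88, 70, 74, 86])

# Parallel forms reliability: the correlation between the two forms.
r_parallel = np.corrcoef(form_a, form_b)[0, 1]

# Item-level responses for a six-item test (rows = test-takers, columns = items).
items = np.array([
    [4, 5, 3, 4, 5, 4],
    [2, 3, 2, 3, 2, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 2, 1, 2, 1],
    [3, 3, 4, 3, 3, 4],
    [4, 4, 5, 4, 5, 4],
])

# Split-half reliability: correlate summed scores on the two halves of the test.
r_split = np.corrcoef(items[:, :3].sum(axis=1), items[:, 3:].sum(axis=1))[0, 1]

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total scores).
k = items.shape[1]
alpha = (k / (k - 1)) * (1 - items.var(axis=0, ddof=1).sum()
                         / items.sum(axis=1).var(ddof=1))

print(f"parallel forms r = {r_parallel:.2f}")
print(f"split-half r     = {r_split:.2f}")
print(f"Cronbach's alpha = {alpha:.2f}")
```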
validity: a concept referring to the accuracy of a measurement instrument and its ability to make accurate inferences about a criterion

Validity refers to the accuracy of inferences or projections we draw from measurements: whether a set of measurements allows accurate conclusions about “something else.” That “something else” can be a job applicant’s standing on some characteristic or ability, it can be future job success, or it can be whether an employee is meeting performance standards. In the context of employee screening, the term validity most often refers to whether scores on a particular test or screening procedure accurately project future job performance. For example, validity refers to whether a score on an employment test, a judgment made from a hiring interview, or a conclusion drawn from the review of information from a job application does indeed lead to a representative evaluation of an applicant’s qualifications for a job, and whether the specific measure (e.g., test, interview judgment) leads to accurate inferences about the applicant’s criterion status (which is usually, but not always, job performance). Validity refers to the quality of specific inferences or projections; therefore, validity for a specific measurement process (e.g., a specific employment test) can vary depending on what criterion is being predicted. Thus, an employment test might be a valid predictor of job performance, but not a valid predictor of another criterion such as rate of absenteeism.
content validity: the ability of the items in a measurement instrument to measure adequately the various characteristics needed to perform a job

Similar to our discussion of reliability, validity is a unitary concept, but there are three important facets of, or types of evidence for, determining the validity of a predictor used in employee selection (see Binning & Barrett, 1989; Schultz, Riggs, & Kottke, 1999). A predictor can be said to yield valid inferences about future performance based on a careful scrutiny of its content. This is referred to as content validity. Content validity refers to whether a predictor measurement process (e.g., test items or interview questions) adequately samples important job behaviors and elements involved in performing a job. Typically, content validity is established by having experts such as job incumbents or supervisors judge the appropriateness of the test items, taking into account information from the job analysis (Hughes & Prien, 1989). Ideally, the experts should determine that the test does indeed sample the job content in a representative way. It is common for organizations constructing their own screening tests for specific jobs to rely heavily on this content-based evidence of validity. As you can guess, content validity is closely linked to job analysis.
construct validity: whether an employment test measures what it is supposed to measure

A second type of validity evidence is called construct validity, which refers to whether a predictor test, such as a pencil-and-paper test of mechanical ability used to screen school bus mechanics, actually measures what it is supposed to measure—(a) the abstract construct of “mechanical ability” and (b) whether these measurements yield accurate predictions of job performance. Think of it this way: most applicants to college take a predictor test of “scholastic aptitude,” such as the SAT (Scholastic Aptitude Test). Construct validity of the SAT deals with whether this test does indeed measure a person’s aptitude for schoolwork, and whether it allows accurate inferences about future academic success. (Students taking the SAT may agree or disagree with how accurately the SAT measures their personal scholastic aptitude—likely related to their scores on the test.) There are two common forms of empirical evidence about construct validity. Well-validated instruments such as the SAT, and standardized employment tests, have established construct validity by demonstrating that these tests correlate positively with the results of other tests of the same construct. This is referred to as convergent validity. In other words, a test of mechanical ability should correlate (converge) with another, different test of mechanical ability. In addition, a pencil-and-paper test of mechanical ability should correlate with a performance-based test of mechanical ability. In establishing a test’s construct validity, researchers are also concerned with divergent, or discriminant, validity—the test should not correlate with tests or measures of constructs that are totally unrelated to mechanical ability. Similar to content validity, credible judgments about a test’s construct validity require sound professional judgments about patterns of convergent and discriminant validity.
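To make this pattern concrete, here is a small hypothetical check in Python. The three score arrays are invented for illustration; in a real construct validation they would come from actual test-takers.

```python
# Hypothetical convergent/discriminant check; all scores are invented.
import numpy as np

mech_paper = np.array([55, 40, 72, 38, 61, 48, 66, 44])  # paper-and-pencil mechanical ability
mech_perf  = np.array([58, 43, 70, 35, 64, 50, 69, 41])  # performance-based mechanical ability
vocabulary = np.array([66, 60, 56, 58, 52, 54, 58, 52])  # an unrelated construct

# Convergent validity: two measures of the same construct should correlate
# highly (with these invented data, r is about .98).
r_convergent = np.corrcoef(mech_paper, mech_perf)[0, 1]

# Discriminant validity: the test should not correlate with an unrelated
# construct (here r is near zero).
r_discriminant = np.corrcoef(mech_paper, vocabulary)[0, 1]

print(f"convergent r   = {r_convergent:.2f}")
print(f"discriminant r = {r_discriminant:.2f}")
```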
criterion-related validity: the accuracy of a measurement instrument in determining the relationship between scores on the instrument and some criterion of job success

Criterion-related validity is a third type of validity evidence and is empirically demonstrated by the relationship between test scores and some measurable criterion of job success, such as a measure of work output or quality. There are two common ways that predictor–criterion correlations can be empirically generated. The first is the follow-up method (often referred to as predictive validity). Here, the screening test is administered to applicants without interpreting the scores and without using them to select among applicants. Once the applicants become employees, criterion measures such as job performance assessments are collected. If the test instrument is valid, the test scores should correlate with the criterion measure. Once there is evidence of the predictive validity of the instrument, test scores are used to select the applicants for jobs. The obvious advantage of the predictive validity method is that it demonstrates how scores on the screening instrument actually relate to future job performance. The major drawback to this approach is the time that it takes to establish validity. During this validation period, applicants are tested, but are not hired based on their test scores.
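As a minimal sketch of this follow-up workflow, the Python fragment below correlates hypothetical test scores recorded at application time with performance ratings collected after hire. The scores, ratings, and sample size are all invented for illustration.

```python
# Follow-up (predictive validity) sketch with invented data: test scores were
# recorded at application but not used in hiring; ratings were gathered later.
import numpy as np
from scipy import stats

test_scores = np.array([71, 55, 88, 62, 79, 47, 90, 66])          # at application
performance = np.array([3.8, 2.9, 4.5, 3.1, 4.0, 2.5, 4.6, 3.4])  # later supervisor ratings

# The validity coefficient is simply the predictor-criterion correlation.
r, p = stats.pearsonr(test_scores, performance)
print(f"predictive validity coefficient r = {r:.2f} (p = {p:.3f})")
```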
In the second approach, known as the present-employee method (also termed concurrent validity), the test is given to current employees, and their scores are correlated with some criterion of their current performance. Again, a relationship between test scores and criterion scores supports the measure’s validity. Once there is evidence of concurrent validity, a comparison of applicants’ test scores with the incumbents’ scores is possible. Although the concurrent validity method leads to a quicker estimate of validity, it may not be as accurate an assessment of criterion-related validity as the predictive method, because the job incumbents represent a select group, and their test performance is likely to be high, with a restricted range of scores. In other words, there are no test scores for the “poor” job performers, such as workers who were fired or quit their jobs, or applicants who were not chosen for jobs. Interestingly, available research suggests that the estimates of validity derived from both methods are generally comparable (Barrett, Phillips, & Alexander, 1981).
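The range-restriction problem described above is easy to demonstrate with a short simulation. In the sketch below (simulated data only, not results from any actual validation study), a test that predicts performance well across the full applicant pool shows a markedly weaker correlation when only the top-scoring incumbents are retained.

```python
# Simulation of range restriction in a concurrent validity design.
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# A latent ability drives both test scores and job performance (plus noise).
ability = rng.normal(0.0, 1.0, n)
test = ability + rng.normal(0.0, 0.5, n)
performance = ability + rng.normal(0.0, 0.5, n)

# Full applicant pool: what a predictive (follow-up) design could observe.
r_full = np.corrcoef(test, performance)[0, 1]

# Concurrent design: only the top 20% of test scorers were ever hired, so the
# incumbents' test scores span a restricted range.
hired = test > np.quantile(test, 0.80)
r_restricted = np.corrcoef(test[hired], performance[hired])[0, 1]

print(f"r in the full applicant pool     = {r_full:.2f}")        # roughly .80 here
print(f"r among the hired (top 20%) only = {r_restricted:.2f}")  # noticeably lower
```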
All predictors used in employee selection, whether they are evaluations of application materials, employment tests, or judgments made in hiring interviews, must be reliable and valid. Standardized and commercially available psychological tests have typically demonstrated evidence of reliability and validity for use in certain circumstances. However, even with widely used standardized tests, it is critical that their ability to predict job success be established for the particular positions in question and for the specific criterion. It is especially necessary to assure the reliability and validity of nonstandardized screening methods, such as a weighted application form or a test constructed for a specific job.

TYPES OF EMPLOYEE SCREENING TESTS


The majority of employee screening and selection instruments are standardized tests that have been subjected to research aimed at demonstrating their validity and reliability. Most also contain information to ensure that they are administered, scored, and interpreted in a uniform manner. The alternative to the use of standardized tests is for the organization to construct a test for a particular job or class of jobs, and conduct its own studies of the test’s reliability and validity. However, because this is a costly and time-consuming procedure, most employers use standardized screening tests. While many of these tests are published in the research literature, there has been quite a bit of growth in consulting organizations that assist companies in testing and screening. These organizations employ I/O psychologists to create screening tests and other assessments that are proprietary and used in their consulting work. More and more, companies are outsourcing their personnel testing work to these consulting firms.

Stop & Review
What are three facets of validation that are important for employee screening tests?

Test formats
Test formats, or the ways in which tests are administered, can vary greatly.
Several distinctions are important when categorizing employment tests.
Individual versus group tests—Individual tests are administered to only one person at a time. In individual tests, the test administrator is usually more involved than in group tests. Typically, tests that require some kind of sophisticated apparatus, such as a driving simulator, or tests that require constant supervision are administered individually, as are certain intelligence and personality tests. Group tests are designed to be given simultaneously to more than one person, with the administrator usually serving as only a test monitor. The obvious advantage to group tests is the reduced cost for administrator time. More and more, tests of all types are being administered online, so the distinction between individual and group testing is becoming blurred, as many applicants can complete screening instruments online simultaneously.
Speed versus power tests—Speed tests have a fixed time limit. An important focus of a speed test is the number of items completed in the time period provided. A typing test and many of the scholastic achievement tests are examples of speed tests. A power test allows the test-taker sufficient time to complete all items. Typically, power tests have difficult items, with a focus on the percentage of items answered correctly.
Paper-and-pencil versus performance tests—“Paper-and-pencil tests” refers to both paper versions of tests and online tests, which require some form of written reply, in either a forced-choice or an open-ended, “essay” format. Many employee screening tests, and nearly all tests in schools, are of this format. Performance tests, such as typing tests and tests of manual dexterity or grip strength, usually involve the manipulation of physical objects.
As mentioned, many written-type tests are now administered via computer (usually Web-based), which allows greater flexibility in how a test can be administered. Certain performance-based tests can also be administered via computer simulations (see box “On the Cutting Edge,” p. 116).

[Figure: Some employment tests involve sophisticated technology, such as this flight simulator used to train and test airline pilots.]
Although the format of an employment test is significant, the most important way of classifying the instruments is in terms of the characteristics or attributes they measure, such as biographical information (biodata instruments), cognitive abilities, mechanical abilities, motor and sensory abilities, job skills and knowledge, or personality traits (see Table 5.1 for examples of these various tests).

TABLE 5.1
Some Standardized and Well-Researched Tests Used in Employee
Screening and Selection
Cognitive Ability Tests

Comprehensive Ability Battery (Hakstian & Cattell, 1975–82): Features 20 tests, each designed
to measure a single primary cognitive ability, many of which are important in industrial settings.
Among the tests are those assessing verbal ability, numerical ability, clerical speed and accuracy,
and ability to organize and produce ideas, as well as several memory scales.
Wonderlic Cognitive Ability Test (formerly the Wonderlic Personnel Test) (Wonderlic, 1983):
A 50-item, pencil-and-paper test measuring the level of mental ability for employment, which is
advertised as the most widely used test of cognitive abilities by employers.
Wechsler Adult Intelligence Scale-Revised or WAIS-R (Wechsler, 1981): A comprehensive group
of 11 subtests measuring general levels of intellectual functioning. The WAIS-R is administered
individually and takes more than an hour to complete.
Mechanical Ability Tests

Bennett Mechanical Comprehension Test (Bennett, 1980): A 68-item, pencil-and-paper test of ability to understand the physical and mechanical principles in practical situations. Can be group administered; comes in two equivalent forms.
Mechanical Ability Test (Morrisby, 1955): A 35-item, multiple-choice instrument that
measures natural mechanical aptitude. Used to predict potential in engineering, assembly work,
carpentry, and building trades.
Motor and Sensory Ability Tests

Hand-Tool Dexterity Test (Bennett, 1981): Using a wooden frame, wrenches, and screwdrivers, the test-taker takes apart 12 bolts in a prescribed sequence and reassembles them in another position. This speed test measures manipulative skills important in factory jobs and in jobs servicing mechanical equipment and automobiles.
O’Connor Finger Dexterity Test (O’Connor, 1977): A timed performance test measuring fine
motor dexterity needed for fine assembly work and other jobs requiring manipulation of small
objects. Test-taker is given a board with symmetrical rows of holes and a cup of pins. The task is
to place three pins in each hole as quickly as possible.
Job Skills and Knowledge Tests

Minnesota Clerical Assessment Battery or MCAB (Vale & Prestwood, 1987): A self-administered
battery of six subtests measuring the skills and knowledge necessary for clerical and secretarial
work. Testing is completely computer-administered. Included are tests of typing, proofreading,
filing, business vocabulary, business math, and clerical knowledge.
Purdue Blueprint Reading Test (Owen & Arnold, 1958): A multiple-choice test assessing the
ability to read standard blueprints.
Various Tests of Software Skills: Includes knowledge-based and performance-based tests of basic computer operations, word processing, and spreadsheet use.
Personality Tests

California Psychological Inventory or CPI (Gough, 1987): A 480-item, pencil-and-paper inventory of 20 personality dimensions. Has been used in selecting managers, sales personnel, and candidates for leadership positions.
Hogan Personnel Selection Series (Hogan & Hogan, 1985): These pencil-and-paper tests assess personality dimensions of applicants and compare their profiles to patterns of successful job incumbents in clerical, sales, and managerial positions. Consists of four inventories: the prospective employee potential inventory, the clerical potential inventory, the sales potential inventory, and the managerial potential inventory.
Sixteen Personality Factors Questionnaire or 16 PF (Cattell, 1986): Similar to the CPI, this
test measures 16 basic personality dimensions, some of which are related to successful job
performance in certain positions. This general personality inventory has been used extensively in
employee screening and selection.
Revised NEO Personality Inventory or NEO-PI-R (Costa & McCrae, 1992): A very popular personality inventory used in employee screening and selection. This inventory measures the five “core” personality constructs of Neuroticism (N), Extraversion (E), Openness (O), Agreeableness (A), and Conscientiousness (C).
Bar-On Emotional Quotient Inventory (EQ-I; Bar-On, 1997) and the Mayer–Salovey–Caruso Emotional Intelligence Test (MSCEIT) (Mayer, Caruso, & Salovey, 1999): Two measures of emotional intelligence.

Biodata instruments
biodata: background information and personal characteristics that can be used in employee selection

As mentioned earlier, biodata refers to background information and personal characteristics that can be used in a systematic fashion to select employees. Developing biodata instruments typically involves taking information that would appear on application forms and other items about background, personal interests, and behavior and using that information to develop a form of forced-choice employment test. Along with items designed to measure basic biographical information, such as education and work history, the biodata instrument might also involve questions of a more personal nature, probing the applicant’s attitudes, values, likes, and dislikes (Breaugh, 2009; Stokes, Mumford, & Owens, 1994). Biodata instruments are unlike the other test instruments we will discuss because there are no standardized biodata instruments. Instead, biodata instruments take a great deal of research to develop and validate. Because biodata instruments are typically designed to screen applicants for one specific job, they are most likely to be used only for higher-level positions. Research indicates that biodata instruments can be effective screening and placement tools (Dean, 2004; Mount, Witt, & Barrick, 2000;