
The Assessment of Physical Capabilities in the Workplace

Oxford Handbooks Online


Todd A. Baker and Deborah L. Gebhardt
The Oxford Handbook of Personnel Assessment and Selection
Edited by Neal Schmitt

Print Publication Date: Mar 2012


Subject: Psychology, Organizational Psychology, Psychological Methods and Measurement
Online Publication Date: Nov 2012 DOI: 10.1093/oxfordhb/9780199732579.013.0013

Abstract and Keywords

The world of work has many arduous jobs that require the worker to possess greater
levels of physical ability than found in the normal population. This chapter provides an
overview of the underlying physiological principles associated with physical performance
and methods to assess arduous jobs in the workplace. It includes an overview of test
development and validation of physical tests and litigation related to their use in job
selection and retention. The benefits of physical testing and the methods for reducing
adverse impact are highlighted.

Keywords: physical, physiological, physical ability

PRINTED FROM OXFORD HANDBOOKS ONLINE (www.oxfordhandbooks.com). © Oxford University Press, 2018. All Rights
Reserved. Under the terms of the licence agreement, an individual user may print out a PDF of a single chapter of a title in
Oxford Handbooks Online for personal use (for details see Privacy Policy and Legal Notice).

Subscriber: HINARI; date: 17 October 2018


Introduction
Measurement of physical capabilities has its roots in the fields of medicine and exercise
science. One of the initial relationships between industry needs and physical performance
dates back to the Harvard Fatigue Laboratory, which opened in 1927 as a laboratory of
human physiology. The purpose of the laboratory was to study the psychological,
physiological, and sociological stresses on human behavior and to apply that knowledge
to better understand relevant problems in labor and industry. Numerous physiologists
worked at the Harvard Fatigue Laboratory and produced a variety of research and
measurement protocols, some of which are in use today (e.g., aerobic assessment). This
research base was used by the U.S. Army in the 1940s to assess soldier performance. In
the 1950s and 1960s, Astrand laid the groundwork for the assessment of work activities
by providing numerous research studies that assessed the workers' physiological response
to performance of job tasks. It is this pioneering research by exercise physiologists and
psychologists from this era that led to a more accurate assessment of the demands of
arduous work activities. Other research provides a more detailed overview of physical
assessment (Buskirk, 1992; Hogan, 1991a).

Although technology has removed many of the physical demands from the work setting,
there still remains a cadre of jobs with moderate to high physical requirements. These
jobs range from the lower-skilled work of a manual materials handler to that of a lineworker who
installs high-voltage equipment while standing on a utility pole 40 feet above the ground.
For example, the manual materials handler must lift and move objects weighing 5 to 80
lb. in a warehouse. At present, technology has not been implemented to remove the
worker from this process. Bucket trucks have been implemented for lineworkers to limit
the frequency with which they need to climb poles. However, bucket trucks cannot access
all work locations and in these instances lineworkers must climb to a height of 40 feet
using spikes attached to each shoe, stand on the spikes, and hoist and install heavy
equipment (e.g., 60 lb.).

Employers recognized that not all applicants for jobs with higher physical and motor
demands are qualified to perform the target job and began to implement preemployment
assessments to ensure a minimum level of job competency. For example, most
public safety agencies (e.g., fire, law enforcement) use physical tests to determine
whether candidates possess the ability to drag a charged hose up a flight of stairs,
restrain suspects, and perform other job tasks. These tests came under scrutiny when
women began to apply for nontraditional jobs and subsequently failed the
physical assessment. This led to increased legal scrutiny of physical tests and their impact
on women applicants. At the same time, employers recognized that injury rates and
worker compensation costs were higher for workers in physically demanding jobs.

The dilemma of instituting a physical assessment to ensure applicants had the ability to
perform a job and the potential of litigation came to a head in Berkman v. City of New
York (1982). Berkman failed the firefighter entrance physical test. The court ruled that
the test was invalid due to a lack of connection between the job analysis and the test. The
court also criticized the test administration procedures, scoring, and need to consider
individual differences in task performance. This chapter will address these issues, the
benefits of physical testing, and the methods to prepare for physical tests.

Benefits of Preemployment Physical Tests


Physical performance tests are used not only for selecting candidates, but also for job
retention, promotion, and return to work. Organizations implement these tests for
reasons such as reducing injuries and related costs, decreasing turnover, and identifying
individuals who possess the capabilities to successfully perform the job. For manual
materials handling jobs the number of injuries is substantial, and the turnover rate can be
as high as 200% per year. Implementation of physical selection tests (e.g., aerobic
capacity) in manual materials handling jobs (e.g., freight industry) found that individuals
with higher physical test scores had significantly fewer work-related injuries (Craig,
Congleton, Kerk, Amendola, & Gaines, 2006). These findings were similar to other manual
materials handling research in which freight workers who passed a preemployment
physical test had fewer days lost from work and were 1.7 to 2.2 times less likely to incur
an injury than their untested counterparts (Baker, Gebhardt, & Koeneke, 2001). This was
further supported in a study that demonstrated a reduction in injuries for truck drivers
and dockworkers (Gilliam & Lund, 2000).

For more than a decade the military experienced an increase in injuries and attrition due
to their personnel's physical fitness levels and ability to perform combat soldiering tasks.
It was shown that injuries in basic and advanced individual training had the greatest
impact on military readiness (Jones & Hansen, 2000). A series of studies were conducted
to evaluate the impact of varied levels of physical performance and injury reduction by
establishing the physiological factors (e.g., strength, aerobic capacity) associated with
the injury (Knapik et al., 2007).

In a study using injury and physical test validation data, higher physical test scores were
found to be significantly related to reduction in injuries and lost work days for railroad
track laborers (Gebhardt & Baker, 1992). A utility analysis of these data showed that 67%
of the costs associated with injuries for these individuals were accounted for by the 20%
of the incumbents who would have failed the test battery (Baker & Gebhardt, 1994). The
annual utility of the physical performance test was estimated at $3.1 million. In a
follow-up study, train service employees (e.g., brakemen, conductors) who passed
physical selection tests were compared to their counterparts hired without the
preemployment testing (not tested) during the same timeframe. The not-tested group's
per-injury costs were significantly higher ($66,148) than the tested group's ($15,315), both
with and without controlling for age and job tenure (Gebhardt & Baker, 2001). Likewise, with
and without controls for age, job tenure, and year injured, the not-tested
group's lost work days and injury rate were significantly higher than the tested group's.
In summary, the cost savings of evaluating the physical capabilities of applicants prior to
hire are substantial when considering the worker compensation and organizational costs
associated with injuries and lost work time.

Job Analysis for Physical Jobs


There are a variety of methods to determine physical job requirements. Although one
method can be used, it is the combination of methods that provides the data for the
development of accurate physical assessments for a target job. The methods include
gathering physiological, biomechanical, and working conditions data, along with
traditional job analysis data. It is not adequate to identify only the essential job tasks and
competencies, while ignoring the working environment. For example, the weight of the
equipment worn by firefighters and the impact of that equipment on sustained
performance at a fire must be included. Similarly, the type, duration, and workload
involved in the training required to perform the job must be considered. For example, to
become a police officer all candidates must complete strenuous academy training lasting
14 to 26 weeks. The physical training (e.g., restrain/subdue, handcuff) impacts the levels
of cadet performance and results in attrition for some candidates. A recent study of an 8-
week Army ranger training course showed attrition due to the high physical demands of
training (Nindl et al., 2007). Thus, if detailed, strenuous training is required prior to the
job, it must be considered in the job analysis phase.

The typical order for determining the physical requirements is to (1) conduct a job
analysis, (2) gather ergonomic, physiological, and biomechanical data, where appropriate,
and (3) determine whether the working conditions impact task performance. For example,
high levels of heat and occlusive work clothing result in decreased internal fluid levels
and reductions in oxygen transport, which impact the worker's aerobic and muscle
contraction (e.g., strength) capacity (Dorman & Havenith, 2009).

Identification of Essential Tasks

Most of the job analysis steps are similar to those used in identifying essential tasks,
knowledge, skills, and abilities (KSA) for cognitive tests. However, job observations are of
increased importance to becoming familiar with job task parameters related to the
equipment used and the sequences of task performance. The movement patterns used to
complete a task may vary across incumbents, especially for men and women. Due to
physiological sex differences, women may employ movement patterns different from men
to complete tasks (Courtville, Vezina, & Messing, 1991; Gebhardt & Baker, 1997;
Stevenson, Greenhorn, Bryant, Deakin, & Smith, 1996).

To provide greater detail for the tasks, ergonomic data (e.g., weights, forces, distance
walked) can be collected through direct measurement or from equipment specification
documents. The addition of ergonomic information to task statements provides the
specific information helpful in determining the physical demand of tasks. To clearly
identify the physical demand of a task, task statements need to address individual
physical activities (e.g., handcuff a resistive individual). Task statements that are global in
nature (e.g., process a customer order) may include multiple physical activities (e.g., lift/
carry objects, operate forklift), thus not accurately defining the physical demand.

After generating a list of job tasks, typically a job analysis questionnaire is used to
identify the essential tasks and environmental working conditions. The task rating scales
may determine task frequency, importance, time spent, physical effort, or whether the task is
expected to be performed. When the purpose of a job analysis is to identify not only the essential job tasks,
but also the physical demands, specific information related to frequency and time spent
must be gathered. Therefore, the frequency scale should contain discrete (e.g., one to two
times per day) rather than relative (e.g., often) anchors. Similarly, the time spent rating
scale anchors should identify how long it takes to complete the task (e.g., 10 seconds, 5
minutes). Thus, the time spent and frequency ratings can be combined (e.g., frequency ×
time spent) to determine an overall task duration. For jobs with many tasks that are
performed infrequently, but are important to successful job performance, an
expected-to-perform scale can be used. For example, security officers at federal facilities are
responsible for defending the personnel from attacks and sabotage. For some officers
these events will not occur during their career. However, should the event occur, they
must be capable of responding. Thus, they are expected to be capable of performing the
tasks. Finally, a physical effort scale can be used to determine the physical demand of job
tasks and the overall job (Fleishman, Gebhardt, & Hogan, 1986). To determine if a job has
adequate physical demand to warrant applicant assessment, the overall rating mean
across all tasks or the number of tasks with ratings at or above a specified level can be
used to determine the level of physical demand or classify jobs by physical demand.
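
As a minimal sketch of how these ratings might be combined, the Python example below multiplies a frequency rating by a time spent rating to obtain an overall task duration and then summarizes the physical effort ratings across tasks. The task names, the numeric values assigned to the scale anchors, the 1 to 7 effort scale, and the cutoff of 5.0 are illustrative assumptions, not values taken from the studies cited.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskRating:
    name: str
    times_per_day: float     # numeric value of a discrete frequency anchor (assumed)
    minutes_per_time: float  # numeric value of a time spent anchor (assumed)
    physical_effort: float   # mean rating on a hypothetical 1-7 effort scale


def daily_duration_minutes(task: TaskRating) -> float:
    """Overall task duration = frequency x time spent."""
    return task.times_per_day * task.minutes_per_time


def job_demand_summary(tasks, effort_cutoff=5.0):
    """Return the mean effort rating and the count of tasks at or above the cutoff."""
    efforts = [t.physical_effort for t in tasks]
    return mean(efforts), sum(e >= effort_cutoff for e in efforts)


# Hypothetical ratings for three tasks (illustrative values only).
tasks = [
    TaskRating("Lift and carry 30-50 lb. items", 40, 0.5, 5.5),
    TaskRating("Operate forklift", 6, 10, 2.0),
    TaskRating("Drag loaded pallet", 1.5, 5, 6.0),
]
durations = {t.name: daily_duration_minutes(t) for t in tasks}
overall_mean, n_high = job_demand_summary(tasks)
```

Either summary (the overall mean or the number of high-effort tasks) could then be compared with a level chosen in advance to decide whether the job warrants applicant assessment.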

For the task ratings of physical job tasks to be completed accurately, individuals with
experience performing or directly observing the tasks need to complete the ratings. For
most jobs, incumbents are the best source for completing job analysis questionnaires. In
many situations, supervisors are not present when tasks are performed and cannot
provide the detail needed to complete the questionnaire.

A variety of algorithms can be used to determine the essential tasks; the appropriate choice depends
upon the nature of the job. For jobs with repetitive tasks, the frequency of task
performance can be used to determine essential tasks. For jobs in which tasks
are performed less frequently, but the consequences of error for those tasks are severe,
the importance ratings may be an effective way to determine essential tasks. For other
jobs, a combination of frequency, importance, and time spent ratings is used. One
combination uses a specified level of the task frequency (e.g., one to three times per
month) or importance mean rating. Another sums the raw or standardized frequency and
importance rating means and uses an a priori sum score (e.g., 0.00) as the cutoff to
identify essential tasks. A final combination standardizes (z-score) the frequency,
importance, and time spent ratings, sums the ratings, and compares the summed value to
a predetermined cutoff. The use of time spent ratings to identify essential tasks may be
useful for jobs in which tasks are performed infrequently, but the time duration to
complete the task is long or for jobs with substantial on-the-job training. Ratings of
frequency and time spent on job and during training can be collected and used to
determine essential job tasks. Additionally, the standardized frequency ratings for
training can then be weighted by the amount of time spent training. Then this weighted
rating can be standardized (z-score) and combined with other task ratings (e.g.,
frequency on job, importance) to identify essential tasks. Regardless of the task rating
combination used, the method selected should be congruent with the nature of the job.
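
The standardize-and-sum combination described above can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the task names and mean ratings are hypothetical, the population standard deviation is an assumed choice for the z-scores, and the cutoff of 0.0 mirrors the a priori example value mentioned in the text.

```python
from statistics import mean, pstdev

SCALES = ("frequency", "importance", "time_spent")


def z_scores(values):
    """Standardize ratings to mean 0 and SD 1 (population SD; an assumption)."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        return [0.0 for _ in values]  # no variation across tasks on this scale
    return [(v - m) / s for v in values]


def composite_scores(ratings):
    """Sum the standardized frequency, importance, and time spent ratings per task."""
    names = list(ratings)
    totals = dict.fromkeys(names, 0.0)
    for scale in SCALES:
        zs = z_scores([ratings[n][scale] for n in names])
        for name, z in zip(names, zs):
            totals[name] += z
    return totals


def essential_tasks(ratings, cutoff=0.0):
    """Keep the tasks whose composite score meets the predetermined cutoff."""
    return [n for n, total in composite_scores(ratings).items() if total >= cutoff]


# Hypothetical mean ratings for four tasks (illustrative values only).
ratings = {
    "Drag charged hose": {"frequency": 5, "importance": 4, "time_spent": 3},
    "Restrain resistive individual": {"frequency": 1, "importance": 5, "time_spent": 5},
    "Complete shift log": {"frequency": 2, "importance": 2, "time_spent": 1},
    "Inspect equipment": {"frequency": 4, "importance": 3, "time_spent": 2},
}
selected = essential_tasks(ratings)
```

In this example the infrequent but important restraint task is retained because its importance and time spent ratings offset its low frequency, which is the situation the combined algorithms are meant to handle.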

Finally, working condition and ergonomic questions should be included in the job analysis
questionnaire to provide information needed for test, criteria, and medical guidelines
development. For example, dragging a hose to a fire scene is a common essential task for
firefighters. However, to clarify the physical demand, factors such as hose size and length,
drag distance, and status (charged and filled with water or uncharged and without water)
need to be determined. These data are one means of defining the physical demands of the
job tasks.

Quantifying Physical Demand


The physical demand of a job can be assessed from a relative standpoint or direct
measurement. The relative approach identifies the physical abilities required to perform a
job and the relative level of each ability in comparison to other jobs. The direct
measurement approach involves assessing factors such as aerobic demand or force
required to perform essential tasks.

Physical Ability Identification

Past research in exercise physiology and industrial-organizational (I/O) psychology
has found different physical ability combinations that contribute to physical
performance (Baumgartner & Zuideman, 1972; Fleishman, 1964; Jackson, 1971; Myers,
Gebhardt, Crump, & Fleishman, 1993). Based on the findings from these studies, these
abilities are aerobic capacity (cardiovascular endurance), anaerobic power, muscular
strength (static strength), muscular endurance (dynamic strength), flexibility,
equilibrium, and coordination. The definitions of the physical abilities are shown in Table
13.1.

Factor analytic studies resulted in different physical ability models ranging from three-
component to six-component models (Hogan, 1991b; Jackson, 1971). Hogan's three-
component physical ability model consisted of muscular strength, endurance, and
movement quality based on data from workers in physically demanding jobs, whereas
others found six- and nine-component models that were based on a wider array of
physical performance (Fleishman, 1964; Hogan, 1991b; Jackson, 1971; Myers et al.,
1993). Due to the physiological determinants of performance, the six- and nine-
component models are more viable because they account for the different systems that
impact physical performance. Hogan's structure collapses across these systems (e.g.,
muscular strength, muscular endurance). However, from a physiological standpoint, jobs
requiring continuous muscle contraction such as an order selector who loads products on
pallets for a 10-hour shift and lifts 23,000+ lb. per shift require muscular endurance, as
well as muscular strength. Because of the high muscular endurance demand for this job,
muscular strength and endurance need to be examined separately. Thus, a six- or seven-
factor structure as shown in Table 13.1 is typically used to identify the physical abilities in
the workplace.

To demonstrate the job relatedness of the essential job tasks to prospective physical tests,
an analysis that links the essential tasks to physical abilities is needed. All jobs require
some level of each of the seven abilities with the levels ranging from minimal to high. For
example, operating a computer keyboard requires a minimal level of muscular strength,
whereas manually loosening a frozen nut on a pipeline requires a high level of muscular
strength. In addition, most tasks require varied levels of multiple physical abilities. For
example, carrying 50-lb. objects up three flights of stairs requires high levels of
muscular strength and muscular endurance, but low levels of flexibility and anaerobic
power.

Table 13.1 Physical Ability Definitions.

Ability              Definition

Aerobic capacity     Ability to utilize oxygen efficiently for activities performed for a
                     moderate time period (e.g., >5 minutes) at a medium- to
                     high-intensity level.

Anaerobic power      Ability to utilize stored energy (e.g., ATP-PCr and ATP-PCr + lactic
                     acid energy systems) to perform high-intensity activities for a rapid
                     time period (e.g., 5–90 seconds).

Muscular strength    Ability of the muscles to exert force. The size of the muscle (cross
                     section) dictates the amount of force that can be generated.

Muscular endurance   Ability of the muscles to exert force continuously for moderate to
                     long time periods (e.g., >2 minutes). The muscle fiber type (e.g.,
                     slow twitch) and chemical composition dictate the length of time
                     before a muscle reaches fatigue.

Flexibility          Ability of the joints (e.g., shoulder, hip) to move in all directions
                     thus allowing rotation and reaching activities. The elasticity of the
                     ligaments, tendons, muscles, and skin influences the level of
                     flexibility.

Equilibrium          Ability to offset the effect of outside forces (e.g., gravity, slippery
                     surface) and maintain the body's center of mass over the base of
                     support (e.g., feet).

Coordination         Ability to use sensory and neurosensory cues to perform motor
                     activities that require a sequential pattern and monitoring multiple
                     external stimuli (e.g., dodging an oncoming object).

The level of physical abilities needed to perform the job can be determined through direct
measurements or ratings of job tasks. The Fleishman Job Analysis Survey provides a set
of nine Likert physical ability rating scales (Fleishman, 1995). An alternate physical
demands inventory that uses eight physical ability rating scales (e.g., muscular strength,
muscular endurance, anaerobic power, aerobic capacity) and behavioral anchors targeted
at work behaviors has also been used to define job demands (Gebhardt, 1984b). These
scales are used to rate essential tasks on the levels of various physical abilities needed for
successful job performance. These ratings can be completed by individuals
knowledgeable of the tasks and rating scales (incumbents, supervisors, job analysts). For
example, if the job task is “lift and carry items weighing 30–50 lb.,” the rater will use a
scale to rate the task on how much muscular strength is needed to perform that task. The
individual task ratings can then be combined to generate a physical ability profile for a
job. This profile shows the relative level of each ability needed for successful essential
task performance and provides the link between the essential tasks and the abilities. In
addition, the profiles across jobs can be compared to determine their similarity in terms
of physical demand. This comparison is important for generating a single test battery that
is appropriate for multiple jobs (Gebhardt, Baker, Curry, & McCallum, 2005).
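
One way to picture the profile-building step is the sketch below, which averages task-level ability ratings into a job profile and then compares two jobs. The ability list is shortened to three abilities, the ratings and job names are hypothetical, and the mean absolute difference is only an assumed similarity index; the chapter does not specify which comparison statistic was used.

```python
from statistics import mean

# Shortened, illustrative set of abilities (the rating scales cover more).
ABILITIES = ("muscular_strength", "muscular_endurance", "aerobic_capacity")


def ability_profile(task_ratings):
    """Average each ability's rating across a job's essential tasks."""
    return {a: mean(r[a] for r in task_ratings.values()) for a in ABILITIES}


def profile_distance(profile_a, profile_b):
    """Mean absolute difference between two profiles (assumed similarity index)."""
    return mean(abs(profile_a[a] - profile_b[a]) for a in ABILITIES)


# Hypothetical task ratings for two jobs (illustrative values only).
order_selector = {
    "Lift and carry items weighing 30-50 lb.": {
        "muscular_strength": 5, "muscular_endurance": 4, "aerobic_capacity": 2},
    "Palletize products": {
        "muscular_strength": 3, "muscular_endurance": 6, "aerobic_capacity": 4},
}
dockworker = {
    "Load trailer": {
        "muscular_strength": 4, "muscular_endurance": 5, "aerobic_capacity": 3},
    "Move freight with hand truck": {
        "muscular_strength": 4, "muscular_endurance": 4, "aerobic_capacity": 4},
}

selector_profile = ability_profile(order_selector)
dock_profile = ability_profile(dockworker)
distance = profile_distance(selector_profile, dock_profile)
```

A small distance between two profiles would be one piece of evidence, alongside the task content itself, that a single test battery might be appropriate for both jobs.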

Direct Measurements to Quantify Physical Demand

Direct measurements of the physical job tasks or components include basic ergonomic
assessments such as weights and dimensions of objects handled, distances walked or run,
heights of objects climbed, and heights of shelves. These measurements can be
incorporated into the job tasks to provide a clear statement of the task demands and
provide physiological measures of work performed when combined with frequency and
time spent ratings. More sophisticated measures include measuring the force needed to
move (push) objects, to remove/replace equipment parts, or to loosen and tighten bolts.
The effectiveness of direct measurements on determining physical demand varies by job
with equipment-intensive jobs (e.g., mechanic) yielding more objective measures than
less equipment-centric jobs (e.g., patrol officer).

For more complex movement patterns, biomechanical analysis that uses physics and
anatomy principles can be performed. Biomechanical analysis was used by the National
Institute for Occupational Safety and Health (NIOSH) to generate a mathematical
model to calculate the load limit for lifting (Ayoub & Mital, 1989; Walters, Putz-Anderson,
Garg, & Fine, 1993). A biomechanical analysis was also used for a paramedic job to
determine the forces needed to lift a patient-loaded stretcher and to generate the passing
score for a selection test (Gebhardt & Crump, 1984). Another biomechanical analysis
method involves filming the work activity (e.g., pole climbing) and determining the forces
at the joints (e.g., knee) incurred during the movement. These types of biomechanical
analyses have been used to identify types of job movements that involve risk of injury
(Gebhardt, 1984a).

Physiological responses to work have been measured by assessing heart rate (HR)
response, oxygen uptake rate, rise in core body temperature, or lactate buildup during
work activities. HR was used to estimate the aerobic intensity associated with manual
materials handling jobs that required continuous lifting to palletize products for shipment
to stores (Gebhardt, Baker, & Thune, 2006). It was determined that the selectors were
working at 71–81% of their age-adjusted maximum HR [activity HR/(220 − age)]. This level
of work is classified by the American College of Sports Medicine (ACSM) as hard and is
very difficult to maintain over an 8-hour shift without work breaks (Thompson, Gordon, &
Pescatello, 2010). These data, coupled with validation data, were used to identify the
passing score for an entry-level test. This methodology was also used in a study of
military tasks that showed that the HR demand during soldier patrolling activities
corresponded to a simulated loaded march on a treadmill (Williams, Rayson, & Jone,
2007).
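
The percentage of age-adjusted maximum HR used above follows directly from the bracketed formula, activity HR/(220 − age). The short sketch below works one example; the worker's heart rate and age are hypothetical values chosen for illustration, and the function name is an assumption.

```python
def percent_max_hr(activity_hr: float, age: float) -> float:
    """Percent of age-adjusted maximum HR: 100 * activity HR / (220 - age)."""
    if not 0 < age < 220:
        raise ValueError("age must be between 0 and 220 for this estimate")
    return 100.0 * activity_hr / (220.0 - age)


# Hypothetical selector: 40 years old, mean working heart rate of 144 beats/min.
example_percent = percent_max_hr(144, 40)  # 144 / 180 of estimated maximum
```

For this hypothetical worker the estimated maximum HR is 180 beats per minute, so a working heart rate of 144 corresponds to 80% of maximum, which falls within the 71–81% range reported for the selectors.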

The aerobic capacity level for most jobs is low and does not warrant physical testing. For
jobs with higher aerobic demands, the energy costs of the aerobic tasks can be
determined by oxygen uptake measurements (VO2) (Bilzon, Scarpello, Smith, Ravenhill, &
Rayson, 2001; Sothmann, Gebhardt, Baker, Kastello, & Sheppard, 2004). This type of
research has been conducted for firefighter positions in different environments (urban,
forest, ship board) (Gledhill & Jamnik, 1992a; Sothmann, Saupe, Jasenof, Blaney,
Donahue-Fuhrman, Woulfe, et al., 1990; Sothmann et al., 2004). Results of this research
found that the energy expenditure (VO2) to perform essential firefighter tasks ranged
from 33.5 to 45.0 milliliters of oxygen/kilogram of body weight/minute (ml·kg⁻¹·min⁻¹). The
resulting measures were used to support the use of aerobic capacity selection tests and
establish passing scores for the tests.

In the warehouse industry, employers gather data related to the order size, item location,
item weight, distance moved during order processing, and order completion time for each
order a worker completes. These variables can be used to derive measures of
physiological work, along with providing information about workers that directly reflects
their productivity. These data can form a basis for setting test passing scores.
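
As a rough sketch of how such order records might be summarized, the example below totals the weight handled and distance moved and expresses the weight as a rate per hour. The record fields, the use of mean item weight, and the sample values are assumptions for illustration; an actual warehouse system would define its own fields, and these totals are simple workload indicators rather than a validated physiological measure.

```python
from dataclasses import dataclass


@dataclass
class OrderRecord:
    items: int                  # order size (number of items picked)
    mean_item_weight_lb: float  # average item weight in the order (assumed field)
    distance_ft: float          # distance moved while processing the order
    completion_min: float       # order completion time


def shift_workload(orders):
    """Summarize total weight handled, distance moved, and weight handled per hour."""
    total_lb = sum(o.items * o.mean_item_weight_lb for o in orders)
    total_ft = sum(o.distance_ft for o in orders)
    total_min = sum(o.completion_min for o in orders)
    lb_per_hour = total_lb / (total_min / 60.0) if total_min else 0.0
    return {"total_lb": total_lb, "total_ft": total_ft, "lb_per_hour": lb_per_hour}


# Two hypothetical orders completed by one worker (illustrative values only).
orders = [
    OrderRecord(items=100, mean_item_weight_lb=20.0, distance_ft=600.0, completion_min=30.0),
    OrderRecord(items=50, mean_item_weight_lb=40.0, distance_ft=400.0, completion_min=30.0),
]
summary = shift_workload(orders)
```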

Environments That Affect Physical Performance

Environmental aspects such as temperature, heat, and occlusive clothing influence the
quality of physical performance. Working in high temperatures (e.g., >90°) reduces
productivity. Research has shown that workers with higher aerobic capacity had higher
productivity levels in heated environments than individuals with lower levels of aerobic
capacity (Astrand, Rodahl, Dahl, & Stromme, 2003). Low temperatures can increase the
physical demands of selected tasks (e.g., coupling rail car air hoses) by either making the
equipment less pliable or making it necessary for workers to wear layers of clothing. The
protective clothing and personal protection equipment (PPE) worn by workers also
adversely impact task performance (Kenney, Hyde, & Bernard, 1993). These data can be
obtained through a review of the weather history, taking the temperature in the work
area, incumbent focus groups, job analysis questionnaires, and company operating
procedures. Identifying the environmental factors that affect performance not only
defines the impact on physical demand, but also provides information for medical
personnel to use when evaluating suitability for a job.

Types of Physical Tests


There are two types of physical tests used in the employment setting: basic ability or
physical ability tests and job simulations or work sample tests. Physical ability tests
assess the basic level of fitness in relation to a specific ability (e.g., aerobic capacity,
muscular strength). Job simulations are designed to replicate work tasks. Both types of
tests have been shown to be valid predictors of job performance.

Basic Ability Tests

Basic ability tests assess a single ability and provide for a setting in which the movements
in the test are limited to specific body parts (e.g., upper extremities), which
reduces the potential for an injury. Because the tests are based on the physiological
components (e.g., muscular strength, aerobic capacity) at an ability level, they can be
used for multiple jobs requiring a specific ability.

Aerobic capacity tests measure the ability of the lungs, heart, and blood vessels to
process oxygen for use in a maximum bout of exercise or work. Maximum aerobic
capacity (VO2 max) is assessed by subjecting an individual to defined increments of
increasing workloads. VO2 max can be measured using a treadmill or bicycle ergometer by
adjusting speed, resistance, and slope to increase the workload. The Bruce and the Balke
protocols are the most commonly used treadmill protocols (Thompson, Gordon, &
Pescatello, 2010). A regression equation is used to determine the level of aerobic capacity
or oxygen uptake value in milliliters of oxygen per kilogram of body weight per minute
(i.e., ml·kg⁻¹·min⁻¹), or the time to examinee exhaustion. A maximum aerobic test requires
the presence of a physician. Therefore, submaximal tests are typically used in the
employment setting.

Submaximal tests provide an estimate of VO2 max by using heart rate response to the
workload. Step tests and bicycle ergometer tests (e.g., YMCA or Astrand-Rhyming bike
tests) are used to obtain an estimate of VO2 submax by monitoring heart rate prior to and
after the test and using regression equations to determine the relationship of the
preexercise and postexercise heart rates (Astrand et al., 2003; Golding, 2000). Since the
promulgation of the Americans with Disabilities Act of 1990 (ADA), these tests and the
maximal test can be given only after a conditional offer of a job because monitoring heart
rate is considered a medical assessment. Thus, if an organization desires to administer an
aerobic capacity test prior to a conditional job offer, tests such as the 1.5-mile run, 1-mile
walk, or a step test completed at a set cadence (e.g., 96 steps/minute) for a specific
duration (e.g., 5 minutes) without heart rate monitoring are used.

Anaerobic power tests have a short duration (e.g., 10 seconds) and involve the use of
stored energy, as opposed to aerobic capacity tests that evaluate the ability to process
oxygen to generate energy. Many jobs require anaerobic power. For example, most police
foot chases last approximately 30 seconds (Baker, Gebhardt, Billerbeck, & Volpe, 2008;
Gebhardt, Baker, & Phares, 2008). Tests of anaerobic power include the 100-Yard Run,
Margaria Test, 10-second Arm Ergometer Test, and Wingate Anaerobic Test. The
Margaria Test involves sprinting up 12 stairs with timing devices placed on the 8th and
12th steps (McArdle, Katch, & Katch, 2007). The time between the 8th and 12th steps is
used to compute the power output [P = (body weight in kg × 9.8 × vertical height in
meters)/time]. There are adaptations of this test (e.g., Margaria-Kalamen Power Test), but
the premise is the same. The 10-second Arm Ergometer test involves cranking the pedals
of an arm ergometer, which is set at a high resistance (similar to the highest gear on a
bicycle), as fast as possible (Gebhardt & Baker, 1992). The score is the number of
revolutions completed. The Wingate Anaerobic Test assesses peak anaerobic power,
anaerobic fatigue, and total anaerobic capacity (Inbar, Bar-Or, & Skinner, 1996). This test
involves pedaling a bicycle ergometer for 30 seconds with a resistance level of 0.075 kg
per kilogram of body weight of the examinee. Four measures can be derived from data
gathered during the test: (1) peak power (number of revolutions × flywheel distance in
first 5 seconds), (2) relative peak power (peak power/body weight), (3) anaerobic fatigue
([highest 5-second peak power/lowest 5-second peak power] × 100), and (4) anaerobic
capacity (∑ 5-second peak power over 30 seconds).
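
The Margaria power formula and the four Wingate measures described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the scoring software used in the cited studies: the function names are hypothetical, the Wingate function assumes that a power value (in watts) is already available for each of the six 5-second intervals, and the sample numbers are invented. The fatigue measure is computed as the ratio written in the text.

```python
G = 9.8  # gravitational constant (m/s^2) used in the text's power formula


def margaria_power(body_mass_kg, vertical_height_m, time_s):
    """Power output (watts) = (body mass x 9.8 x vertical height) / time."""
    if time_s <= 0:
        raise ValueError("time_s must be positive")
    return body_mass_kg * G * vertical_height_m / time_s


def wingate_resistance(body_mass_kg):
    """Flywheel resistance (kg) at 0.075 kg per kilogram of body weight."""
    return 0.075 * body_mass_kg


def wingate_measures(interval_power_w, body_mass_kg):
    """Derive the four measures from six 5-second power values (assumed input)."""
    peak = max(interval_power_w)
    lowest = min(interval_power_w)
    return {
        "peak_power": peak,
        "relative_peak_power": peak / body_mass_kg,
        # Highest / lowest 5-second power x 100, as written in the text.
        "anaerobic_fatigue": peak / lowest * 100,
        "anaerobic_capacity": sum(interval_power_w),
    }


# Hypothetical examinee: 70 kg, 0.70 m rise between the 8th and 12th steps, 0.5 s.
print(round(margaria_power(70, 0.70, 0.5), 1))
print(wingate_resistance(70))
print(wingate_measures([700, 650, 600, 550, 500, 450], 70))
```

Keeping the interval power values as the input separates the arithmetic from the equipment-specific step of converting flywheel revolutions to power, which varies by ergometer.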

In the world of work, muscular strength and muscular endurance are the two abilities
most commonly tied to success in physically demanding jobs. Muscular strength tests
can be classified as isometric, isotonic, or isokinetic. Isometric or static strength tests
involve maintaining the joint(s) at a predetermined degree of flexion (e.g., 90°) and
producing a maximal muscle contraction (Astrand et al., 2003; McArdle et al., 2007). For
example, in the arm lift test an individual stands on a platform with the arms next to the
torso and the elbows flexed to 90° (Chaffin, Herrin, Keyserling, & Foulke, 1977). A bar,
connected to the platform, is placed in the hands and the individual is instructed to exert
a maximum force in an upward direction. The score is the force generated by applying
pressure to the immovable bar. Isometric tests have been used in the employment setting
to measure shoulder, arm, trunk, grip, knee, and leg strength (Blakely, Quinones,
Crawford, & Jago, 1994; Baumgartner & Jackson, 1999; Gebhardt et al., 2005).

Isotonic tests entail movement of a joint(s) through a range of motion, thus resulting in
concentric (shortening) and eccentric (lengthening) movement of the muscle fibers
(Astrand et al., 2003; McArdle et al., 2007). These actions can be observed in any
activity in which a weighted object is lifted from and lowered to the ground. Isotonic tests
can be used to measure muscular strength and muscular endurance. To evaluate
muscular strength the isotonic test must involve a resistance that can barely be
overcome. For example, a one-repetition maximum bench press uses a weight on the
barbell that the individual can just push to full arm extension. When used to measure
muscular endurance, the resistance or workload is lowered (e.g., lower weight) and the
duration is increased, allowing for multiple repetitions of a movement. The YMCA Bench
Press Test uses a lighter weight that is pressed to a 30-lifts-per-minute cadence to
measure muscular endurance (Golding, 2000). The test is terminated when the individual
can no longer maintain the cadence. Similarly, the arm endurance test requires pedaling
an arm ergometer for a specified time period (e.g., 2 minutes) at half the workload of the
arm power test mentioned above.

Isokinetic testing combines characteristics of isometric and isotonic assessments by
requiring movement at a preset speed through a set range of motion. During isokinetic testing,
the limb (e.g., arm) experiences substantial resistance during both the flexion and extension
movements. In isotonic testing, by contrast, resistance occurs in only one direction (e.g., flexion). Isokinetic
testing is typically completed for the torso, shoulders, elbows, or knees. This type of
testing requires computerized equipment that controls the speed (degrees/second) of
movement. Its measurement unit is the torque (τ) or angular force generated by rotating
a limb (e.g., leg) about an axis (e.g., knee joint), which results in a torque curve
(McGinnis, 2007). A cumulative score is generated across the joints tested to produce a
strength index. Isokinetic tests were originally used for strength training and later
evolved for use in injury evaluation and employment assessment (Gilliam & Lund, 2000).

Flexibility and equilibrium are factors involved in a variety of jobs (e.g., longshoreman,
line worker). Studies have shown that these two abilities are usually not significant
predictors of job performance unless high levels of the ability (e.g., lash containers at a
height of 40+ feet) are required (Gebhardt, Schemmer, & Crump, 1985). The sit and
reach test (seated, legs straight, reach forward) and the stabilometer test (balance on a
platform with a center fulcrum) have been found to be related to job performance.
However, low correlations between test and job performance (r = 0.00 to 0.18) are
generally found (Baumgartner & Jackson, 1999; Gebhardt, Baker, & Sheppard, 1998;
Gebhardt et al., 2005).

Basic ability tests have several advantages. They assess a single ability, but can be used
to assess multiple abilities by adjusting the workload and/or duration (e.g., muscular
strength–arm power; muscular endurance–arm endurance). The tests can be used to
evaluate multiple jobs that require a specific ability. Basic ability tests can be set up in a
small area and easily stored when not in use. Furthermore, due to the controlled nature
of these tests, the probability of injury during testing is limited. The disadvantage of basic
ability tests is that they do not resemble the job and thus lack face validity. Additional
listings of basic ability tests are located in other studies (Hogan, 1991a; Landy et al., 1992;
McArdle et al., 2007).

Job Simulation Tests

Job simulations (work samples) provide the face validity not found in basic ability tests.
However, there are limitations to job simulations. Typically, they can be used for only one
job, whereas basic ability tests can be used for multiple jobs. The Equal Employment
Opportunity Commission (EEOC) Uniform Guidelines are explicit in the criteria for
developing simulations (Equal Employment Opportunity Commission, 1978). Critical or
essential tasks may be simulated, but skills learned on the job or in training may not be
incorporated into the test. Job simulations have been used primarily for selection in
manual materials handling and public safety jobs. Two forms of simulations have been
used for manual materials handling jobs. The first involves lifting and carrying weighted
objects similar to the weights of products encountered on the job (Gebhardt et al., 1992).
These tests are scored by identifying the number of objects moved in a specific time
frame or the time to move a set number of objects. The distance the objects are carried
depends upon the results of ergonomic assessment during the job analysis. A second
format for lifting tests dictates the pace (e.g., every 6 seconds) and the height of the lift
and is entitled isoinertial lifting (Mayer, Barnes, Nichols, Kishino, Coval, Piel, et al.,
1988). The weight is lifted to multiple heights (e.g., shoulder, waist) and is increased until
the individual can no longer keep up the defined pace or lift the weight. Progressive
isoinertial lifting tests provide an effective assessment of lifting capacity (Mayer, Gatchel,
& Mooney, 1990; Stevenson, Andrew, Bryant, Greenhorn, & Thomson, 1989).
Furthermore, they are an inexpensive screening measure for manual materials handling
jobs and have been found to be predictive of job performance and injuries (Gebhardt,
Baker, & Thune, 2006; Lygren, Dragesund, Joensen, Ask, & Moe-Nilssen, 2005; Schenk,
Klipstein, Spillmann, Stroyer, & Laubli, 2006).

Job simulations are employed frequently for public safety jobs. A firefighter simulation
may include (1) stair climbing, (2) hose drag, (3) equipment carry, (4) ladder raise, (5)
forcible entry, (6) crawling during a search, (7) dragging a victim, and (8) pulling ceiling.
Regardless of the type of job simulation (e.g., lifting, pursuit of suspect), the intensity at
which the simulation is performed, distance walked/run, duration, or number of objects
handled should mirror the job demands and the order in which the events are performed
on the job. Fidelity with the actual job tasks is important to the legal defensibility of the
job simulation.

When selecting a scoring system for a job simulation, it is important to consider the
criteria that underlie effective job performance (e.g., emergency response, productivity).
Some jobs have a specific number of task iterations that must be performed (e.g.,
assembly line), whereas others require a fast response (e.g., chase a suspect). An
example that includes productivity is in the longshore industry where workers lash/
connect and unlash containers to the deck of a ship using long rods (e.g., 16 ft.) that
weigh up to 51 lb. and turnbuckles (40 lb.). After the rod is hung, the turnbuckle is
attached and tightened to a torque value of 100 ft-lb. The job analysis showed that 120+
rods were hung by a longshore worker 45% of the time during an 8-hour period
(Gebhardt & Baker, 1997). The test consisted of hanging rods in a corner casting located
10 and 20 feet above the ground and taking them down. On the job, longshore workers
must complete the lashing task quickly because of the high costs of a ship sitting in port
(∼$75,000+/day). Due to the quick pace of this job function, the scoring system selected
was time to complete the task.

Job simulations have several advantages, ranging from face and content validity to
enabling the employer to confirm that the applicant can perform a segment of the critical
job tasks. However, their disadvantages include difficulty in generating a meaningful
scoring system, limited portability, and the need for a larger testing site. Job simulations
typically require a substantially larger test area and more equipment (e.g., boxes,
platforms, arrest simulator), which results in higher administration and storage costs.
In addition, unlike basic ability tests that control movement, job simulations have a
higher potential for injury (e.g., slip and fall in a pursuit simulation). Therefore, it is
important to consider all the factors related to the potential tests (e.g., scoring,
implementation, equipment storage) prior to selecting or designing a basic ability or job
simulation test.

Parameters Related to Test Design or Selection


As with cognitive tests, test reliability and adverse impact are important when
designing or selecting a physical test. The test–retest reliability of job simulations such as
a pursuit run (0.85), maze crawl (0.76), lift/carry simulations (0.50–0.57), and pole
climbing (0.79) was comparable to that of basic ability tests, which ranged from 0.65 to 0.95
(Baker, Gebhardt, & Curry, 2004; Gebhardt et al., 1998; Myers et al., 1993). In general,
the reliability of job simulations ranges from 0.50 to 0.92 for jobs in the public and private
sectors (Gebhardt et al., 1998; Jackson, Osburn, Laughery, & Vaubel, 1992).
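
A test–retest reliability coefficient of the kind reported above is commonly computed as the correlation between scores from two administrations of the same test. The following is a minimal sketch, assuming a Pearson correlation is the statistic of interest; the function name and the pursuit-run times are hypothetical and are not data from the cited studies.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical pursuit-run times (seconds) for the same six people on two occasions.
session_1 = [62.0, 71.5, 58.3, 80.1, 66.4, 75.0]
session_2 = [63.2, 70.0, 59.1, 78.4, 68.0, 73.5]
print(round(pearson_r(session_1, session_2), 2))
```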

In physical testing, adverse impact is most pronounced by sex, followed by age and race
and national origin (RNO). The male–female physiological structure (e.g., lean body mass,
percent body fat, height, weight) contributes to large test and effect size differences
(e.g., >1.0) by sex for both basic ability tests and job simulations. Past research has
demonstrated that sex differences are most pronounced for tests involving strength,
aerobic capacity, and anaerobic power (Blakely et al., 1994; Gebhardt et al., 2005;
Gebhardt, 2007; Hogan, 1991a). Although some research has found fewer test score
differences when controlling for physiological differences such as lean body mass (Arvey,
Landon, Nutting, & Maxwell, 1992), this approach will not avert sex differences and may
not meet legal scrutiny. Consistent with physiological research, women perform similarly to men
on tests of equilibrium and flexibility (e.g., maze crawl, stabilometer) (Gebhardt & Baker,
1997).
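
An effect size such as the ">1.0" cited above is typically a standardized mean difference between two groups. The sketch below assumes the pooled-standard-deviation form (often called Cohen's d); the source does not say which variant the cited studies used, and the function name and scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """(mean_a - mean_b) / pooled SD, using sample variances."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical arm-lift scores (lb) for two applicant groups.
group_a = [95, 105, 110, 100, 90]
group_b = [85, 95, 100, 90, 80]
print(round(standardized_mean_difference(group_a, group_b), 2))
```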

Currently there is an influx of older workers (>40 years old) into physically demanding
jobs due in part to the longer life span of the U.S. population and economic issues. The
physiological literature is replete with studies showing decrements in physical
performance with age (Akima et al., 2001; McArdle et al., 2007). Strength declines up to
15% per decade, with greater decreases for 50 to 70 year olds (Lynch et al., 1999).
Similarly, maximum aerobic capacity (VO2max) declines 0.4–0.5 ml·kg⁻¹·min⁻¹ per year due
to a decrease in cardiac output and stroke volume of the heart (Bortz & Bortz, 1996). This
result can lead to a decline in VO2max of 9–10% per decade after age 30 (Joyner, 1993).
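
The per-year and per-decade figures above can be related with simple arithmetic. The sketch below assumes a linear decline at 0.45 ml·kg⁻¹·min⁻¹ per year (the midpoint of the range cited) and a hypothetical baseline of 45 ml·kg⁻¹·min⁻¹ at age 30; both values and the function names are illustrative assumptions, not results from the cited studies.

```python
def project_vo2max(vo2max_at_baseline, baseline_age, target_age, decline_per_year=0.45):
    """Linear projection of VO2max (ml/kg/min) from a baseline age to a later age."""
    if target_age < baseline_age:
        raise ValueError("target_age must not be earlier than baseline_age")
    return vo2max_at_baseline - decline_per_year * (target_age - baseline_age)


def percent_decline(baseline, projected):
    """Decline from baseline expressed as a percentage of the baseline value."""
    return (baseline - projected) / baseline * 100


# Hypothetical worker: 45 ml/kg/min at age 30, projected one decade ahead.
at_40 = project_vo2max(45.0, 30, 40)
print(round(at_40, 1), round(percent_decline(45.0, at_40), 1))
```

Under these assumptions, a yearly decline of 0.45 units from a baseline of 45 corresponds to a 10% loss over a decade, which matches the per-decade range quoted in the text.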

Two approaches are used to address sex and age differences in the employment setting.
The first is participation in an organized physical fitness program to increase strength
and VO2max. The second is to use statistical procedures (e.g., differential prediction) to
examine whether the mean differences influence test fairness (Bartlett, Bobko, Mosier, &
Hannan, 1978). In most cases physical performance tests are equally predictive across
sex, RNO, and age groups (Gebhardt & Baker, 2010).

A very large study of employment data for over 50,000 men from blue collar and public
safety jobs found significant test score differences with effect sizes ranging from 0.29 to
0.52 between whites and African-Americans (Baker, 2007). These differences showed that
whites performed better on tests involving continuous and/or quick movements (e.g.,
pursuit run, arm endurance) and they performed significantly better than African-
Americans and Hispanics on a test of aerobic capacity (1.5 mile run). However, whites
and African-Americans outperformed Hispanic men (Baker, 2007; Blakely et al., 1994) on
tests of muscular strength.

The goal in physical testing is to select or design a test that has less adverse impact than
other options. It should be noted that job simulations typically will not have less adverse
impact than basic ability tests. This occurs because the actual job tasks normally require
handling and pushing heavy objects (e.g., lift an 80-lb. box of meat from a 60-inch-high
rack to a pallet). To ensure that tests with less adverse impact are used, the literature
should be reviewed and a pilot of all new tests should be conducted to eliminate
movements that could be compensated for by the use of equipment or alternate
techniques.

Finally, the safety of the applicant and the logistics related to test set-up and
administration must be taken into account. In job simulation tests, ensuring that the
testing area (e.g., floor surface, distances to a wall) is safe requires more maintenance
and set-up effort than basic ability tests because of the greater number of components
(e.g., stairs, fences, sleds, simulators). Basic ability tests have a more controlled setting
with less movement. Because both types of tests are valid and can be administered safely,
selection of a test should focus on the reduction of adverse impact and reliability.

Physical Test Validity

From an employment perspective the types of validity evidence required are defined by
the Uniform Guidelines (EEOC, 1978) and other testing publications [American
Educational Research Association et al., 1999; Society for Industrial and Organizational
Psychology (SIOP), 2003]. Each of these publications stresses the need for both
background theory and evidence to support the relationship between the test and job
performance. The validity evidence of performance on physical tests is found in many
disciplines in relation to job functioning, injury reduction, and disease prevention. In the
employment area many studies used criterion-related validity not only to document test
validity, but to provide data for identifying minimum passing score(s).

Past research demonstrated that the relationship between physical test scores and job
performance measures (e.g., work samples, supervisor/peer ratings, productivity data)
ranged from low to high (r = 0.01–0.81) depending upon the criterion measure used
(Arvey et al., 1992; Blakely et al., 1994; Gebhardt, 2000). The very low correlations were
found for flexibility and equilibrium measures. When the criterion measure was
supervisor and/or peer ratings, the simple validity coefficients for basic ability tests
ranged from 0.02 to 0.79 (Blakely et al., 1994; Gebhardt & Baker, 2010). When basic
ability test scores are compared to work sample criterion measures, the simple validities
ranged from 0.01 to 0.81 (Gebhardt & Baker, 2010). Tests used to evaluate shoulder, arm,
torso, and leg strength were found to have the highest validity coefficients (r = 0.39–0.81)
with isometric and isotonic tests being higher than isokinetic tests (Baumgartner &
Jackson, 1999; Blakely et al., 1994; Gebhardt & Baker, 2010).

Job simulations have been used in a variety of criterion-related validity studies and were
found to be related to supervisor/peer and job productivity measures. When supervisor/
peer assessments were used in manual materials handling simulations, the validity
coefficients for the simulation ranged from 0.37 to 0.63 (Anderson, 2003; Gebhardt &
Baker, 2010). A series of firefighter studies established the aerobic capacity of firefighter
job tasks (e.g., pulling ceiling, drag hose) and the tests most predictive of job
performance (Sothmann et al., 1990; Sothmann, Saupe, Jasenof, & Blaney, 1992;
Sothmann, Gebhardt, Baker, Kastello, & Sheppard, 1995; Sothmann et al., 2004). These
studies established the minimum level of oxygen uptake (VO2max) required to perform
firefighting activities (33.5 ml·kg⁻¹·min⁻¹) and the relationship (R = 0.70) between a
battery of strength and aerobic capacity tests and job performance (Sothmann et al.,
2004). Similar validity coefficients (r = 0.45–0.87) for job simulations were found for
other jobs (e.g., law enforcement) that used both physiological and criterion-related
validity approaches (Anderson, 2003; Gebhardt, Baker, & Sheppard, 1999a; Baker,
Gebhardt, Billerbeck, & Volpe, 2009). One study used biomechanical modeling to
determine the force to lift a patient-loaded gurney (Gebhardt & Crump, 1984). A later
validation study found the force to perform a dynamic lift was within 2 pounds of the
biomechanical model force. These data and Sothmann's data demonstrate the value of
determining the actual parameters (e.g., force) related to essential task performance.

One of the advantages of job simulations is the ability to use job analysis results to
establish content validity. A disadvantage of the content validity approach is the lack of
data to assist in setting a passing score. This difficulty emphasizes the importance of
gathering measurements of the workplace and worker (e.g., aerobic capacity) if a
criterion-related validity study is not conducted.

Alternative Validation Methods

Although it is desirable to conduct a criterion-related validation study to validate physical
performance tests, it may not be feasible for all jobs and/or organizations. Conducting
such a study for physical performance tests is time and labor intensive. Typically, 3–4
hours of participant time is needed to complete the validation data collection. Inability to
recruit an adequate sample is the prime reason a criterion-related validity study is
not conducted. According to power analysis tables, a sample of approximately 100 subjects
is needed to obtain an observed power greater than 0.90 for a four-test battery with an
expected multiple R of 0.50.

Alternate validation strategies are available to provide organizations with criterion-related
validated tests without conducting a local validation study. These alternate
validation strategies are test transportability, job component validity (JCV), and synthetic
validity. All three strategies use validation information from prior research.

Test Transportability.
Test transportability pertains to a test's validity evidence being transported from a job in
one organization to the same or similar job in another organization (Gibson & Caplinger,
2007). Test transportability extends criterion-related validity to unstudied job(s). This
validation method can be used to transport a single test or battery of tests. The steps to
conduct test transportability for physical performance tests are similar to the steps used
to transport other assessments. The EEOC Uniform Guidelines (1978) has accepted test
transportability as a validation approach if the following four conditions are met:

1. The incumbents in the “new job” must perform substantially the same or similar
work behaviors as incumbents in the original job;
2. Criterion-related validity evidence is present that demonstrates the test validity;
3. The test is fair to protected groups; and
4. Other variables (e.g., work methods) that may affect validity have been
considered.

The EEOC Uniform Guidelines (1978) indicates that work behaviors are defined as
activities performed to achieve the objectives of the job and the similarity must be
established through job analysis. Thus, similarity can be determined on the basis of
essential tasks or measurable KSAs derived from appropriate job analyses (Gibson &
Caplinger, 2007). Methods to determine job similarity include overlap of essential tasks,
correlations between job ratings, distance statistics between job ratings, and overlap of
KSAO requirements. When the job similarity is established and the four EEOC conditions
have been met, the test and its validation evidence can be transported to the target job.
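
Two of the similarity methods named above, essential-task overlap and a distance statistic between job ratings, can be sketched as follows. This is an illustration under assumptions: the function names, task labels, and rating values are hypothetical, Euclidean distance is only one possible distance statistic, and the source does not specify any cutoff for "similar enough."

```python
from math import sqrt


def task_overlap(source_tasks, target_tasks):
    """Proportion of the target job's essential tasks that are also essential in the source job."""
    source, target = set(source_tasks), set(target_tasks)
    if not target:
        raise ValueError("target job has no essential tasks")
    return len(source & target) / len(target)


def profile_distance(source_ratings, target_ratings):
    """Euclidean distance between two ability-rating profiles covering the same abilities."""
    if source_ratings.keys() != target_ratings.keys():
        raise ValueError("profiles must rate the same abilities")
    return sqrt(sum((source_ratings[k] - target_ratings[k]) ** 2 for k in source_ratings))


# Hypothetical essential tasks for the original (source) job and the new (target) job.
source_tasks = {"climb stairs", "carry equipment", "restrain person", "patrol on foot"}
target_tasks = {"climb stairs", "carry equipment", "patrol on foot", "open heavy gate"}

# Hypothetical ability ratings on a 1-5 scale from each job analysis.
source_profile = {"muscular strength": 5, "aerobic capacity": 4, "flexibility": 2}
target_profile = {"muscular strength": 5, "aerobic capacity": 3, "flexibility": 2}

print(task_overlap(source_tasks, target_tasks))
print(profile_distance(source_profile, target_profile))
```

Any threshold applied to these statistics would be a judgment made by the analyst and documented alongside the job analysis.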

Test transportability using essential tasks (Fried v. Leidinger, 1977) and required abilities
(Bernard v. Gulf Oil Corporation, 1986) has been upheld by the courts for nonphysical
assessments. There are no legal cases related to test transportability and physical
performance tests. However, this method was used to transport physical performance test
batteries. For example, a physical test battery validated for selection and retention of
nuclear security officers was transported to other nuclear security officer jobs at other
generating stations (Baker & Gebhardt, 2005a). The similarity of the two jobs was
determined by essential task overlap and physical ability profiles.

Job Component Validity (JCV).
JCV infers a test's validity for a job from past validation research, without local
empirical validation evidence (Hoffman, Rashkovsky, & D’Egido, 2007). Unlike test
transportability, JCV provides test validity evidence for a job from other jobs with
different tasks. The basis of JCV is (1) jobs requiring the same components need the same
abilities for effective performance and (2) the validity of a component assessment is
consistent (Jeanneret, 1992). To use JCV for physical tests, job analysis information is
needed for the new job and archive jobs, along with numerous validation studies that
demonstrate test validity for the component of interest.

The Position Analysis Questionnaire (PAQ) and test validity information have been used to
generate physical test batteries for jobs with similar demands. PAQ scores were used to
identify the physical ability requirements for job families (Hoffman, 1999). This
information was combined with data from a validation study to support the use of
physical tests for jobs not included in a previous study. Hoffman, Rashkovsky, and D’Egido
(2007) indicated that the PAQ scores in the 1999 study could be used to accurately
predict the test scores for the job studied.

Synthetic Validity.
Synthetic validity is similar to JCV in that both are based on the abilities needed to
perform a specific job component and the presence of consistent test validity for the
component of interest. The difference between synthetic validity and JCV is that synthetic
validity computes validity coefficients using job analysis and validity data and JCV
predicts validity coefficients (Johnson, 2007). Although there are no published synthetic
physical test validity studies, recent research using a large physical test database was
conducted to generate a valid test battery (Baker et al., 2009).

To conduct a synthetic validation for physical tests, job analysis data, archival basic
ability test scores, and archival measures of job performance are needed. The archival
data (e.g., test scores) are gathered from similar or different jobs that require the ability
assessed by the test. Job analysis is used to identify the abilities associated with the
essential tasks for the new job. Next, the physical tests that assess the required abilities
are identified. Archival test scores and job performance measures for the validation
study participants on the tests of interest are extracted from the large database to a
synthetic test database. This synthetic database is used to generate test batteries using
various statistical procedures (e.g., regression) and to determine test fairness. This
procedure results in valid physical test batteries that assess the physical abilities needed
by the new job. The synthetic approach was also used to supplement validation data when
an adequate sample could not be attained from the source organization (Gebhardt, Baker,
& Sheppard, 1999b).

Scoring Physical Performance Tests


The format utilized to score individual physical performance tests depends upon the type
of test or job simulation being used. Measurement units for basic ability tests include
number completed (e.g., sit-ups), pounds or kilograms of force applied, time to complete
the test, VO2submax, or successful test completion. Job simulation units typically include
time to complete the simulation or number completed. Regardless of the measurement
unit, the scoring must be objective. Past litigation has shown that subjective evaluations
of candidate performance (e.g., used proper lifting technique) are not acceptable
assessments of applicant performance (EEOC v. Dial Corp, 2006). If specific test
performance criteria are required (e.g., arms must be fully extended), the correct and
incorrect criteria need to be demonstrated to applicants.

Different approaches are used to score physical performance test batteries. The most
common are the multiple hurdle (passing score for each test) and compensatory (sum test
scores) approaches. A third approach combines the compensatory and multiple hurdle
models. With the multiple hurdle approach, an applicant must achieve or exceed the
passing score on each test to pass the battery. The compensatory approach uses a
weighted or unweighted combination of the individual test scores to generate an overall
score that must be achieved to pass the test battery. There are different methods to
combine test scores for the compensatory scoring approach. Some researchers used a
simple raw test score sum. However, a simple raw score sum is problematic when the
measurement units are not the same (e.g., pounds, revolutions) or the range of the units
is considerably different. When the range of the measurement unit is different across the
tests (e.g., arm lift 30–130 lb.; trunk pull 150–450 lb.), using an unweighted sum results
in inadvertently giving more influence to the test with the larger measurement units.

To ensure that each test is weighted appropriately, regression analysis, unit weighting of
standardized scores, and assigning point values for specific test score ranges (e.g.,
stanine) are used. For the standard score unit weighting approach, each test score is
standardized before combination. This ensures that each test is weighted approximately
equally in the combined score (equality of weights also depends on the intercorrelation
of the predictors; only when the predictors are uncorrelated does standardizing achieve
unit weights). If standard score unit weighting is used, adequate samples of test scores
are needed to ensure a representative distribution of performance. Use of regression
analysis not only allows for weighting of the tests within a battery in accordance with their
level of prediction, it also accommodates different test measurement units. The point
value approach converts the scores for each test in the battery to a point value after
which the point values are summed to yield an overall test score. For some test batteries,
the point value ranges for each test are the same to allow for equal contribution of each
test in the battery (Gebhardt & Baker, 2007). For other test batteries, point value ranges
incorporate a weighting factor (e.g., regression results). Regardless of what type of point
values and ranges are generated, they must be based on the distribution of test scores
(e.g., percentile, stanine scores, standard error of the difference) (Cascio, Outtz, Zedeck,
& Goldstein, 1991).
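
To make the weighting problem concrete, the following minimal Python sketch compares a raw score sum with a sum of standardized (z) scores. The five applicants and their scores are hypothetical, chosen only to span the ranges mentioned above, and the function name z_scores is illustrative rather than taken from any cited study.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores to mean 0 and standard deviation 1."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


# Hypothetical raw scores in pounds for five applicants (illustration only).
arm_lift = [30, 55, 80, 105, 130]        # range of 100 lb.
trunk_pull = [450, 375, 300, 225, 150]   # range of 300 lb.

# Unweighted raw sum: the test with the wider range drives the total.
raw_sum = [a + t for a, t in zip(arm_lift, trunk_pull)]
# raw_sum is [480, 430, 380, 330, 280]

# Unit-weighted sum of standardized scores: both tests contribute equally.
z_sum = [a + t for a, t in zip(z_scores(arm_lift), z_scores(trunk_pull))]
```

In this contrived sample the raw sum orders the applicants exactly as the trunk pull does, while the standardized sums are all essentially zero because each applicant's high score on one test is offset by an equally low score on the other.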

Comparison of the compensatory and multiple hurdle scoring approaches found that, in
general, less adverse impact by sex is found for the compensatory approach than the
multiple hurdle approach (Gebhardt, 2000; Sothmann et al., 2004). A disadvantage of the
compensatory approach is that some individuals can compensate for extremely low scores
on one test with extremely high scores on another test and pass the test battery. This
typically occurs for men who perform well on muscular strength tests and poorly on tests
of other abilities (e.g., aerobic capacity). To alleviate this problem, multiple hurdle and
compensatory models have been used together in one scoring approach (Gebhardt &
Baker, 2007). For this combined approach, baseline or minimum scores that cannot be
compensated for by other tests are established for each test in the battery. The baseline
scores prevent an individual who scores extremely low on one test from passing the test
battery. To pass a test battery an individual must meet or exceed the baseline scores for
each test and the overall summed score.
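
The combined decision rule described above can be sketched in a few lines of Python. The test names, point values, baseline scores, and overall cut below are hypothetical and serve only to show the logic; they are not values from any of the cited test batteries.

```python
def passes_battery(scores, baselines, overall_cut):
    """Combined model: every test must meet its baseline (hurdle part) and
    the summed score must meet the overall cut (compensatory part)."""
    meets_baselines = all(scores[test] >= baselines[test] for test in baselines)
    meets_overall = sum(scores.values()) >= overall_cut
    return meets_baselines and meets_overall


# Hypothetical point values (1-9 per test) and cutoffs, for illustration only.
baselines = {"arm_lift": 3, "sit_ups": 3, "run": 3}
overall_cut = 15

# High strength scores cannot offset a run score below the baseline.
strong_but_unfit = {"arm_lift": 9, "sit_ups": 7, "run": 1}
balanced = {"arm_lift": 5, "sit_ups": 5, "run": 5}

strong_result = passes_battery(strong_but_unfit, baselines, overall_cut)
balanced_result = passes_battery(balanced, baselines, overall_cut)
```

Under a purely compensatory rule the first applicant's total of 17 would pass; with the baselines in place the applicant fails, while the second applicant passes with a total of 15.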

Another use of the combined multiple hurdle and compensatory approach occurs when all
tests in the battery are not administered at the same time. If a test battery is given prior
to a conditional job offer, any test in the battery that collects medical measures (e.g.,
heart rate) must be given after the conditional job offer (ADA, 1990). Therefore, passing
scores can be established that allow for part of the test battery to be given at a later date.
This scoring approach was used for a firefighter test in which applicants completed the
first part of the battery in a compensatory format, then completed the final test (VO2max)
after the job offer (Sothmann et al., 2004).

Setting Passing Scores


Physical performance test passing scores must reflect minimally acceptable job
performance, be linked to the physical demands of the job, and be “reasonable and
consistent” with proficient job task performance (EEOC, 1978). Due to physiological
differences between men and women such as height, weight, and lean body mass, it is
common to find significant test score differences. The test score differences are
especially large for tests of muscular strength, and these differences may result in
differential passing rates for men and women. Because most test batteries will have
adverse impact against women, it is imperative that the passing scores reflect effective
and safe levels of performance. Two types of passing scores, criterion-referenced and
norm-referenced, have been used for physical tests (Landy & Conte, 2007; Safrit & Wood,
1989).

Criterion-referenced passing scores are more commonly used in the employment setting
because they are based on physical test and job performance data. These validation data
are used to generate expectancy tables, contingency tables, and pass/fail rates that are
used to establish passing scores that maximize prediction effectiveness by maximizing
true-positive and true-negative decisions and minimizing false-positive and false-negative
decisions. Criterion-referenced passing scores can be set using data from incumbents
(concurrent) or candidates (predictive).
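
As a rough illustration of how validation data feed a contingency table, the Python sketch below counts the four decision outcomes for several candidate passing scores. The sample data are hypothetical, and choosing the cut with the most correct decisions (true positives plus true negatives) is an assumed simplification; in practice other evidence also informs the final passing score.

```python
def classification_counts(test_scores, job_success, cut):
    """Count decision outcomes for one candidate passing score.

    job_success[i] is True when person i met the job performance criterion.
    """
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for score, success in zip(test_scores, job_success):
        passed = score >= cut
        if passed and success:
            counts["TP"] += 1      # passed the test, performs acceptably
        elif passed and not success:
            counts["FP"] += 1      # passed the test, performs unacceptably
        elif not passed and not success:
            counts["TN"] += 1      # failed the test, performs unacceptably
        else:
            counts["FN"] += 1      # failed the test, performs acceptably
    return counts


def best_cut(test_scores, job_success, candidate_cuts):
    """Return the candidate cut with the most correct decisions (TP + TN)."""
    def correct(cut):
        c = classification_counts(test_scores, job_success, cut)
        return c["TP"] + c["TN"]
    return max(candidate_cuts, key=correct)


# Hypothetical validation sample (illustration only).
scores = [20, 25, 30, 35, 40, 45, 50, 55]
success = [False, False, False, True, False, True, True, True]

chosen = best_cut(scores, success, [25, 35, 50])
table_at_chosen = classification_counts(scores, success, chosen)
```

With these made-up data a cut of 35 yields four true positives, three true negatives, one false positive, and no false negatives, which is more correct decisions than either of the other candidate cuts.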

Additional information such as ergonomic and physiological data should be used to
generate physical test passing scores. The weights of objects that employees need to lift
and carry can be used to help establish passing scores. For example, if the physiological
data demonstrated that a specific level of aerobic capacity is required to perform
firefighting tasks, that level should be used as the passing score (Gebhardt, Baker, &
Thune, 2006; Gledhill & Jamnik, 1992b; Sothmann et al., 2004). Similar data were used
when the aerobic demands were calculated for a security officer response (e.g., 90+
stairs, 400 yards) and the VO2submax for the response was assessed using a treadmill
(Baker, Gebhardt, & Curry, 2004).

Biomechanical and ergonomic data have also been used to set passing scores. For
longshore, oil refinery, and paramedic jobs, the forces to complete tasks (e.g., tighten/
loosen valves/turnbuckles, lift patient loaded gurney) were measured. These forces were
then used to establish passing scores for physical performance tests (Gebhardt & Baker,
2010; Jackson et al., 1992). Similarly, pacing data have been used to help identify
passing scores for time-sensitive jobs (e.g., firefighter, assembly line jobs) that involve a
single task or a series of tasks, but where performing the tasks too quickly may compromise
safety and successful job performance. Sothmann et al. (2004) used videotapes of
firefighters completing a job simulation at different paces and had incumbent
firefighters determine whether each pace was acceptable or unacceptable. The slowest
acceptable pace and other validation data were used to identify the passing score.

Norm-referenced passing scores are set using a distribution of test scores with
corresponding percentile ranks from a known population or sample with a percentile
being selected as the passing score. These scores are used primarily for law enforcement
jobs and are usually set at the 40th percentile for selection and 50th percentile for
academy graduation. The norm-referenced passing scores are sometimes set separately
for sex and/or age. No tests with sex- and/or age-normed passing scores were found in
the published research for jobs outside of law enforcement. In the 1980s and 1990s the
Employment Litigation Section (ELS) of the Department of Justice recommended using
sex-normed passing scores (Ugelow, 2005). The rationale was that sex-normed and/or
age-normed passing scores were acceptable because new hires would receive additional
physical training at the academy and both normed groups have the same level of fitness
in terms of percentile.
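
The mechanics of a percentile-based passing score can be shown with a short Python sketch. The normative sample below is hypothetical, and the nearest-rank definition of a percentile is an assumption; published norm tables may use a different interpolation method.

```python
import math


def percentile_score(sample, pct):
    """Nearest-rank percentile: the smallest score in the sample that has
    at least pct percent of the sample at or below it."""
    ordered = sorted(sample)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical sit-up counts from a normative sample (illustration only).
norm_sample = [22, 25, 28, 30, 32, 34, 36, 38, 41, 45]

selection_cut = percentile_score(norm_sample, 40)   # 40th percentile
graduation_cut = percentile_score(norm_sample, 50)  # 50th percentile
```

Because the passing score is read directly from the score distribution, it changes whenever the normative sample changes, regardless of what the job requires.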

The passing scores are set at the same percentile (e.g., 40th percentile) using separate
normed test data for each group (e.g., sex, age). If the passing scores are set at the 40th
percentile, the passing scores for sit-ups would be 38 for men 20–29 years old and 32 for
women in the same age group. However, use of multiple passing scores for a protected
group violates the Civil Rights Act of 1991, which states that passing scores cannot vary
based on sex, race and national origin (RNO), or age. Proponents of multiple passing
scores argue that the Civil Rights Act of 1991 does not apply to physical tests or that they
are measuring “fitness.” However, if the test is being used to make any employment
decision (e.g., selection, retention, promotion), compliance with Civil Rights Act of 1991
and other federal and state statutes is required. Fitness tests assess an individual's
fitness level. Although it appears that sex- and/or age-normed passing scores violate the
Civil Rights Act of 1991, legal cases have upheld and denied the use of these passing
scores for selection and retention (Alspaugh v. Michigan Law Enforcement Officers
Training Council, 2001; Badgley and Whitney v. Walton, Sleeper, Commissioners of Public
Safety, and Vermont Department of Public Safety, 2010; Peanick v. Reno, 1995). In the
Alspaugh case, norm-referenced passing scores were upheld because the tests were
assessing fitness and not job requirements. However, in either case little or no mention
was made of how the percentile passing scores and the different raw scores were related
to job performance.

A drawback to norm-referenced passing scores is that the passing score does not
correspond to minimally acceptable job performance. Furthermore, this issue is
compounded when sex- and age-normed passing scores are used. To comply with EEOC
Guidelines (1978), it must be demonstrated that each passing score (e.g., women 20–29,
men 20–29) reflects minimally acceptable job performance. In the Lanning v.
Southeastern Pennsylvania Transit Authority (SEPTA) (1999) case, the plaintiffs
recommended that the 1.5-mile run have sex-normed passing scores. The court stated
that this could be a viable approach, but that the job relatedness of each passing score
and percentile rank would need to be established. Because each passing score cannot be
related to minimal job performance, the court in the SEPTA case did not accept the use of
sex-normed passing scores.

Steps to Ensure Effective Physical Test Implementation

To achieve accurate test results, tests must be administered in a precise, safe, and
systematic manner. During the initial test development phase, consideration must be
given to the location for test administration to ensure that the final test battery fits in the
available square footage. The steps to ensuring accurate test results include (1)
sequencing the tests to reduce fatigue, (2) providing a safe environment, and (3)
administering the tests correctly.

Once the test battery is finalized, the sequence in which the tests are performed must be
generated to ensure maximum applicant performance. This requires that rest periods be
inserted between tests. From a physiological standpoint, tests that
require use of the aerobic energy system (e.g., 1.5-mile run) should be administered at
the end of the test battery due to the need for a greater recovery time (McArdle et al.,
2007). These types of tests require a longer rest period to recover prior to the next test.
Tests with a duration of 1–3 minutes use the short-term energy system (lactic acid
system) and require less recovery time. Examples of these include a 440-meter run, arm
endurance, and a pursuit and restrain job simulation. Tests (e.g., one repetition maximum
bench press, arm lift) involving the immediate (stored) energy system [adenosine
triphosphate (ATP) and phosphocreatine (PCr)] require a duration of 3–5 minutes to
recover. Most physical test batteries consist of three to five tests. A test order example
based on the physiological energy systems is (1) arm lift, (2) 300-meter run, (3) sit-ups,
and (4) 1.5-mile run. In addition to the physiological systems, the muscle groups being
measured must be considered. Therefore, it is not desirable to have consecutive tests that
assess the upper body musculature (e.g., push-ups, arm lift).

Prior to conducting physical testing the safety of the participants must be considered.
First, the test order, as described above, should be appropriate. Second, the temperature
in the test area must be at a level not to cause injuries. When the temperatures are high
(e.g., 90°F), the wet-bulb globe temperature index (WBGT) should be used to determine
whether it is safe to test. The WBGT takes into account the effects of temperature,
humidity, and radiant energy and is an indicator of potential for heat stress (e.g., heat
exhaustion, heat stroke). A WBGT index reading of less than 82°F means little threat of
heat stress, whereas readings between 82°F and 89.9°F indicate an increasing danger
level (Department of the Army, 1980). Physical tests should not be performed if the WBGT
is ≥85°F. Similarly, temperature is not a reliable indicator of how cold an individual feels.
When testing in low temperatures (e.g., 25°F), the temperature and wind speed (wind
chill factor) should be taken into account to reduce the incidence of hypothermia and
frostbite. Third, the surfaces of the test area must be clean and if performing any type of
running event, should be inspected for traction. In addition, there should be no smoking
in the test area and water should be provided to the participants.
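
The WBGT thresholds quoted above can be written as a simple lookup. This Python sketch is only a restatement of those figures; the function name and the wording of the returned labels are illustrative assumptions.

```python
def wbgt_status(wbgt_f):
    """Classify a WBGT reading in degrees Fahrenheit using the thresholds
    quoted in the text: below 82 is little threat, 82 and above is an
    increasing danger level, and testing should stop at 85 or above."""
    if wbgt_f >= 85:
        return "do not test"
    if wbgt_f >= 82:
        return "increasing danger"
    return "little threat"


readings = [78.0, 83.5, 85.0]
statuses = [wbgt_status(r) for r in readings]
```

A test administrator could log the status alongside each session's WBGT reading to document that the safety check was made.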

The next step in test administration is training test administrators to be consistent and
accurate in performing their duties. Past litigation has shown that what occurs during
test administration may lead to disparate treatment of applicants (e.g., Belk v.
Southwestern Bell Telephone Company, 1999). Thus, a detailed test manual that includes
all test procedures must be generated. Reading a test manual prior to giving the tests is
not adequate. Test administrators must be trained in the test set-up, calibration of
equipment, administration procedures, and scoring. Furthermore, to ensure a consistent
test setting, test administrators must read test instructions to applicants and not provide
any encouragement or feedback. Organizations must establish protocols for providing
information to applicants (e.g., DVD, letter, internet site). Inadequate attention to these
factors can change the test and threaten the test validity. Employers must determine
whether internal or external (third party provider) parties will administer the test.
Typically, this is determined by the costs related to testing, complexity of the tests,
timeline for making job offers, and qualifications and availability of in-house staff. A final
consideration is whether to allow applicants to retest. If retests are permissible, a retest
policy that addresses organizational needs and the potential for physical ability
improvement should be generated. Past research demonstrated that men and women can
increase their physical capabilities (e.g., aerobic capacity) significantly in a 2-month to 3-
month period, as well as decrease their capabilities due to inactivity, injury, or aging
(Knapik et al., 2006; Kraemer et al., 2001; McArdle et al., 2007; Nindl et al., 2007).

Finally, placement of the physical tests in the selection process requires consideration of
costs to the organization. The most common and cost-effective placement is prior to the
conditional offer of a job. However, if any tests require measurement of perceived medical
data (e.g., heart rate) or the test battery is administered by medical personnel (e.g.,
physical therapist, nurse), they must be given after the job offer (ADA, 1990). Issues related to
liability for injury should be addressed. There are published guidelines for screening
individuals prior to exercise that indicate that measurement of heart rate and blood
pressure should be taken to assess the risk for testing (Thompson, Gordon, & Pescatello,
2010). Because taking these measures before a job offer violates the ADA, employers mitigate their risk by using
waiver forms or a medical clearance indicating the individual is not at risk to participate
in the testing. The cost to administer the physical test increases when the test is given by
medical personnel after a conditional job offer.

Litigation Related to Physical Testing


As described earlier, physiological differences between men and women have resulted in
men having significantly higher test scores than women on many physical performance
tests. These differences are especially large for tests that assess muscular strength,
muscular endurance, or anaerobic power. The test score differences typically result in
passing rates that do not meet the four-fifths rule with the women's passing rate being
less than 80% of the men's passing rate at almost all levels of selection ratio. The Uniform
Guidelines (EEOC, 1978) indicate that adverse impact is permissible if the validity, test
fairness, and business necessity have been established. Although adverse impact against
women is common in physical assessment, past research has found physical tests to be equally
predictive and fair for both men and women (Blakely et al., 1994; Baker, Gebhardt,
Billerbeck, & Volpe, 2006; Hogan, 1991a).
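
The four-fifths rule comparison described above reduces to a ratio of passing rates. The Python sketch below shows the arithmetic; the applicant counts are hypothetical and the function names are illustrative.

```python
def adverse_impact_ratio(focal_pass, focal_total, ref_pass, ref_total):
    """Ratio of the focal group's passing rate to the reference group's."""
    focal_rate = focal_pass / focal_total
    ref_rate = ref_pass / ref_total
    return focal_rate / ref_rate


def meets_four_fifths(ratio):
    """The four-fifths rule is met when the ratio is at least 0.80."""
    return ratio >= 0.8


# Hypothetical counts: 30 of 100 women pass, 75 of 100 men pass.
ratio_a = adverse_impact_ratio(30, 100, 75, 100)   # about 0.40
# Hypothetical counts: 70 of 100 women pass, 80 of 100 men pass.
ratio_b = adverse_impact_ratio(70, 100, 80, 100)   # about 0.875
```

In the first scenario the women's passing rate is about 40% of the men's rate, so the rule is not met; in the second it is about 87.5%, so the rule is met.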

As women applied for nontraditional jobs in the 1970s and 1980s, the courts began to
review the methods used to assess the job relatedness of physical tests. Whether a test
was upheld depended upon how the tests were developed, validated, and administered (e.g.,
Porch v. Union Pacific Railroad, 1997; Varden v. Alabaster, AL et al., 2005). Previous
articles drew similar conclusions in regard to physical performance test litigation (Hogan
& Quigley, 1986; Terpstra, Mohamed, & Kethley, 1999).

Hogan and Quigley (1986) conducted a review of 44 public safety Title VII physical
testing cases and found the majority (70%) ruled in favor of the plaintiff. In many cases,
the plaintiffs prevailed due to a lack of or the quality of a job analysis or validation study.
For tests supported by the courts, no specific validation strategy was identified as being
more effective than another strategy. Use of content validity was supported for work
sample tests (Hardy v. Stumpf, 1978), but not for basic ability tests (e.g., Berkman v. City
of New York, 1982; Harless v. Duck, 1980). Furthermore, many tests that used a criterion-
related validation strategy were not upheld due to the quality of the study. The lessons
learned from these court decisions were that physical tests must be validated using
thorough, high-quality job analysis and validation procedures.

The courts continued to examine the quality of the job analysis and validation procedures
and, in some cases, found them lacking (e.g., United States v. City of Erie, 2005; Varden v.
Alabaster, AL, et al., 2005). For example, a firefighter evolution used for selection
included a ladder lift task. However, the job analysis did not support the use of this ladder
lift task and the test did not withstand legal scrutiny (Legault v. Russo, 1994). Since
Hogan and Quigley's 1986 review, other issues have come to the forefront. These include
(1) passage of the Americans with Disabilities Act of 1990, (2) challenges to mandatory
retirement age, (3) use of physical performance tests for incumbent assessment, (4)
evidence of the job relatedness of the passing score, (5) the appropriateness of the
criterion measure, and (6) administration and application of physical tests.

Americans with Disabilities Act of 1990 (ADA)

Title I of the ADA (1990) was intended to prevent employment discrimination in the
private sector based on physical or mental disability. In the federal sector, the
Rehabilitation Act of 1973 protects individuals with disabilities. Title I of ADA directly
applies to physical performance testing. The ADA states that physical tests may be given
prior to a conditional offer of employment, but if medical measures (e.g., heart rate) are
assessed the test must be given after the job offer. In addition, testing procedures must
be job related and not intended to identify and screen out individuals with disabilities.
Finally, reasonable accommodations for tests can be requested by individuals and these
accommodations must be considered and provided if they do not pose an undue hardship
on an organization. [42 U.S.C. § 12112 (b) (5–6) (d) (2–4)].

The inability to gather medical information prior to a conditional job offer affects physical
performance testing in two ways. First, tests that evaluate aerobic capacity are impacted.
A common method to handle this problem is by using the 1.5-mile run or a test that was
shown to provide the requisite level of aerobic capacity (e.g., step test at specific cadence
for 5 minutes) without collecting medical data. Second, when administering physical tests
prior to a job offer, no measure of health status (e.g., blood pressure) may be taken to
ensure the safe participation of the applicant. To ascertain whether the candidates can
safely participate in physical testing, employers have required a medical certification by a
physician stating the individual can safely complete the tests or a signed waiver form
from the candidate. However, a waiver does not absolve the organization from being
responsible for the safety of candidates. Candidates injured during testing due to
negligence can recover damages even when a waiver was signed (White v. Village
of Homewood, 1993).

Although most ADA litigation addresses medical issues (e.g., bipolar disorder) (Rothstein,
Carver, Schroeder, & Shoben, 1999), recent litigation related to physical testing after a
medical leave found the test violated ADA (Indergard v. Georgia-Pacific, 2009). The
plaintiff who had worked at a mill for over 15 years took medical leave for knee surgery
and was required to complete and pass a physical capacity evaluation (PCE) that included
several medical and physical assessments (e.g., 65 and 75 lb. lift and carry, range of
motion, heart rate, drug and alcohol use). The plaintiff failed the PCE and claimed that it was
a medical evaluation and that the plaintiff had been treated or perceived as disabled. The district court
ruled in favor of Georgia-Pacific. However, the court of appeals agreed that the PCE was a
medical evaluation and remanded the decision. The ADA states that an employer cannot
require a current employee to undergo a medical examination unless it is proven to be job
related [42 U.S.C. §12112(d)(4)(A)]. In making this decision, the court of appeals relied on
the EEOC ADA Enforcement Guidance's (1995) factors for determining if an assessment
is physical or medical. These factors included the following:

1. Is the test administered and/or interpreted by a health care professional?
2. Is the test designed to reveal a physical or medical impairment?
3. Is the test invasive?
4. Does the test measure an employee's task performance or physiological response
to the task?
5. Is the test administered in a medical setting, or does it use medical equipment?

The court of appeals ruled that the PCE met factors 1, 2, and 4 and deemed the PCE to be
a medical examination. Another point of concern the court had for the PCE was that the
medical assessments conducted by the PCE were not linked to job demands. Thus, the
court ruled in favor of the plaintiff and found the PCE to be a medical examination and
not job related. This case was clearly decided on the many medical assessments in the
PCE (e.g., medical history, evaluation of body mechanics) and not the merit of the
physical test.

The issue of reasonable accommodation for a physical performance test was of concern in
the Belk v. Southwestern Bell Telephone Company (1999) case. In this case, the plaintiff
who suffered from the residual effects of polio and wore leg braces wanted to transfer to
a physically demanding job. The plaintiff was granted a test accommodation, but failed
the test battery and filed under ADA (1990) because other test accommodation requests
(e.g., drag ladder on ground) were rejected. The initial court decision ruled that the test
was valid. However, the same court ruled that Belk met the standard for disabled under
the ADA, and ruled in favor of the plaintiff. The decision was appealed and on appeal, the
decision was vacated due to improper jury instruction related to business necessity under
the ADA. The Belk case emphasizes the importance of training administrators on how to
implement test accommodations.

There are other legal cases in which incumbents who failed physical performance tests
filed suit under the ADA and/or Rehabilitation Act of 1973 claiming that their test status
was due to being disabled or being perceived as disabled (Andrews v. State of Ohio, 1997;
Smith v. Des Moines, 1996). In both of these cases, the court ruled in favor of the
defendant dismissing the incumbent claims of disability.

Challenges to Mandatory Retirement and Physical Performance Testing

The use of mandatory retirement ages and related physical assessments resulted in
litigation for two law enforcement agencies. During the 1980s several law enforcement
agencies in the Commonwealth of Massachusetts were consolidated and reorganized
under the Massachusetts Department of State Police. This merger resulted in all
personnel being given the job title of trooper. To accommodate all agencies the
mandatory retirement age was set at 55. This age was an increase from 50 for the state
police and a decrease from 60 for other agencies. In 1992, officers from the former
agencies sued the Commonwealth under the Age Discrimination in Employment Act of
1967 (ADEA, 1967) to nullify the new mandatory retirement age of 55 (Gately v.
Massachusetts, 1992). Later, the court ruled that troopers over the mandatory retirement
age could continue to work for the Massachusetts State Police. Their continued
employment was contingent upon demonstrating the ability to perform the physical
aspects of the job by completing and passing an annual physical performance test (Gately
v. Massachusetts, 1996). In that same year, the Massachusetts State Legislature passed a
law mandating annual physical testing for incumbent troopers regardless of age.
Subsequently, a physical performance test was developed and validated for the
position (Gebhardt & Baker, 2006). This test was implemented for the assessment of
incumbent troopers’ physical capabilities and job status. Troopers passing the test can
continue their employment with the State Police. Troopers who do not pass the test are
dismissed from the State Police.

A more recent case related to a mandatory retirement age of 55 resulted in a decision for
the defendant (Badgley & Whitney v. Walton, Commissioner of Public Safety, 2008). All
troopers in the Vermont State Police had to complete an annual physical assessment (e.g.,
sit-ups, push-ups, 1.5-mile run). Passing the annual physical assessment was based on
achieving the sex- and age-normed passing scores and was intended to determine the
troopers’ fitness levels. The plaintiffs indicated that they had passed their annual physical
test each year and had favorable job evaluations. Despite this evidence, the court ruled in
favor of the Vermont State Police and the mandatory retirement age of 55. The court
indicated that although the plaintiffs passed the physical assessment, the sex- and age-
normed passing scores were not linked to minimum physical job requirements. The court
also used testimony related to age decrements in physical, cognitive, and psychomotor
(reaction time) factors in its decision. On appeal to the Vermont Supreme Court, the
lower court ruling was upheld in favor of the defendant (Badgley and Whitney v. Walton,
Sleeper, Commissioners of Public Safety, and Vermont Department of Public Safety, 2010).

Use of Physical Performance Tests for Incumbent Assessment

Some public safety agencies have implemented incumbent physical assessments in which
the consequences of failure range from loss of the job, or ineligibility for promotion,
transfer, or special assignments, to not receiving additional monetary bonuses or
vacation days. For example, the Nuclear Regulatory Commission, in the Code of Federal
Regulations (10 CFR 73.55), requires all nuclear security officers to complete and pass an
annual physical test. Failure to pass the test results in suspension from the job or
assignment to another job with no weapons requirements. Court decisions in this area have indicated
that employers can implement these tests and use the results to make employment
decisions. However, because employment decisions are being made based on the test
results, the tests must comply with the same legal standards as candidate selection tests
(Andrews v. State of Ohio, 1997; Pentagon Force Protection Agency v. Fraternal Order of
Police DPS Labor Committee, 2004; UWUA Local 223 & The Detroit Edison Co, 1991).

Evidence of the Job Relatedness of Passing Scores

The legal case pertaining to physical performance testing that has received the most
attention in recent years (Sharf, 1999, 2003) is Lanning v. Southeastern Pennsylvania
Transportation Authority (SEPTA) (1999, 2002). This case examined the use of a 1.5-mile
run test to select transit police officers and the corresponding passing score. SEPTA
established a passing score for the 1.5-mile run at 12 minutes, which was equivalent to an
aerobic capacity level of 42.5 ml·kg⁻¹·min⁻¹. The impact of this passing score was a 55.6%
passing rate for men and a 6.7% passing rate for women. Because these differential
passing rates resulted in adverse impact, the 1.5-mile run was challenged on the basis of
sex discrimination. In the 1999 case, the District Court decided that SEPTA had
established the job relatedness of the test and passing score and ruled in favor of SEPTA.
This decision was appealed, and the case was remanded to the District Court
because that court had applied a more lenient standard for determining business necessity.
The District Court had used the business necessity standard from Wards Cove Packing
Company, Inc. v. Atonio (1989) rather than the standard from Griggs v. Duke Power Company (1971).
In the second trial in 2002, the District Court again ruled in favor of SEPTA. The
Lanning case demonstrated that a stricter burden is now applied to prove the business
necessity of a test and its passing score. In addition, the ruling indicated that minimally
acceptable levels of performance did not have to reflect the fitness levels of current
employees.
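
The size of the difference between these two passing rates can be illustrated with the four-fifths rule of thumb in the Uniform Guidelines (1978), under which a selection rate for one group that is less than 80% of the rate for the group with the highest rate is generally regarded as evidence of adverse impact. The following is a minimal sketch rather than anything drawn from the court record: the function and variable names are assumptions introduced for illustration, and only the two passing rates come from the case as reported above.

```python
# Sketch of the four-fifths (80%) rule of thumb from the Uniform Guidelines (1978).
# Only the two passing rates come from the text; all names are illustrative.

FOUR_FIFTHS_THRESHOLD = 0.80


def impact_ratio(focal_rate: float, reference_rate: float) -> float:
    """Return the focal group's passing rate divided by the reference group's."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return focal_rate / reference_rate


def shows_adverse_impact(focal_rate: float, reference_rate: float) -> bool:
    """Return True when the impact ratio falls below the four-fifths threshold."""
    return impact_ratio(focal_rate, reference_rate) < FOUR_FIFTHS_THRESHOLD


men_pass_rate = 0.556    # 55.6% of men passed the 1.5-mile run
women_pass_rate = 0.067  # 6.7% of women passed the 1.5-mile run

ratio = impact_ratio(women_pass_rate, men_pass_rate)
print(f"Impact ratio (women/men): {ratio:.2f}")
print(f"Below four-fifths threshold: {shows_adverse_impact(women_pass_rate, men_pass_rate)}")
```

With these rates the ratio is roughly 0.12, far below the 0.80 threshold, which is consistent with the test being challenged on the basis of adverse impact.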

In United States v. City of Erie (2005), the police department implemented a physical
selection test that required candidates to complete an obstacle course and a specific
number of push-ups and sit-ups in a specified time. The test was developed without
reference to any job analysis data. Erie collected separate scores from incumbents for the
three test components (obstacle course, sit-ups, push-ups) and then summed the means
for the three components to generate a single passing score. The single passing score
was used as the candidate passing score. Thus, the candidates were not scored in the
same manner as used in the passing score study. Erie defended the test by
stating that the obstacle course portion of the test was content valid and that the push-up and
sit-up portions possessed construct and/or criterion-related validity. Erie's expert suggested
that the metabolic demands associated with the test were similar to the metabolic
demands of police work, but could not provide evidence confirming this notion. The court
rejected the opinions of the City's expert and ruled in favor of the plaintiff.
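
The passing score procedure described above amounts to simple arithmetic, sketched below. The example is illustrative only: the incumbent values are hypothetical, the names are invented for this sketch, and it assumes that each component was scored as a completion time in seconds, which the description above does not state. It shows how a cutoff built by summing the means of separately measured components ends up being applied to a single overall candidate score.

```python
# Illustration only: the incumbent times are hypothetical, and scoring every
# component as a completion time in seconds is an assumption of this sketch.
from statistics import mean
from typing import Mapping, Sequence

incumbent_times: Mapping[str, Sequence[float]] = {
    "obstacle_course": [52.0, 58.0, 61.0, 49.0],
    "push_ups": [18.0, 22.0, 20.0, 16.0],
    "sit_ups": [14.0, 15.0, 12.0, 11.0],
}


def summed_mean_cutoff(times_by_component: Mapping[str, Sequence[float]]) -> float:
    """Sum the incumbents' mean time on each separately measured component."""
    return sum(mean(times) for times in times_by_component.values())


def passes_total_only(candidate_total: float, cutoff: float) -> bool:
    """Score a candidate on one overall time against the summed cutoff."""
    return candidate_total <= cutoff


cutoff = summed_mean_cutoff(incumbent_times)
print(f"Summed-mean cutoff: {cutoff:.1f} seconds")
print(f"Candidate with an overall time of 85.0 s passes: {passes_total_only(85.0, cutoff)}")
```

Because the incumbents were measured component by component while the candidate is judged on one overall score, the two groups are not scored in the same manner, which is the mismatch noted in the case.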

The Erie case showed, as did other cases, that job analysis is important to test
development and validation. Furthermore, the court provided a listing of other criteria for
test development and validation. First, it stated that the principles and standards used in
I/O psychology are relevant to determining whether physical performance tests are job
related and meet business necessity. Second, the test must be administered to applicants
in the same manner it was validated. Third, consistent with the Uniform Guidelines
(1978), validity and job relatedness of the test, its components, and the passing score
must be determined and documented using professionally accepted methods.

Appropriateness of the Criterion Measure

Most physical testing litigation is related to the test, but a recent case addressed the
criterion measure used to validate the test, stating that it must be relevant and not confounded
by other measures (EEOC v. Dial Corp, 2006). The test involved lifting 35-lb bars and
placing them on 30- and 60-inch-high racks. The test was challenged on the basis of
disparate impact on women. The EEOC stated that the test overrepresented the physical
demands of the job and that subjective evaluations by test administrators resulted in
some women failing the test. Dial defended the test by claiming that it had reduced
on-the-job injuries. When this claim was investigated, it was found that the decrease
in the number of injuries began 2 years before the test was implemented. In addition,
Dial was implementing other programs (e.g., job rotation, job redesign) at the same time as the physical
performance test. Thus, the contribution of the physical performance test to the reduction
of injuries was confounded by other safety programs and the criterion measure of injury
reduction was not an acceptable measure. Because Dial could not adequately
demonstrate that the test was related to business necessity, the court ruled in favor of the
plaintiff.

Administration, Scoring, and Application of Physical Performance Tests

Although the development and validation of physical performance tests have become
more sophisticated over time, some challenges to tests are based on factors other than
job analysis, test validation, or passing scores. In the EEOC v. Dial Corp (2006) case, one
weakness of the test identified by the court was the use of subjective scoring of applicant
test performance. In more recent litigation, the court found that a physical test had been
administered inconsistently (Merritt v. Old Dominion Freight Line, 2010). In this case the
plaintiff was a woman employed as a line haul driver by Old Dominion. The plaintiff
wanted to transfer to a more physically demanding pickup and delivery driver job that
involved lifting, carrying, and moving freight. On two occasions the plaintiff was passed
over for the transfer, but she was finally transferred to the pickup and delivery driver job
without completing a physical performance test. Six months later, the plaintiff sustained
an injury and was placed on medical leave. After being medically cleared to return to
work, she was required to complete a physical ability test that reflected tasks performed
on the job. The plaintiff failed the physical test and was terminated. She challenged the
test on the basis of its administration. The court determined that the test was not
administered to all pickup and delivery driver candidates or consistently required for return
to work after an injury. Based on the inconsistent administration of the test, the appeals court ruled in
favor of the plaintiff.

Physical Test Preparation and Reduction of Adverse Impact
It is well established that women, on average, do not perform as well as men on physical tests. Similar
to test preparation guides for cognitive tests, physical test preparation programs are used
to prepare applicants for employment tests. These preparation programs take two
formats. The first is a pamphlet, DVD, or internet site that informs applicants about the
tests and provides exercises or a suggested fitness program that if performed will
increase the individual's fitness level. The second type is a prejob fitness program in
which the hiring organization provides fitness training targeted at the physical tasks in
the prospective job or the tests in the battery. Past research demonstrated not only
that an individual's likelihood of passing the test increases, but also that the passing rate is
higher for individuals who participate in the programs than for those who do not participate
or who drop out of the programs (Baker & Gebhardt, 2005b; Gebhardt & Crump,
1990; Hogan & Quigley, 1994; Knapik, Hauret, Lange, & Jovag, 2003). These programs
have reduced the number of failures on physical tests, especially for women.

There are other methods to reduce adverse impact in hiring and maintain test utility: (1)
use targeted recruitment of individuals who possess the abilities required on the job (e.g.,
at health clubs), (2) provide a realistic preview of the job demands, (3) when using job
simulations, allow applicants to practice the test skills, and (4) use a scoring approach
that minimizes adverse impact. A combination of these approaches and a preemployment
physical training program will result in a reduction in adverse impact.

Summary
Physical tests have been shown to be valid predictors of performance in jobs with moderate to high
physical demand and to be legally defensible when supported by detailed job analysis and
validity studies. The demands of arduous jobs can be identified through observation,
incumbent appraisal, and direct measurement. The combination of these methods allows
the researcher to design tests that are predictive of job performance. Furthermore, well-
designed and validated test batteries have reduced on-the-job injuries and days lost from
work for jobs requiring higher levels of muscular strength, muscular endurance, and/or
anaerobic power. Although there are test score differences across subgroups (e.g., sex,
race and national origin, age), physical tests have demonstrated test fairness. To reduce
test score differences, employers can (1) conduct a comprehensive job analysis to identify
the job demands, (2) design tests that minimize adverse impact and assess the physical
demands of the job, (3) empirically demonstrate the job relatedness of the tests, and (4)
develop candidate preparation guides and programs.

References
Akima, H., Kano, Y., Enomoto, M., Ishizu, M., Okada, Y., & Oishi, S. (2001). Muscle
function in 164 men and women aged 20–84 yr. Medicine and Science in Sports and
Exercise, 33, 220–226.

American Educational Research Association, American Psychological Association, &
National Council on Measurement in Education. (1999). Standards for educational and
psychological testing. Washington, DC: American Educational Research Association.

Anderson, C. K. (2003). Physical ability testing for employment decision purposes. In W.
Karwowski & W. S. Marras (Eds.), Occupational ergonomics: Engineering and
administrative controls (pp. 1–8). New York: Routledge.

Arvey, R. D., Landon, T. E., Nutting, S. M., & Maxwell, S. E. (1992). Development of
physical ability tests for police officers: A construct validation approach. Journal of
Applied Psychology, 77, 996–1009.

Astrand, P., Rodahl, K., Dahl, H. A., & Stromme, S. G. (2003). Textbook of work physiology
(4th ed.). Champaign, IL: Human Kinetics.

Ayoub, M. M., & Mital, A. (1989). Manual material handling. London: Taylor & Francis.

Baker, T. A. (2007). Physical performance test results across ethnic groups: Does the type
of test have an impact? Bowling Green, OH: Society for Industrial and Organizational
Psychology.

Baker, T. A., & Gebhardt, D. L. (1994). Cost effectiveness of the trackman physical
performance test and injury reduction. Hyattsville, MD: Human Performance Systems,
Inc.

Baker, T. A., & Gebhardt, D. L. (2005a). Development and validation of selection
assessments for Energy Northwest nuclear security officers. Beltsville, MD: Human
Performance Systems, Inc.

Baker, T. A., & Gebhardt, D. L. (2005b). Examination of revised passing scores for state
police physical performance selection tests. Beltsville, MD: Human Performance Systems,
Inc.

Baker, T. A., Gebhardt, D. L., Billerbeck, K. T., & Volpe, E. K. (2008). Development and
validation of physical performance tests for Virginia Beach Police Department: Job
analysis. Beltsville, MD: Human Performance Systems, Inc.

Baker, T. A., Gebhardt, D. L., Billerbeck, K. T., & Volpe, E. K. (2009). Validation of physical
tests for all police ranks. Beltsville, MD: Human Performance Systems, Inc.

Baker, T. A., Gebhardt, D. L., & Curry, J. E. (2004). Development and validation of physical
performance tests for selection and assessment of Southern California Edison nuclear
armed security officers. Beltsville, MD: Human Performance Systems, Inc.

Baker, T. A., Gebhardt, D. L., & Koeneke, K. (2001). Injury and physical performance tests
score analysis of Yellow Freight System dockworker, driver, hostler, and mechanic
positions. Beltsville, MD: Human Performance Systems, Inc.

Bartlett, C. J., Bobko, P., Mosier, S. B., & Hannan, R. (1978). Testing for fairness with a
moderated multiple regression strategy: An alternative to differential analysis. Personnel
Psychology, 31, 233–241.

Baumgartner, T. A., & Jackson, A. S. (1999). Measurement for evaluation in physical
education and exercise science (6th ed.). Dubuque, IA: William C. Brown.

Baumgartner, T. A., & Zuideman, M. A. (1972). Factor analysis of physical tests. Research
Quarterly, 43, 443–450.

Bilzon, J. L., Scarpello, E. G., Smith, C. V., Ravenhill, N. A., & Rayson, M. P. (2001).
Characterization of the metabolic demands of simulated shipboard Royal Navy fire-
fighting tasks. Ergonomics, 44, 766–780.

Blakely, B. R., Quinones, M. A., Crawford, M. S., & Jago, I. A. (1994). The validity of
isometric strength tests. Personnel Psychology, 47, 247–274.

Bortz, W. M. I., & Bortz, W. M. I. (1996). How fast do we age? Exercise performance over
time as a biomarker. Journals of Gerontology Series A: Biological Sciences & Medical
Sciences, 51, 223–225.

Buskirk, E. R. (1992). From Harvard to Minnesota: Keys to our history. In J. O. Holloszy
(Ed.), Exercise and sport sciences reviews (pp. 1–26). Baltimore: Williams & Wilkins.

Cascio, W. F., Outtz, J. L., Zedeck, S., & Goldstein, I. L. (1991). Statistical implications of
six methods of test score use in personnel selection. Human Performance, 4, 233–264.

Chaffin, D. B., Herrin, G. D., Keyserling, W. M., & Foulke, J. A. (1977). Pre-employment
strength testing in selecting workers for materials handling jobs. Cincinnati,
OH: National Institute for Occupational Safety and Health, Physiology, and Ergonomics
Branch, Report CDC-99-74-62.

Courtville, J., Vezina, J., & Messing, K. (1991). Comparison of the work activity of two
mechanics: A woman and a man. International Journal of Industrial Ergonomics, 7, 163–
174.

Craig, B. N., Congleton, J. J., Kerk, C. J., Amendola, A. A., & Gaines, W. G. (2006). Personal
and non-occupational risk factors and occupational injury/illness. American Journal of
Industrial Medicine, 49, 249–260.

Department of the Army. (1980). Prevention, treatment and control of heat injury.
Washington, DC: Department of the Army.

Dorman, L. E., & Havenith, G. (2009). The effects of protective clothing on energy
consumption during different activities. European Journal of Applied Physiology, 105,
463–470.

Fleishman, E. A. (1964). Structure and measurement of physical fitness. Englewood, NJ:
Prentice Hall.

Fleishman, E. A. (1995). Rating scale booklet: Fleishman job analysis survey. Bethesda,
MD: Management Research Institute, Inc.

Fleishman, E. A., Gebhardt, D. L., & Hogan, J. C. (1986). The perception of physical effort
in job tasks. In G. Borg & D. Ottoson (Eds.), The perception of exertion in physical work.
Stockholm, Sweden: Macmillan Press Ltd.

Gebhardt, D. L. (1984a). Center of mass displacement for linemen in the electric industry.
In D. Winter, R. Norman, R. Wells, K. Hayes, & A. Patia (Eds.), Biomechanics, IX-A. (pp.
66–71). Champaign, IL: Human Kinetics.

Gebhardt, D. L. (1984b). Revision of physical ability scales. Bethesda, MD: Advanced
Research Resources Organization.

Gebhardt, D. L. (2000). Establishing performance standards. In S. Constable & B. Palmer
(Eds.), The process of physical fitness standards development (pp. 179–200). Wright-
Patterson AFB, OH: Human Systems Information Analysis Center (HSIAC-SOAR).

Gebhardt, D. L. (2007). Physical performance testing: What is the true impact? Bowling
Green, OH: Society for Industrial and Organizational Psychology.

Gebhardt, D. L., & Baker, T. A. (1992). Development and validation of physical
performance tests for trackmen. Hyattsville, MD: Human Performance Systems, Inc.

Gebhardt, D. L., & Baker, T. A. (1997). Development and validation of a lashing physical
performance test for selection of casuals. Hyattsville, MD: Human Performance Systems,
Inc.

Gebhardt, D. L., & Baker, T. A. (1997). Comparison of performance differences on
standardized physiological tests for women in public safety and industrial jobs.
Indianapolis, IN: American College of Sports Medicine.

Gebhardt, D. L., & Baker, T. A. (2001). Reduction of worker compensation costs through
the use of pre-employment physical testing. Medicine and Science in Sports and Exercise,
33, 111.

Gebhardt, D. L., & Baker, T. A. (2007). Development and validation of physical
performance tests for selection of New Jersey State enlisted members. Beltsville, MD:
Human Performance Systems, Inc.

Gebhardt, D. L., & Baker, T. A. (2010). Physical performance tests. In J. Farr & N. Tippins
(Eds.), Handbook on employee selection (pp. 277–298). New York: Routledge.

Gebhardt, D. L., Baker, T. A., Curry, J. E., & McCallum, K. (2005). Development and
validation of medical guidelines and physical performance tests for U. S. Senate Sergeant
at Arms positions: Volume I & II. Beltsville, MD: Human Performance Systems, Inc.

Gebhardt, D. L., Baker, T. A., & Phares, D. A. (2008). Development and validation of
physical performance tests for California Highway Patrol (Volume 1: Job analysis).
Beltsville, MD: Human Performance Systems, Inc.

Gebhardt, D. L., Baker, T. A., & Sheppard, V. A. (1998). Development and validation of
physical performance tests for BellSouth physically demanding jobs. Hyattsville, MD:
Human Performance Systems, Inc.

Gebhardt, D. L., Baker, T. A., & Sheppard, V. A. (1999a). Development and validation of
physical performance tests for the selection and fitness assessment for uniformed
members of the Massachusetts State Police. Hyattsville, MD: Human Performance
Systems, Inc.

Gebhardt, D. L., Baker, T. A., & Sheppard, V. A. (1999b). Development and validation of a
physical performance test for the selection of City of Chicago paramedics. Hyattsville,
MD: Human Performance Systems, Inc.

Gebhardt, D. L., Baker, T. A., & Thune, A. (2006). Development and validation of physical
performance, cognitive, and personality assessments for selectors and delivery drivers.
Beltsville, MD: Human Performance Systems, Inc.

Gebhardt, D. L., & Crump, C. E. (1984). Validation of physical performance selection tests
for paramedics. Bethesda, MD: Advanced Research Resources Organization.

Gebhardt, D. L., & Crump, C. E. (1990). Employee fitness and wellness programs in the
workplace. American Psychologist, 45, 262–272.

Gebhardt, D. L., Schemmer, F. M., & Crump, C. E. (1985). Development and validation of
selection tests for longshoremen and marine clerks. Bethesda, MD: Advanced Research
Resources Organization.

Gibson, W. M., & Caplinger, J. A. (2007). Transportation of validation results. In S. M.
McPhail (Ed.), Alternative validation strategies: Developing new and leveraging existing
validity evidence (pp. 29–81). San Francisco, CA: Jossey-Bass.

Gilliam, T., & Lund, S. J. (2000). Injury reduction in truck driver/dock workers through
physical capability new hire screening. Medicine and Science in Sports and Exercise, 32,
S126.

Gledhill, N., & Jamnik, V. K. (1992a). Characterization of the physical demands of
firefighting. Canadian Journal of Sports Science, 17, 207–213.

Gledhill, N., & Jamnik, V. K. (1992b). Development and validation of a fitness screening
protocol for firefighter applicants. Canadian Journal of Sports Science, 17, 199–206.

Golding, L. A. (2000). YMCA fitness testing and assessment manual (4th ed.). Champaign, IL:
Human Kinetics Publishers.

Hoffman, C. C. (1999). Generalizing physical ability test validity: A case study using test
transportability, validity generalization, and construct-related validity evidence. Personnel
Psychology, 52, 1019–1041.

Hoffman, C. C., Rashkovsky, B., & D’Egido, E. (2007). Job component validity:
Background, current research, and applications. In S. M. McPhail (Ed.), Alternative
validation strategies: Developing new and leveraging existing validity evidence (pp. 82–
121). San Francisco, CA: Jossey-Bass.

Hogan, J. C. (1991a). Physical abilities. In M. D. Dunnette & L. M. Hough (Eds.),
Handbook of industrial & organizational psychology (Vol. 2, pp. 753–831). Palo
Alto, CA: Consulting Psychologists Press.

Hogan, J. C. (1991b). Structure of physical performance in occupational tasks. Journal of
Applied Psychology, 76, 495–507.

Hogan, J. C., & Quigley, A. M. (1986). Physical standards for employment and the courts.
American Psychologist, 41, 1193–1217.

Hogan, J. C., & Quigley, A. M. (1994). Effects of preparing for physical ability tests. Public
Personnel Management, 23, 85–104.

Inbar, O., Bar-Or, O., & Skinner, J. S. (1996). The Wingate anaerobic test. Champaign, IL:
Human Kinetics.

Jackson, A. S. (1971). Factor analysis of selected muscular strength and motor
performance tests. Research Quarterly, 42, 172.

Jackson, A. S., Osburn, H. G., Laughery, K. R., & Vaubel, K. P. (1992). Validity of isometric
tests for predicting the capacity to crack, open and close industrial valves. Proceedings of
the Human Factors Society 36th Annual Meeting, 1, 688–691.

Jeanneret, P. R. (1992). Applications of job component/synthetic validity to construct
validity. Human Performance, 5, 81–96.

Johnson, J. W. (2007). Synthetic validity: A technique of use (finally). In S. M. McPhail
(Ed.), Alternative validation strategies: Developing new and leveraging existing validity
evidence (pp. 122–158). San Francisco, CA: Jossey-Bass.

Jones, B. H., & Hansen, B. C. (2000). An armed forces epidemiological board evaluation of
injuries in the military. American Journal of Preventive Medicine, 18, 14–25.

Joyner, M. J. (1993). Physiological limiting factors and distance running: Influence of
gender and age on record performances. In J. O. Holloszy (Ed.), Exercise and sport
science reviews (pp. 103–133). Baltimore, MD: Williams & Wilkins.

Kenney, W. L., Hyde, D. E., & Bernard, T. E. (1993). Physiological evaluation of liquid-
barrier, vapor-permeable protective clothing ensembles for work in hot environments.
American Industrial Hygiene Association Journal, 54, 397–402.

Knapik, J. J., Darakjy, S., Hauret, K. G., Canada, S., Scott, S., Rieger, W., et al. (2006).
Increasing the physical fitness of low-fit recruits before basic combat training: An
evaluation of fitness, injuries, and training outcomes. Military Medicine, 171, 45–54.

Knapik, J. J., Hauret, K., Lange, J. L., & Jovag, B. (2003). Retention in service of recruits
assigned to the army physical fitness test enhancement program in basic combat training.
Military Medicine, 168, 490–492.

Knapik, J. J., Jones, S. B., Darakjy, S., Hauret, K. G., Bullock, S. H., Sharp, M. A., et al.
(2007). Injury rates and injury risk factors among U.S. Army wheel vehicle mechanics.
Military Medicine, 172, 988–996.

Kraemer, W. J., Mazzetti, S. A., Nindl, B. C., Gotshalk, L. A., Volek, J. S., Bush, J. A., et al.
(2001). Effect of resistance training on women's strength/power and occupational
performances. Medicine and Science in Sports and Exercise, 33, 1011–1025.

Landy, F. J., Bland, R. E., Buskirk, E. R., Daly, R. E., DeBusk, R. F., Donovan, E. J., et al.
(1992). Alternatives to chronological age in determining standards of suitability for public
safety jobs. Technical Report. University Park, PA: The Center for Applied Behavioral
Sciences.

Landy, F. J., & Conte, J. M. (2007). Work in the 21st century: An introduction to industrial
and organizational psychology. Malden, MA: Blackwell Publishing.

Lygren, H., Dragesund, T., Joensen, J., Ask, T., & Moe-Nilssen, R. (2005). Test-retest
reliability of the Progressive Isoinertial Lifting Evaluation (PILE). Spine, 30, 1070–1074.

Lynch, N. A., Metter, E. J., Lindle, R. S., Foazard, J. L., Tobin, J. D., Roy, T. A., et al. (1999).
Muscle quality-I: Age-associated differences between arm and leg muscles. Journal of
Applied Physiology, 86, 188–194.

Mayer, T. G., Barnes, D., Nichols, G., Kishino, N. D., Coval, K., Piel, B., et al. (1988).
Progressive isoinertial lifting evaluation. II. A comparison with isokinetic lifting in a
disabled low-back pain industrial population. Spine, 13, 998–1002.

Mayer, T. G., Gatchel, R., & Mooney, V. (1990). Safety of the dynamic progressive
isoinertial lifting evaluation (PILE) test. Spine, 15, 985–986.

McArdle, W. D., Katch, F. I., & Katch, V. L. (2007). Exercise physiology: Energy, nutrition,
and human performance; physiology (5th ed.). Baltimore, MD: Lippincott Williams &
Wilkins.

McGinnis, P. M. (2007). Biomechanics of sport and exercise (2nd ed.). Champaign, IL:
Human Kinetics.

Myers, D. C., Gebhardt, D. L., Crump, C. E., & Fleishman, E. A. (1993). The dimensions of
human physical performance: Factor analyses of strength, stamina, flexibility, and body
composition measures. Human Performance, 6, 309–344.

Nindl, B. C., Barnes, B. R., Alemany, J. A., Frykman, P. N., Shippee, R. L., & Friedl, K. E.
(2007). Physiological consequences of U.S. Army Ranger training. Medicine and Science
in Sports and Exercise, 39, 1380–1387.

Rothstein, M. A., Carver, C. B., Schroeder, E. P., & Shoben, E. W. (1999). Employment law
(2nd ed.). St. Paul, MN: West Group.

Safrit, M. J., & Wood, T. M. (1989). Measurement concepts in physical education and
exercise science. Champaign, IL: Human Kinetics Books.

Schenk, P., Klipstein, A., Spillmann, S., Stroyer, J., & Laubli, T. (2006). The role of back
muscle endurance, maximum force, balance and trunk rotation control regarding lifting
capacity. European Journal of Applied Physiology, 96, 146–156.

Sharf, J. C. (1999). Third circuit's Lanning v. SEPTA decision: Business necessity requires
setting minimum standards. The Industrial-Organizational Psychologist, 37, 149.

Sharf, J. C. (2003). Lanning revisited: The third circuit again rejects relative merit. The
Industrial-Organizational Psychologist, 40, 40.

Society for Industrial and Organizational Psychology [SIOP] (2003). Principles for the
validation and use of personnel selection procedures (4th ed.). Bowling Green, OH:
Society for Industrial and Organizational Psychology, Inc.

Sothmann, M. S., Gebhardt, D. L., Baker, T. A., Kastello, G., & Sheppard, V. A. (1995).
Development and validation of physical performance tests for City of Chicago
Firefighters. Volume 3—Validation of physical tests. Hyattsville, MD: Human Performance
Systems, Inc.

Sothmann, M. S., Gebhardt, D. L., Baker, T. A., Kastello, G. M., & Sheppard, V. A. (2004).
Performance requirements of physically strenuous occupations: Validating minimum
standards for muscular strength and endurance. Ergonomics, 47, 864–875.

Sothmann, M. S., Saupe, K., Jasenof, D., & Blaney, J. (1992). Heart rate response of
firefighters to actual emergencies. Journal of Occupational Medicine, 34, 797–800.

Sothmann, M. S., Saupe, K., Jasenof, D., Blaney, J., Donahue-Fuhrman, S., Woulfe, T., et al.
(1990). Advancing age and the cardiovascular stress of fire suppression:
Determining the minimum standard for aerobic fitness. Human Performance, 3, 217–236.

Stevenson, J. M., Andrew, G. M., Bryant, J. T., Greenhorn, D. R., & Thomson, J. M. (1989).
Isoinertial tests to predict lifting performance. Ergonomics, 32, 157–166.

Stevenson, J. M., Greenhorn, D. R., Bryant, J. T., Deakin, J. M., & Smith, J. T. (1996).
Gender differences in performance of a selection test using the incremental lifting
machine. Applied Ergonomics, 27, 45–52.

Terpstra, D. A., Mohamed, A. A., & Kethley, R. B. (1999). An analysis of federal court
cases involving nine selection devices. International Journal of Selection and Assessment,
7, 26–34.

Thompson, W. R., Gordon, N. F., & Pescatello, L. S. (2010). ACSM's guidelines for exercise
testing and prescription (8th ed.). Philadelphia, PA: Wolters Kluwer/Lippincott Williams &
Wilkins.

Ugelow, R. S. (2005). I-O psychology and the Department of Justice. In F. J. Landy (Ed.),
Employment discrimination litigation: Behavioral, quantitative, and legal perspectives
(pp. 463–490). San Francisco, CA: Jossey-Bass.

Waters, T. R., Putz-Anderson, V., Garg, A., & Fine, L. J. (1993). Revised NIOSH equation
for the design and evaluation of manual lifting tasks. Ergonomics, 36, 749–776.

Williams, A. G., Rayson, M. P., & Jones, D. A. (2007). Training diagnosis for a load carriage
test. Journal of Strength and Conditioning Research, 18, 30–34.

Federal Laws

Age Discrimination in Employment Act of 1967, 29 U.S.C. Sec. 621, et seq. (1967).

Americans With Disabilities Act of 1990, 42 U.S.C A.

Civil Rights Act of 1991, S. 1745, 102nd Congress (1991).

Equal Employment Opportunity Commission (1995). ADA enforcement guidance:
Preemployment disability-related questions and medical examinations. Washington, DC:
Equal Employment Opportunity Commission.

Equal Employment Opportunity Commission, Civil Service Commission, Department of
Labor, and Department of Justice. (1978). Uniform Guidelines on Employee Selection
Procedures. Washington, DC: Bureau of National Affairs, Inc.

Rehabilitation Act of 1973, 29 U.S.C. 701 et seq. (1973).

Legal Cases

Alspaugh v. Michigan Law Enforcement Officers Training Council, 634 N.W.2d 161 (Mich.
App. 2001).

Andrews v. State of Ohio, 104 F.3d 803 (6th Cir., 1997).

Badgley and Whitney v. Walton, Commissioner of Public Safety, VT Superior Court
#538-11-02 Wmcv, 2008.

Badgley and Whitney v. Walton, Sleeper, Commissioners of Public Safety and Department
of Public Safety, VT Supreme Court #2008–385, 2010.

Belk v. Southwestern Bell Telephone Company, 194 F.3d 946 (8th Cir., 1999).

Berkman v. City of New York, 536 F. Supp. 177, 30 Empl. Prac. Dec (CCH) § 33320 (E.D.
N.Y., 1982).

Bernard v. Gulf Oil Corporation, 643 F. Supp. 1494 (E.D. TX, 1986).

EEOC v. Dial Corp, No. 05–4183/4311 (8th Cir., 2006).

Fried v. Leidinger, 446 F. Supp. 361 (E.D. VA, 1977).

Gately v. Massachusetts, 92-CV-13018-MA (D. Mass. Dec. 30, 1992).

Gately v. Massachusetts, No. 92–13018 (D. Mass. Sept. 26, 1996).

Griggs v. Duke Power Company, 401 U.S. 424 (1971).

Hardy v. Stumpf, 17 Fair Empl. Prac. Cas. (BNA) 468 (Supp. Ct. Cal., 1978).

Harless v. Duck, 22 Fair Empl. Prac. Cas. (BNA) 1073 (6th Cir., 1980).

Indergard v. Georgia-Pacific Corporation, 582 F.3d 1049 (U.S. Court of Appeals, 9th Cir.,
2009).

Lanning v. Southeastern Pennsylvania Transportation Authority, 181 F.3d 478, 482–484
(3rd Cir., 1999).

Lanning v. Southeastern Pennsylvania Transportation Authority, 308 F.3d 286 (3rd Cir.,
2002).

Legault v. Russo, 64 FEP Cases (BNA) 170 (D.N.H., 1994).

Merritt v. Old Dominion Freight Line, Inc., No. 09–1498 (4th Cir., Apr. 9, 2010).

Peanick v. Reno, 95–2594 (8th Cir., 1995).

Pentagon Force Protection Agency v. Fraternal Order of Police DPS Labor Committee,
FLRA Case #WA-CA-04-0251 (Wash. Region, 2004).

Porch v. Union Pacific Railroad, Administrative law proceeding, State of Utah. (1997).

Smith v. Des Moines, #95–3802, 99 F.3d 1466, 1996 U.S. App. Lexis 29340, 72 FEP Cases
(BNA) 628, 6 AD Cases (BNA) 14 (8th Cir., 1996) [1997 FP 11].

United States v. City of Erie, Pennsylvania, #04–4, 352 F. Supp. 2d 1105 (W.D. Pa., 2005).

UWUA Local 223 & The Detroit Edison Co., AAA Case No. 54-30-1746–87 (Apr. 17, 1991)
(Lipson, Arb.).

Varden v. City of Alabaster, Alabama and John Cochran; US District Court, Northern
District of Alabama, Southern Division; 2:04-CV-0689-AR (2005).

Wards Cove Packing Company v. Atonio, 490 U.S. 642 (1989).

White v. Village of Homewood, 628 N.E.2d 616 (Ill. App., 1993).

Todd A. Baker

Todd A. Baker, Human Performance Systems, Inc., Alburtis, PA

Deborah L. Gebhardt

Deborah L. Gebhardt, Human Performance Systems, Inc., Beltsville, MD
