School of Social Work
SW 8602 Direct Practice Evaluation
Jane F. Gilgun, Ph.D., LICSW
December 2009
Choosing Assessment and Evaluation Tools for Direct Practice
Assessment and evaluation tools can contribute to practice effectiveness if social
service professionals choose them well. In this essay I provide guidelines for choosing
tools for practice. The first section discusses standardized instruments; that is, instruments
that have known psychometric properties of reliability and validity. The second section is
on instruments that practitioners construct themselves or that they help clients construct.
The third section is brief but points out some of the complicated issues involved in
practitioner use of instruments. In the discussion, I state the importance of practitioner
involvement in the development, use, and modification of any tools that agencies may
require and also point out that funders prefer to sponsor programs that demonstrate
effectiveness.
Are They Useful?
Usefulness is the most important question to ask about practice tools. If you use
these tools, will they help you do your job better? Tools that are useful have the following
characteristics. I’ve arranged them in rough order of importance for social work practice.
• They have good face validity. Face validity is the most important validity in
assessment and evaluation tools. Face validity means that when knowledgeable
professionals read the tools, they find that the tools cover important areas of
practice.
• They have good content validity. Content validity is an estimate of whether
instruments cover relevant areas. It is similar to face validity in that experts decide
whether tools have adequate coverage. There is no index for content validity.
Drawing upon multiple sources of data helps to ensure content validity. In social
work and other applied disciplines, content validity is more likely when the sources
of items are research, theory, and practice wisdom that draws upon direct
experience with clients and their issues. Sometimes representatives of client groups
contribute to the ideas and items of a tool. Item‐total analysis often helps streamline
instruments because it helps to eliminate items that quantitative analysis shows are
unrelated to other items in the tool. In item‐total analysis, the score on each item is
correlated with the total score. Items with very low correlations are eliminated. If
many items have high correlations—above .9—tool developers then inspect these
items for redundancy and eliminate those that duplicate others.
o They are culturally sensitive. Instruments that are useful draw upon
information that is culturally sensitive. Practitioners can check for cultural
sensitivity by finding information about the samples on which instrument
developers draw for the ideas and items that compose the instrument. These
samples ideally match the culture, social class, and other important social
identities of the individuals who compose practitioners’ caseloads. If the
sample differs, the instruments may still be useful if practitioners modify
them in consultation with knowledgeable persons. Cultural sensitivity is part
of content validity.
• They are practice guidelines. Good tools provide practice guidelines in the sense
that they help you keep important things about clients in mind. It is only human to
have our own favorite ideas about what is important. Useful tools alert you to
things that you might not otherwise have thought about.
• They help you formulate treatment goals. Tools that provide practice guidelines
can do this. Treatment goals, in turn, can help you gauge whether your work is
helping clients. If you use the tools periodically, they will also keep you focused on
important practice principles. Of course, as you work with clients, you may
formulate new treatment goals and find some goals that tools helped you develop
are not appropriate for particular clients.
• They are short, easy to use, and modifiable. Most useful tools have these qualities.
If they are long and cumbersome, practitioners may not want to use them because
they take time away from direct client contact. Useful tools are modifiable in the
sense that when some items do not work, practitioners can modify them to fit their
practice. It is better to modify them in consultation with other knowledgeable
professionals in case you are missing something important that the tools provide.
• They have good indices of internal consistency, which is sometimes called
reliability. The index should reach or come close to .90 in clinical assessment and
evaluation tools. Some tools can have lower indices of reliability or none at all and
still be useful if they have face validity and are useful in other ways. Cronbach’s
alpha is the most common index of internal consistency; it is gauged on a scale from
0‐1. A good alpha and good face validity suggest a potentially useful tool.
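Cronbach's alpha can be computed directly from its definition: alpha equals k/(k−1) times (1 minus the sum of the item variances divided by the variance of the total scores), where k is the number of items. Below is a small Python sketch with hypothetical ratings; a statistical package would normally do this calculation.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals).

    `responses` is a list of rows, one per respondent, one column per item.
    """
    k = len(responses[0])
    item_variances = [pvariance([row[i] for row in responses]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical ratings on a four-item tool (higher = more of the trait measured).
ratings = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
]
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```

Because the four hypothetical items rise and fall together across respondents, the alpha here comes out above .90, which is what the guideline above asks for.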
• They have good indices of interrater reliability, which is another indicator of
consistency. When an instrument has an inter‐rater reliability score, this means
that two or more practitioners have completed the instrument on the same client or
clients. If there are two raters, then the number of clients should be at least 15‐20. If
there are 15‐20 raters or more, then the rating can be done on fewer clients. The
higher the index, on a scale from 0‐1, the more reliable the scale is. The closer to 1
the rating is, the more the raters have agreed. A scale with a high rating or one with
a low rating may nonetheless have poor face validity, leading practitioners to decide not to use it. High
face validity and high inter‐rater reliability are good indicators of potential
usefulness. An issue with inter‐rater reliabilities, however, is that practitioners
who fill out the instrument may have different perspectives, ideas, and training on
the concepts that underlie the instruments. Raters, therefore, should understand
the theory, research, and practice wisdom on which the tools are based. An
excellent tool could receive a low inter‐rater reliability score because the raters did
not understand the concepts on which the tool is based.
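The essay does not prescribe a particular index, but one common way to quantify agreement between two raters on categorical judgments is Cohen's kappa, which corrects simple percent agreement for the agreement expected by chance. The sketch below uses hypothetical ratings by two practitioners on sixteen clients; kappa is one option among several inter‐rater indices.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: probability both raters pick a category independently.
    chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - chance) / (1 - chance)

# Hypothetical "low"/"high" risk judgments by two practitioners on 16 clients.
rater_a = ["low", "low", "high", "high", "low", "high", "low", "low",
           "high", "low", "high", "high", "low", "low", "high", "low"]
rater_b = ["low", "low", "high", "low", "low", "high", "low", "low",
           "high", "low", "high", "high", "low", "high", "high", "low"]
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")
```

The two raters disagree on two of sixteen clients, so kappa lands in the mid .70s here, lower than the raw 87.5% agreement because some agreement is expected by chance.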
• They have adequate test‐retest reliability (TRR). Test‐retest reliability arises
when a group of practitioners fills out an instrument on a group of clients and days
or weeks later fills out the same instrument on the same group of clients. The scores
on the two different occasions are correlated. The indices that are closest to 1 are
those that indicate the best TRRs. Test‐retest reliabilities cannot be done if the
clients are receiving services because any intervention could affect the second set of
scores. Face validity in combination with the reliabilities already discussed suggests
a potentially useful instrument.
• They have indices of construct validity, which help researchers and practitioners
understand what the tools measure. To evaluate for construct validity, researchers
have practitioners fill out two instruments that are thought to measure the same
things. One of the instruments already has known psychometric properties of
reliability and validity. The scores of the two instruments are correlated. The
higher the score, the more valid the construct is thought to be. An instrument with
face validity, construct validity, and good reliabilities is potentially useful.
• When the issue is prediction, they have good predictive validity, which is useful in
some tools, such as risk assessments, whose purpose is to identify individuals at risk
for some conditions. Their predictive usefulness is based upon how well they
predict future behaviors. Child abuse risk assessments are examples. These can be
useful tools because they typically are based upon research and theory and
practitioner expertise. They can provide practice guidelines that help practitioners
formulate treatment goals that, if met, can reduce the risk for the targeted behaviors
to occur. These kinds of instruments have scores from 0‐1, like the other indices of
reliability and validity. They are one of two types of criterion‐related validities. The
other is concurrent validity. “Criterion” refers to the idea that the instrument is
correlated with another external instrument.
• When concurrent validity is an issue, they have good concurrent validity, which is
a score that researchers calculate when they correlate two or more instruments that
they administer at the same time, with one assumed to be a predictor of another.
This test is not much used in direct practice, and it is not the same thing as construct
validity. It is used more to get as complete a picture as possible of whatever
administrators of the instruments want to know about future performance.
• Factor analysis indicators can be helpful in some cases. Factor analyses are
similar in some ways to Cronbach’s alpha in that the results of a factor analysis
indicate which items of the instruments correlate with each other. Items that clump
together form factors, which researchers name based on which items belong to
which cluster or factor.
Self‐Constructed Instruments
In some cases, practitioners may find self‐constructed instruments to be helpful to
their practice. Self‐constructed instruments typically have anchors on both sides of a
continuum and therefore are often called self‐anchored scales. Some call them
individualized rating scales.
One of the main advantages of self‐constructed instruments is that they are by
definition tailor‐made to fit particular, individual treatment situations. Practitioners
construct them to evaluate themselves, to evaluate clients, and to evaluate any influences
on the relationship between clients and practitioners. Often when practitioners evaluate
clients, they base their evaluations on clients’ behaviors while in the presence of
practitioners as well as on client reports of their behaviors in other settings. Practitioners can
also help clients to construct instruments that track clients’ progress on goals.
Anchors typically have the least desirable behavior on the left side of a continuum
and the most desirable behavior on the other. The items themselves can range from very
concrete to very general. When clients construct their own instruments, they also choose
their own treatment goals.
In group treatment in a women’s prison, a woman wanted to stop threatening other
women when she felt they threatened her. She had some insight that her threatening
behavior was based on fear, beliefs, and trauma that she had experienced in the past.
Although she had the beginnings of an understanding of the complexity of her issues
related to threatening others, the behavior she chose to monitor was specific and concrete.
This is the self‐constructed instrument she designed for herself.
When I felt threatened this past week I
The woman reviewed this simple scale at the beginning of each group. It provided a way
for her to focus on the behavioral manifestation of a complex issue. She or the group
facilitators could have made a rating scale out of this instrument, starting with 0 at “Hit”
and 5 for “Talked to Someone.” These scores could be graphed to show any changes over time.
Such graphing is not necessary. What is helpful is the focus that the simple scale provided
to the client and the deep roots of such a simple scale.
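If the 0-5 rating scale described above were used, the weekly scores could be tracked with something as simple as the following Python sketch; the ratings shown are hypothetical, and no charting library is needed to see a trend.

```python
# Hypothetical weekly self-ratings on the 0-5 scale described above
# (0 = "Hit", 5 = "Talked to Someone"); higher is more desirable.
weekly_ratings = [1, 0, 2, 2, 3, 4, 3, 5]

# A simple text graph of change over time: one row of '#' marks per week.
for week, score in enumerate(weekly_ratings, start=1):
    print(f"week {week}: {'#' * score or '.'} ({score})")
```

A practitioner or group facilitator could keep such a log on paper just as easily; the point is only that the scale produces numbers that make change visible over time.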
There are other ways to construct instruments tailored to particular clients and
practice settings. The references at the end of this essay provide more information.
Importance of Practitioner Buy‐In
Direct practitioners will not use tools that do not help in their practice. If their
administrators insist they use tools that do not help them, they will comply with directives
to fill out the tools but the ideas of the tools may not have much effect. The chances for
practitioner buy‐in are increased when practitioners
• see the value of the tool for their practice effectiveness, such as helping them
set goals, give direction for interventions, and gauge progress on goals;
o see that the tools fit their practice. For example, if practitioners are
involved in dealing with crises and concerned that clients do not have
basic life skills, such as knowing how to brush their teeth, it is unlikely
that they will find tools to be helpful when the tools encourage skill
development that is beyond what their clients are able to attain;
• have input into the items of the instruments, how they use the instruments,
and whether and how the instruments are modified to better fit practice;
• have training on the ideas and concepts on which the tools are based;
• are not swamped with paperwork demands that they find cut down on the
time they have for direct client contact.
These issues are stated in simple terms, but they are complex and require much
thought and planning on the part of administrators in consultation with front‐line
practitioners.
Discussion
This essay provides information on how to choose assessment and evaluation tools
in social work direct practice. Standardized and self‐constructed instruments have many
advantages, but social workers will not use them if they do not find the tools helpful.
Administrators have the responsibility to involve front‐line workers in the construction,
modification, and procedures for using instruments. They also must allow for training of
practitioners so that they have an appreciation of the research, theory, and practice
wisdom on which tools are based. Finally, practitioners require time to use the tools and to
interpret the information that the tools produce. If the practitioners experience the
instruments as add‐ons to an already heavy caseload in which they have little if any
involvement and investment, the tools will be of little use.
At their best, assessment and intervention tools provide practice guidelines useful in
understanding the complexities of clients’ lives, information on what is working and not
working, focus for clients on client‐selected goals, insight for practitioners on what they are
doing and how they can do better, and evidence that the efforts of social workers
have outcomes that can be shared with others. Funders prefer to sponsor programs that
show effectiveness.
References
APA Task Force on Evidence‐Based Practice (2006). Evidence‐based practice in psychology.
American Psychologist, 61(4), 271‐285.

Bloom, Martin, Joel Fischer, & John G. Orme (2009). Evaluating practice: Guidelines for the
accountable professional. Boston: Pearson.

Bordelon, Thomas D. (2006). A qualitative approach to developing an instrument for
assessing MSW students’ group work performance. Social Work with Groups, 29(4), 75‐91.

Gilgun, Jane F. (2005). The four cornerstones of evidence‐based practice in social work.
Research on Social Work Practice, 15(1), 52‐61.

Gilgun, Jane F. (2004). Qualitative methods and the development of clinical assessment
tools. Qualitative Health Research, 14(7), 1008‐1019.

Mokuau, Noreen et al. (2008). Development of a family intervention for Native Hawaiian
women with cancer: A pilot study. Social Work, 53(1), 9‐19.

Wenborn, Jennifer et al. (2008). Assessing the reliability and validity of the Pool Activity
Level (PAL) Checklist for use with older people with dementia. Aging and Mental Health,
12(2), 202‐211.
About the Author

Jane F. Gilgun, Ph.D., LICSW, is a professor, School of Social Work, University of
Minnesota, Twin Cities, USA. See Professor Gilgun’s other articles, children’s books, and
articles on Amazon Kindle, scribd.com/professorjane, and stores.lulu.com/jgilgun.